…so Nano Banana Pro was released yesterday, as we’re sure you are aware.
The AI community has already produced an insane number of generations with this model. Yes, it can handle the basics of any image model: style transfer, object removal, text rendering, realistic images. But these are the least shocking of its capabilities.
In this post, we really wanted to highlight some of the crazy images the AI community has been able to extract from Nano Banana Pro. Strap in.
Logic
One of the most impressive facets of Nano Banana Pro is its baked-in logic. Typically, image models are good at constructing new photos from the spatial information found in the input image. Until now, however, no image model has reliably been able to deduce, interpret, and respond to textual information found in an input image. The closest might have been GPT-image-1, released earlier this year, thanks to its autoregressive architecture. With Nano Banana Pro, there appear to be intermediary prompting layers that help the model draw logical conclusions. These layers seem to act as a reasoning bridge between the input image and the final output, allowing the model to not just see text in an image, but actually understand and respond to it contextually.
(Side note: someone might have discovered how to uncover said system prompt.)
For instance, you can feed Nano Banana Pro your homework and get correct answers with work shown.
write the answers to the questions in pencil. show your work
We loved seeing creators take long pieces of information, like papers or websites, and create summary images from them.
Nano Banana Pro is wild.
Here’s my favorite use case so far: take papers or really long articles and turn them into a detailed whiteboard photo.
A picture is worth a thousand words. We could definitely see this model being a powerful tool for educators to create visuals. Check out these infographics:
This also means Nano Banana Pro is really good at rendering code. Other image models tended to hallucinate on this task, but because this model is built on the larger Gemini 3 Pro language model, its code understanding is much better than that of other SOTA models.
Check out how the model was able to render this ship written in React and WebGL shader code (click on the image to see the entire code snippet):
render this: /** * @license * SPDX-License-Identifier: Apache-2.0 */ import React, { useRef, useEffect } from 'react'; import useAppContext from '../context/AppContext'; // A simple full-screen quad vertex shader const VERTEX_SHADER = `#version 300 es in vec2 a_position; out vec2 v_uv; void main() { v_uv = a_position; gl_Position = vec4(a_position, 0.0, 1.0); }`; const FRAGMENT_SHADER = `#version 300 es precision highp float; in vec2 v_uv; out vec4 outColor…
Text and Design
This model arguably has some of the best text adherence we've seen: input any piece of text and Nano Banana Pro will reproduce it word for word. Look at what fofr has to say:
To say Nano Banana Pro is good at text is an understatement. Here's the Gemini 3 blog post, as a glossy magazine article.
> Put this whole text, verbatim, into a photo of a glossy magazine article on a desk, with photos, beautiful typography design, pull quotes and brave… pic.twitter.com/NVm1r4UEHY
— fofr (@fofrAI) November 20, 2025
What’s really cool is that text adherence is maintained even when you try out various styles or designs. Nano Banana Pro does not sacrifice one for the other. This means we can take an infographic with a lot of dense information and actually apply some interesting styles to it.
Take a look at these Machine Learning posters we created, for instance:
We’ve also been seeing some cool design spreads that might have taken folks hours to lay out using other tools. We see this as a great model for designers to rapidly iterate on mockups and potentially create assets that could be used in production.
Magazine cover editors are done.
Let me show you how good Nano Banana Pro is at creating magazine covers. I intentionally used Indonesian magazine prompts to see whether the model would misspell non-English text.
Bottom line? With Nano Banana Pro, you no longer have to compromise between accurate text rendering and creative design freedom. The model maintains pixel-perfect text accuracy while fully embracing your stylistic input.
Characters
One of the standout features of Nano Banana Pro is its ability to handle character consistency across multiple reference images. The model can process up to 14 reference images simultaneously, allowing you to maintain consistent character appearances, poses, and styles across different scenes and contexts. This makes it incredibly powerful for storytelling, brand consistency, and creating cohesive visual narratives where the same characters need to appear in various situations.
Support for multiple references also means we can input multiple objects to synthesize a new image. This is great for virtual try-on (a little too great):
— Sebastien Jefferies (@SebJefferies) November 20, 2025
New record? 25 items combined into one image using the collage method with Nano Banana Pro. The previous record for Nano Banana 1 that I tried was 13 items from a collage. Overall accuracy of items is still much better with less items but I wanted to push it. I'm sure It can go… pic.twitter.com/Ob9C8QcVA9
— Travis Davids (@MrDavids1) November 20, 2025
We can even take characters and put them out of context (this was one of our favorite effects).
Make him [insert scenario here]. Keep his whiteboard style, but make the surroundings realistic.
Yup, that means character consistency is a given with this model. Even with minimal prompting, characters are able to persist after multiple iterative generations. For example, take a look at Google’s example of exploring camera zooms with the same character:
Change aspect ratio to 1:1 by reducing background. The character remains exactly locked in its current position.
Source: Google
We see a world where Nano Banana Pro can be used for tasks like storyboarding where character consistency is important, regardless of the environments characters are placed in. The model is clearly able to persist the features of characters as we put them through the model wringer.
World Knowledge (experiments)
Is Nano Banana Pro actually connected to the internet?
We tried making comics of the weather in different places around the world.
These were good guesses, but the model wasn’t able to get the exact weather of these places at the time of our generations. You’ll need to integrate Search as a tool to provide real-time information to Nano Banana (structured tool calling coming soon!).
We thought we might be able to get real-time snapshots of places (for example, a morning shot of San Francisco and at the same time, a night shot in New Delhi). Interestingly, both of these images were marked with the same date in 2023 🤷‍♂️
However, the amount of real-world knowledge baked into the model itself is quite impressive. Just look at how it is able to identify world landmarks given only their coordinates.
The amount of world knowledge in Nano Banana Pro could power a multitude of apps. With just a few simple prompting layers and/or integrated tools, this model becomes even more capable than it already is.
Getting Started with the API
Okay, we just bombarded you with a ton of cool Nano Banana Pro examples. What’s even cooler is that you can use this model in your own apps.
Here’s how to call Nano Banana Pro using JavaScript and the Replicate API:
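Below is a minimal sketch using Node's built-in `fetch` against Replicate's HTTP predictions endpoint. The model slug `google/nano-banana-pro` and the input fields are assumptions; check the model page on Replicate for the exact schema, and set `REPLICATE_API_TOKEN` in your environment before running.

```javascript
// Minimal sketch: call Nano Banana Pro through Replicate's HTTP API.
// Assumptions: the model is published as "google/nano-banana-pro" and
// REPLICATE_API_TOKEN is set -- verify both on replicate.com.

// Build the request separately so it is easy to inspect or test.
function buildRequest(prompt, token) {
  return {
    url: "https://api.replicate.com/v1/models/google/nano-banana-pro/predictions",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
        Prefer: "wait", // ask Replicate to block until the prediction finishes
      },
      body: JSON.stringify({ input: { prompt } }),
    },
  };
}

async function generate(prompt) {
  const { url, options } = buildRequest(prompt, process.env.REPLICATE_API_TOKEN);
  const res = await fetch(url, options);
  const prediction = await res.json();
  return prediction.output; // URL(s) of the generated image
}

// Example usage:
// const imageUrl = await generate("A whiteboard summary of the attention mechanism");
```

The `Prefer: wait` header keeps the example simple by returning the finished prediction in one round trip; for long generations you can drop it and poll the prediction URL instead.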
Prompt: "Create a grid of 4 editorial fashion images focused on [Nike], [2 x macro, 2 x dynamic action] that follow the same style and colour palette as [@]img1. Make each of the new shots unique."
Pretty cool. Ariel Noyman at MIT Media Lab adapted my Nano Banana Pro prompts to map "urban informality" -- complex settlements that are notoriously difficult to map manually.
Nano Banana Pro / Gemini 3 Pro Image is crazy. It turned this blueprint into a realistic 3D image. It did not just create the image, it first read the blueprint properly and then created the final output with every small detail.