Google just launched Gemini 2.5 Flash Image, nicknamed “Nano Banana,” and some are calling it the most advanced AI image editor available today.
Built directly into the Gemini app and accessible via the Gemini API and Google AI Studio, Nano Banana isn’t just another image tool. It offers a seamless combination of image generation, editing, and understanding, all accessible through natural language prompts.
I unpacked what the tool can do with SmarterX and Marketing AI Institute founder and CEO Paul Roetzer on Episode 165 of The Artificial Intelligence Show.
An Image Editor That Understands the Real World
Nano Banana allows users to:
- Maintain character consistency across multiple images
- Perform complex local edits (like removing a stain or changing a pose)
- Fuse multiple images into a photorealistic scene
- Recolor or restyle images with a sentence
- Understand diagrams and ground edits in world knowledge
While other tools may be great at aesthetics, Nano Banana goes further. It understands context. That means your dog won’t suddenly change breeds mid-edit, and your face stays your face, even if you swap the background from a kitchen to the surface of Mars.
That’s drawing a lot of attention online, says Roetzer, as users are finding the tool to be excellent at performing detailed edits using just natural language prompts.
Prompt-Based Editing Meets Multimodal Intelligence
What makes Nano Banana so disruptive is how natural it feels to use. You can say things like, “Put me on a mountaintop at sunset” or “Remove the person in the background” or even “Turn this drawing into a labeled diagram,” and it just works.
This is all made possible by the model’s native world knowledge and multimodal training. Together, they open the door to a range of new use cases, from brand asset creation to interactive educational tools.
And all images include Google’s invisible SynthID watermark, so AI edits remain traceable.
So, Is This the End of Imagen?
One of the first questions we had about the capability: Is this just Google’s Imagen 4 image generation model under a new name?
The answer: Not quite, at least according to the answers we got by asking Google Gemini.
According to Gemini, Imagen 4 is still around, but it plays a different role. It’s a specialized diffusion model designed purely for photorealistic image generation from text prompts. Nano Banana, on the other hand, is a native multimodal model that understands both images and text. When it needs to generate an image from scratch, it can call upon Imagen 4 as an underlying engine.
Think of Nano Banana as the director. Imagen 4 is the cinematographer called in when needed.
Want to Try It? Just Look for the Banana
Google even embraced its playful side with this launch. In the Gemini app, image editing is now symbolized by a banana emoji, a nod to the Nano Banana nickname.
It’s a small touch, but it signals that Google isn’t afraid to have fun with its AI releases.
Want to explore what it can do? Ask Gemini: “Give me a prompt to test the full capabilities of 2.5 Flash Image.” You’ll get rich, detailed prompts to kickstart your experimentation. Or upload an image and ask for suggestions.