How to Prompt Nano Banana (Gemini 3 Pro Image): The Few Patterns That Actually Matter

A practical way to prompt Nano Banana for image generation and editing, without cargo-culting camera specs or fighting the model.

Ilia Ilinskii
Rephrase · Feb 07, 2026

Prompt tips · 10 min

Nano Banana prompts have a weird reputation.

People talk about it like it's either magic ("it just understands") or impossible ("it ignores my reference image and does whatever"). What's interesting is that both are true, depending on whether your prompt gives the model something it can control.

Here's my working mental model: Nano Banana is less impressed by adjectives and more responsive to constraints and visual anchors, the stuff that narrows the space of valid images. That lines up with research showing that the right kind of context can act as a control signal, and that increasing "inference budget" (or just iterating properly) changes outcomes dramatically in visual tasks [1]. It also lines up with a broader prompt-engineering lesson: you get reliability when you specify goals, context, and an output contract, then test and iterate instead of praying [2].

So in this post, I'm going to show you how I prompt Nano Banana in a way that's consistent, debuggable, and easy to reuse.

First: what "Nano Banana" is (in practice)

Nano Banana is the nickname people use for Gemini 3 Pro Image. In a recent visual reasoning paper, the authors explicitly refer to "Gemini 3 pro image (nano banana pro)" and evaluate it in an image-editing setting for a tangram-like task, where geometric consistency matters a lot [1]. That matters because it hints at what Nano Banana is good at: taking visual context and transforming it while trying to preserve constraints.

The catch is that image models don't "follow instructions" the way text models do. They follow signals. Your job is to make those signals unambiguous.

In other words: prompt it like you're writing a spec, not a poem.

The Nano Banana prompt stack I keep reusing

I use a consistent structure. Not because the model "needs JSON" (it doesn't), but because I need prompts I can debug.

I split prompts into four parts: Task, Subject lock, Transform, and Constraints.

1) Task: say what operation you want

Nano Banana does better when the first sentence is basically a verb.

Generate, edit, restore, re-style, extend, remove, replace. Pick one.

This is the equivalent of "goal specification" in prompt design: tell the model what success is before you drown it in details [3].

2) Subject lock: tell it what must stay the same

A lot of "Nano Banana changed my person into a new person" complaints come from missing identity constraints.

There's a Reddit post where someone tries to restore an old photo of their mom and the model "creates a new person" [4]. That's not the model being evil. That's the model doing what generative models do when identity isn't constrained: it hallucinates a plausible face under the umbrella of "improve clarity."

So I explicitly lock what can't change: identity features, pose, composition, logos, text, whatever matters.

3) Transform: list changes as specific deltas

Instead of "make it better," I specify deltas: remove creases, correct color cast, increase sharpness, reduce noise, preserve grain, keep lighting direction.

This is also where you should prefer measurable-ish or concrete terms over vibes. A popular community write-up on Nano Banana prompting makes the same point: quantified parameters and professional terminology tend to outperform vague adjectives [5]. I'd treat the examples there as inspiration, not gospel, but the principle is solid.

4) Constraints: end with a "do not" list

Negative constraints are not optional. They're often the difference between "nice" and "why did you add text and extra fingers?"

Also, constraints make prompts more reproducible under paraphrase and minor perturbations, which is one of the core ideas in robust prompt evaluation [3].
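
If you want the four parts to stay debuggable, one option is to keep them as separate fields and only join them at the last moment. Here's a minimal sketch in Python. `PromptSpec` and `render` are names I made up for illustration; they are not part of any Gemini API, and the output is just the plain text you would paste into the chat.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the four-part prompt stack."""

    task: str
    subject_lock: list[str] = field(default_factory=list)
    transform: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the parts into one prompt, skipping empty sections."""
        if not self.task.strip():
            raise ValueError("task must not be empty")
        parts = [f"Task: {self.task.strip()}"]
        sections = (
            ("Subject lock", self.subject_lock),
            ("Transform", self.transform),
            ("Constraints", self.constraints),  # the "do not" list goes last
        )
        for title, items in sections:
            if items:
                bullets = "\n".join(f"- {item}" for item in items)
                parts.append(f"{title}:\n{bullets}")
        return "\n\n".join(parts)


spec = PromptSpec(
    task="Restore and enhance this photo.",
    subject_lock=["Preserve the same person/identity."],
    transform=["Remove creases, scratches, dust, and stains."],
    constraints=["No text, no watermark."],
)
print(spec.render())
```

The point of the structure is not the formatting. It's that a failed generation can be traced back to one named field.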

Practical prompt templates (copy/paste)

These are written for chat-style usage where you upload an image and prompt Nano Banana to edit it.

Template A: restoration (don't change the person)

Task: Restore and enhance this photo.

Subject lock:
- Preserve the same person/identity. Do not change facial structure, age, ethnicity, hairstyle, or expression.
- Preserve the original pose, framing, and camera angle.

Transform:
- Remove creases, scratches, dust, and stains.
- Improve clarity and focus gently (no over-sharpening).
- Reduce noise while keeping natural film grain.
- Correct faded colors to look natural for the era.

Constraints:
- Do not invent new background elements.
- Do not add or remove jewelry, moles, scars, or distinctive marks.
- No beauty retouching, no "AI face".
- No text, no watermark.

If you're restoring a family photo, that "Subject lock" section is the whole game.

Template B: style transfer while keeping layout

Task: Re-style this image into a new art direction.

Subject lock:
- Keep the exact composition and layout.
- Keep the same subject shapes and proportions.

New style:
- Art direction: [e.g., 90s editorial flash photography / Swiss poster design / watercolor children's book]
- Color palette: [3-6 colors]
- Lighting: [soft daylight / hard flash / tungsten]
- Texture: [paper grain / film grain / clean vector]

Constraints:
- No extra objects.
- No text or logos.
- No distorted faces or warped hands.

Notice what's missing: a 20-line list of camera specs. You can add those, but they're seasoning.

Template C: product shot (anti-warping)

Task: Create a clean product photo using this item as reference.

Subject lock:
- Preserve the product design exactly (shape, buttons, logo placement, proportions).

Scene:
- Background: seamless [white/black/gradient]
- Surface: [acrylic / matte stone / wood]
- Lighting: soft studio, controlled reflections

Constraints:
- Do not redesign the product.
- No warped geometry.
- No extra text, no watermark, no fake brand marks.
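
Templates B and C contain bracketed slots like `[white/black/gradient]`, and it's easy to paste one in with a slot still unfilled. Here's a small sketch of a guard against that, assuming you keep the templates as Python strings. `fill_placeholders` is a hypothetical helper, and treating every `[...]` span as a slot is my assumption about how the templates are written.

```python
import re

# Matches bracketed template slots such as "[3-6 colors]".
PLACEHOLDER = re.compile(r"\[[^\[\]\n]+\]")


def fill_placeholders(template: str, values: list[str]) -> str:
    """Replace bracketed slots left to right; fail if the counts differ."""
    slots = PLACEHOLDER.findall(template)
    if len(slots) != len(values):
        raise ValueError(
            f"template has {len(slots)} slots, got {len(values)} values"
        )
    remaining = iter(values)
    # A callable replacement avoids backslash processing in the values.
    return PLACEHOLDER.sub(lambda _match: next(remaining), template)


scene = (
    "- Background: seamless [white/black/gradient]\n"
    "- Surface: [acrylic / matte stone / wood]"
)
print(fill_placeholders(scene, ["white", "matte stone"]))
```

Passing too few values raises an error instead of silently sending a prompt with a literal `[acrylic / matte stone / wood]` in it.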

Iteration: the part most people skip

One big reason prompting feels random is that people do one shot, then rewrite the whole prompt from scratch.

I prefer tight iteration loops: keep the prompt mostly stable, change one variable, and use the model output as evidence.

This is basically "test prompts over representative inputs, measure tail risk, and make prompts degrade gracefully" translated into normal-person workflow [3].

When Nano Banana misses, I ask: did it fail at subject lock, transform, or constraints?

Then I patch that section only.
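
To keep myself honest about "patch that section only," a tiny check like the one below can compare two prompt versions and report which sections changed. It's a sketch under my own assumptions: prompts are stored as dicts mapping a section name to its bullets, and `changed_sections` and `assert_single_patch` are hypothetical names.

```python
Prompt = dict[str, list[str]]


def changed_sections(old: Prompt, new: Prompt) -> list[str]:
    """Return the names of sections whose bullets differ between versions."""
    names = sorted(set(old) | set(new))
    return [name for name in names if old.get(name, []) != new.get(name, [])]


def assert_single_patch(old: Prompt, new: Prompt) -> str:
    """Raise unless exactly one section changed; return that section's name."""
    changed = changed_sections(old, new)
    if len(changed) != 1:
        raise ValueError(f"expected exactly one changed section, got {changed}")
    return changed[0]


v1 = {
    "subject_lock": ["Preserve the same person/identity."],
    "transform": ["Remove creases, scratches, dust, and stains."],
    "constraints": ["No text, no watermark."],
}
# The model changed the hairstyle, so only the subject lock gets patched.
v2 = {**v1, "subject_lock": v1["subject_lock"] + ["Do not change hairstyle."]}
print(assert_single_patch(v1, v2))
```

If the check raises, more than one variable moved between attempts, and the next output tells you less about which change mattered.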

My take on "cinematic prompts" and camera jargon

Yes, adding photography terms can help. The community post that analyzed viral Nano Banana prompts claims specifics like lens length and film stock outperform vague adjectives [5]. I've seen that pattern too.

But here's the thing: if you're using Nano Banana for editing or identity-sensitive work, camera jargon won't save you. Constraints will.

So I treat camera jargon as optional flavor that comes after the hard requirements are nailed down.

Closing thought: prompt it like you're doing QA, not art direction

Nano Banana is strongest when you give it a stable target and tight bounds. Visual context is a control signal [1]. Output reliability comes from clear goals and contracts, plus iteration [3].

If you try one thing this week, try this: take your current Nano Banana prompt and add a Subject lock section with three "must not change" bullets. For most image editing workflows, that single change will feel like you upgraded models.

References

Documentation & Research
1. Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning - arXiv (cs.LG) - https://arxiv.org/abs/2601.21037
2. Quantum Circuit Generation via test-time learning with large language models - arXiv / The Prompt Report - http://arxiv.org/abs/2602.03466v1
3. Models Know Models Best: Evaluation via Model-Preferred Formats - arXiv (cs.CL) - https://arxiv.org/abs/2601.22699
Community Examples
4. Help me restore a childhood image of my mom - r/PromptEngineering - https://www.reddit.com/r/PromptEngineering/comments/1qsnlgz/help_me_restore_a_childhood_image_of_my_mom/
5. After analyzing 1,000+ viral prompts, I made a system prompt that auto-generates pro-level NanoBanana prompts - r/PromptEngineering - https://www.reddit.com/r/PromptEngineering/comments/1qq4tet/after_analyzing_1000_viral_prompts_i_made_a/