Nano Banana - AI Image Generation

Install skill "nanobanana" with this command: npx skills add kenneth-liao/ai-launchpad-marketplace/kenneth-liao-ai-launchpad-marketplace-nanobanana

Generate and edit images using Google Gemini models. Supports two models:

  • Pro (gemini-3-pro-image-preview) — High quality, complex prompts, thinking mode

  • Flash (gemini-2.5-flash-image) — Fast, cheap, good for iteration

Prerequisites

Required:

  • GEMINI_API_KEY — Get from Google AI Studio

  • uv (recommended) or Python 3.10+ with google-genai installed

With uv (recommended — zero setup): Dependencies are declared inline via PEP 723 and auto-installed on first run. Just use uv run instead of python3.

With pip (fallback):

pip install -r ${CLAUDE_SKILL_DIR}/requirements.txt

Quick Start

Default output: Images save to ~/Downloads/nanobanana_<timestamp>.png automatically. Do NOT pass -o unless the user specifies where to save. If the user provides a filename without a directory (e.g., "save it as robot.png"), use -o ~/Downloads/robot.png.

Generate an image:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "a cute robot mascot, pixel art style"

Edit an existing image:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "make the background blue" -i input.jpg

Use Flash model for fast iteration:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "quick sketch of a cat" --model flash

Multi-image reference (style + subject):

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "apply the style of the first image to the second" \
  -i style_ref.png subject.jpg

Generate with specific aspect ratio and resolution:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "cinematic landscape" --ratio 21:9 --size 4K

Save to a specific location:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "logo design" -o ~/Projects/brand/logo.png

Model Selection Guide

| | Pro (default) | Flash |
|---|---|---|
| Speed | Slower | ~2-3x faster |
| Cost | Higher | Lower |
| Text rendering | Good | Unreliable |
| Complex scenes | Excellent | Adequate |
| Thinking mode | Yes | No |
| Best for | Final production images | Exploration, drafts, batch |

Rule of thumb: Use Flash for exploration and batch generation, Pro for final output.
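The alias-to-model mapping implied by this guide can be sketched as follows (model IDs are taken from the list at the top of this document; resolve_model is an illustrative helper, not part of the scripts):

```python
# Short aliases accepted by -m/--model, mapped to full Gemini model IDs.
MODEL_IDS = {
    "pro": "gemini-3-pro-image-preview",
    "flash": "gemini-2.5-flash-image",
}

def resolve_model(name: str) -> str:
    """Accept 'pro', 'flash', or a full model ID; return the full model ID."""
    return MODEL_IDS.get(name, name)
```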

Script Reference

scripts/generate.py

Main image generation script.

Usage: generate.py [OPTIONS] PROMPT

Arguments: PROMPT Text prompt for image generation

Options:
  -o, --output PATH    Output file path (default: ~/Downloads/nanobanana_<timestamp>.png)
  -i, --input PATH...  Input image(s) for editing / reference (up to 14)
  -m, --model MODEL    Model: 'pro' (default), 'flash', or full model ID
  -r, --ratio RATIO    Aspect ratio (1:1, 16:9, 9:16, 21:9, etc.)
  -s, --size SIZE      Image size: 1K, 2K, or 4K (default: standard)
  --search             Enable Google Search grounding for accuracy
  --retries N          Max retries on rate limit (default: 3)
  -v, --verbose        Show detailed output

Supported aspect ratios:

  • 1:1 — Square (default)

  • 2:3, 3:2 — Portrait/Landscape

  • 3:4, 4:3 — Standard

  • 4:5, 5:4 — Photo

  • 9:16, 16:9 — Widescreen

  • 21:9 — Ultra-wide/Cinematic

Image sizes:

  • 1K — Fast, lower detail

  • 2K — Enhanced detail (2048px)

  • 4K — Maximum quality (3840px), best for text rendering

scripts/batch_generate.py

Generate multiple images with sequential naming.

Usage: batch_generate.py [OPTIONS] PROMPT

Arguments: PROMPT Text prompt for image generation

Options:
  -n, --count N       Number of images to generate (default: 10)
  -d, --dir PATH      Output directory (default: ~/Downloads)
  -p, --prefix STR    Filename prefix (default: "image")
  -m, --model MODEL   Model: 'pro' (default), 'flash', or full model ID
  -r, --ratio RATIO   Aspect ratio
  -s, --size SIZE     Image size (1K/2K/4K)
  --search            Enable Google Search grounding
  --retries N         Max retries per image on rate limit (default: 3)
  --delay SECONDS     Delay between generations (default: 3)
  --parallel N        Concurrent requests (default: 1, max recommended: 5)
  -q, --quiet         Suppress progress output

Example:

uv run ${CLAUDE_SKILL_DIR}/scripts/batch_generate.py "pixel art logo" -n 20 --model flash -d ./logos -p logo

Python API

Direct import (from another skill's script):

Note: When importing as a Python module, google-genai must be available in the calling script's environment. If using uv run, add a PEP 723 dependencies block to your own script (see example in Pattern 2 below).

import sys
from pathlib import Path

sys.path.insert(0, str(Path("${CLAUDE_SKILL_DIR}/scripts")))
from generate import generate_image, edit_image, batch_generate

Generate image

result = generate_image(
    prompt="a futuristic city at night",
    output_path="city.png",
    aspect_ratio="16:9",
    image_size="4K",
    model="pro",
)

Edit existing image

result = edit_image(
    prompt="add flying cars to the sky",
    input_path="city.png",
    output_path="city_edited.png",
)

Multi-image reference

result = generate_image(
    prompt="combine the color palette of the first with the composition of the second",
    input_paths=["palette_ref.png", "composition_ref.png"],
    output_path="combined.png",
)

Return structure (always present):

{
    "success": True,                 # or False
    "path": "/path/to/output.png",   # or None on failure
    "error": None,                   # or error message string
    "metadata": {
        "model": "gemini-3-pro-image-preview",
        "prompt": "...",
        "aspect_ratio": "16:9",
        "image_size": "4K",
        "use_search": False,
        "input_images": None,        # or list of paths
        "text_response": "...",      # optional text from model
        "thinking": "...",           # Pro model reasoning (when available)
        "timestamp": "2025-01-26T...",
    },
}
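A caller can branch on this structure. A minimal sketch, assuming only the fields shown above (check_result is an illustrative name, not part of the scripts):

```python
def check_result(result: dict) -> str:
    """Return the saved image path on success; raise with the reported error otherwise."""
    if not result["success"]:
        raise RuntimeError(f"Generation failed: {result['error']}")
    return result["path"]
```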

Downstream Skill Integration Guide

Pattern 1: CLI wrapper (recommended for simple use)

In your skill's script:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "{prompt}" --model flash --ratio 16:9 -o output.png

Pattern 2: Python import with custom defaults

# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "google-genai>=1.0.0",
# ]
# ///

import sys
from pathlib import Path

NANOBANANA_DIR = Path("${CLAUDE_SKILL_DIR}/scripts")
sys.path.insert(0, str(NANOBANANA_DIR))
from generate import generate_image

def generate_thumbnail(prompt: str, output_path: str) -> dict:
    """Generate a YouTube thumbnail with project defaults."""
    return generate_image(
        prompt=prompt,
        output_path=output_path,
        aspect_ratio="16:9",
        image_size="2K",
        model="flash",
        max_retries=3,
    )

Pattern 3: Batch with progress tracking

from batch_generate import batch_generate

def on_progress(completed, total, result):
    print(f"Progress: {completed}/{total}")

results = batch_generate(
    prompt="logo concept",
    count=20,
    output_dir="./logos",
    prefix="logo",
    model="flash",
    aspect_ratio="1:1",
    on_progress=on_progress,
)

successful = [r for r in results if r["success"]]

Pattern 4: Sequential generation for series

When a downstream skill needs multiple consistently styled images (e.g., newsletter visuals, thumbnail A/B variants), use the anchor-and-reference pattern:

from generate import generate_image

Step 1: Generate the style anchor

anchor = generate_image(
    prompt="warm illustration style, earth tones, soft gradients, clean lines",
    output_path="anchor.png",
    model="pro",
)

Step 2: Generate each image in the series, referencing the anchor

subjects = ["laptop on desk with coffee", "person reading a book", "sunrise over mountains"]
series_paths = [anchor["path"]]

for i, subject in enumerate(subjects):
    result = generate_image(
        prompt=f"{subject}, matching the visual style and color palette of the reference image exactly",
        input_paths=[anchor["path"]],  # always include the anchor
        output_path=f"series_{i+1:02d}.png",
        model="pro",
    )
    if result["success"]:
        series_paths.append(result["path"])

The full sequential generation patterns are documented in the Sequential Generation section below.

Environment Variables

| Variable | Description | Default |
|---|---|---|
| GEMINI_API_KEY | Google Gemini API key | Required |
| IMAGE_OUTPUT_DIR | Default output directory | ~/Downloads |
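A script might resolve these variables as follows. This is a sketch assuming the documented ~/Downloads fallback; resolve_config is an illustrative name:

```python
import os
from pathlib import Path

def resolve_config() -> tuple[str, Path]:
    """Read GEMINI_API_KEY (required) and IMAGE_OUTPUT_DIR (optional)."""
    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        raise SystemExit("GEMINI_API_KEY environment variable not set")
    out_dir = Path(os.environ.get("IMAGE_OUTPUT_DIR", "~/Downloads")).expanduser()
    return api_key, out_dir
```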

Features

Text-to-Image Generation

Create images from text descriptions. Both models excel at:

  • Photorealistic images

  • Artistic styles (pixel art, illustration, etc.)

  • Product photography

  • Landscapes and scenes

Image Editing

Transform existing images with natural language:

  • Style transfer

  • Object addition/removal

  • Background changes

  • Color adjustments

Multi-Image Reference

Provide up to 14 reference images for:

  • Style consistency across a series

  • Subject consistency (same character, different poses)

  • Brand-consistent generation

  • Style + subject combination

High-Resolution Output

  • 1K — Fast generation, good for drafts

  • 2K — Enhanced detail (2048px)

  • 4K — Maximum quality (3840px), best for text rendering

Google Search Grounding

Enable --search for factually accurate images involving:

  • Real people, places, landmarks

  • Current events

  • Specific products or brands

Automatic Retry

Rate limit errors are automatically retried with exponential backoff (default: 3 retries). No action needed from callers.
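The retry behavior can be sketched generically as follows (the real implementation lives in scripts/generate.py and may differ in which exception type it catches):

```python
import time

def with_retries(fn, max_retries: int = 3, base_delay: float = 1.0):
    """Call fn(), retrying rate-limit failures with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RuntimeError:  # stand-in for the API's rate-limit error type
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```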

SynthID Watermark Notice

All images generated by Gemini contain an invisible SynthID digital watermark. This is automatic, cannot be disabled, and survives common transformations (resize, crop, compression). Be aware of this for any use case requiring watermark-free output.

Sequential Generation

Use sequential generation to maintain visual consistency across a series of images. The core technique: generate an anchor image first, then pass it as a reference (-i) for every subsequent image in the series.

Pattern 1: Style-Board Anchoring

Generate a single anchor image that establishes the visual identity for a series. Reference it for all subsequent images.

When to use: Newsletter visual series, A/B thumbnail variants, brand-consistent image batches.

Workflow:

  1. Generate the anchor image with a prompt emphasizing style, palette, and mood:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "modern flat illustration style, warm earth tones, soft gradients, clean lines, minimal detail, cozy atmosphere" \
  --model pro -o anchor.png

  2. Generate each subsequent image referencing the anchor:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "a laptop on a desk with coffee, matching the visual style, color palette, and lighting of the reference image exactly" \
  -i anchor.png --model pro -o image_01.png

  3. Repeat step 2 for each image in the series, always referencing the same anchor.

Tip: Use Flash to draft the anchor quickly, then regenerate with Pro once you find a style you like.

Pattern 2: Subject Consistency

Keep the same character or subject looking consistent across different scenes and poses.

When to use: Mascot in multiple contexts, product photography series, recurring character.

Workflow:

  1. Generate the initial subject with a clear, detailed appearance description:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "a friendly robot mascot with round blue body, orange antenna, large expressive eyes, simple geometric design, standing front-facing on white background" \
  --model pro -o subject_front.png

  2. Generate new scenes referencing the subject:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "the same robot character from the reference image, now sitting at a desk typing, same proportions and colors, office background" \
  -i subject_front.png --model pro -o subject_office.png

  3. For stronger consistency, reference 2-3 of the best previous outputs:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "the same robot character from the reference images, now outdoors in a park, same proportions and colors, waving at the viewer" \
  -i subject_front.png subject_office.png --model pro -o subject_park.png

Pattern 3: Progressive Accumulation

Build a reference pool over a long series, adding each successful output as a reference for the next.

When to use: Series of 5+ images where consistency must compound across the full set.

Workflow:

  1. Generate the anchor (same as Pattern 1, step 1).

  2. Generate image 2 referencing the anchor.

  3. Generate image 3 referencing anchor + image 2.

  4. Continue, keeping the 3-4 strongest references in the -i list. Drop weaker outputs.

Why cap at 3-4 references: More references dilute the style signal. The model averages across all inputs — too many and the result loses coherence. Keep only the images that best represent the target style.

Reference ordering matters: Place the style anchor first in the -i list. The model weights earlier references slightly more.
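The reference-pool bookkeeping for this pattern can be sketched as follows (build_reference_list is an illustrative helper, not part of the scripts):

```python
MAX_REFS = 4  # cap from the guidance above: more references dilute the style signal

def build_reference_list(anchor_path: str, outputs: list[str]) -> list[str]:
    """Anchor first (weighted slightly more), then the most recent outputs, capped at MAX_REFS."""
    return [anchor_path] + outputs[-(MAX_REFS - 1):]
```

Pass the returned list as input_paths (Python) or -i (CLI) for the next generation in the series.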

Best Practices

Prompt Writing

Good prompts include:

  • Subject description

  • Style/aesthetic

  • Lighting and mood

  • Composition details

  • Color palette

See references/prompts.md for detailed prompt templates by category and model-specific tips.

Batch Generation Tips

  • Use --model flash for exploration batches (faster, cheaper)

  • Generate 10-20 variations to explore options

  • Default 3-second delay between sequential requests avoids rate limits

  • Review results and iterate on best candidates with Pro model

Rate Limits

  • Gemini API has usage quotas (~10 RPM free tier)

  • Automatic retry with exponential backoff handles transient rate limits

  • For large batches, use --delay 5 or --parallel with modest concurrency

  • Check your quota at Google AI Studio

Troubleshooting

"uv: command not found"

  • Install uv (see https://docs.astral.sh/uv/) or use the pip fallback below

"Error: google-genai package not installed"

  • Use uv run instead of python3 to auto-install dependencies

  • Or install manually: pip install -r ${CLAUDE_SKILL_DIR}/requirements.txt

"GEMINI_API_KEY environment variable not set"

  • Set GEMINI_API_KEY in your environment before running

"No image in response"

  • Prompt may have triggered safety filters

  • Try rephrasing to avoid sensitive content

"Rate limit exceeded after N retries"

  • Wait 30-60 seconds and try again

  • Reduce batch parallelism or add longer delays

  • Check your API quota

Import errors in batch_generate.py

  • The script handles its own path setup; run from any directory

Future Capabilities

Multi-turn conversational editing — The Gemini API supports stateful chat sessions for iterative image editing (e.g., "make it bluer" → "now add a hat" → "zoom out"). This requires fundamentally different stateful architecture and is not currently implemented. No downstream skill currently needs this.

References

  • references/prompts.md — Prompt examples, model-specific tips, multi-reference patterns

  • references/gemini-api.md — Curated API reference for agent context
