Nano Banana - AI Image Generation

Install skill "nanobanana" with this command: npx skills add kenneth-liao/ai-launchpad-marketplace/kenneth-liao-ai-launchpad-marketplace-nanobanana

Generate and edit images using Google Gemini models. Supports two models:

  • Pro (gemini-3-pro-image-preview) — High quality, complex prompts, thinking mode

  • Flash (gemini-2.5-flash-image) — Fast, cheap, good for iteration

Prerequisites

Required:

  • GEMINI_API_KEY — Get from Google AI Studio

  • uv (recommended) or Python 3.10+ with google-genai installed

With uv (recommended — zero setup): Dependencies are declared inline via PEP 723 and auto-installed on first run. Just use uv run instead of python3.

With pip (fallback):

pip install -r ${CLAUDE_SKILL_DIR}/requirements.txt

Quick Start

Default output: Images save to ~/Downloads/nanobanana_<timestamp>.png automatically. Do NOT pass -o unless the user specifies where to save. If the user provides a filename without a directory (e.g., "save it as robot.png"), use -o ~/Downloads/robot.png.

Generate an image:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "a cute robot mascot, pixel art style"

Edit an existing image:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "make the background blue" -i input.jpg

Use Flash model for fast iteration:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "quick sketch of a cat" --model flash

Multi-image reference (style + subject):

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "apply the style of the first image to the second" \
  -i style_ref.png subject.jpg

Generate with specific aspect ratio and resolution:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "cinematic landscape" --ratio 21:9 --size 4K

Save to a specific location:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "logo design" -o ~/Projects/brand/logo.png

Model Selection Guide

| | Pro (default) | Flash |
|---|---|---|
| Speed | Slower | ~2-3x faster |
| Cost | Higher | Lower |
| Text rendering | Good | Unreliable |
| Complex scenes | Excellent | Adequate |
| Thinking mode | Yes | No |
| Best for | Final production images | Exploration, drafts, batch |

Rule of thumb: Use Flash for exploration and batch generation, Pro for final output.
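The alias-to-model mapping implied by this guide can be sketched as follows (model IDs are taken from the list at the top of this document; resolve_model is an illustrative helper, not part of the scripts):

```python
# Short aliases accepted by -m/--model, mapped to full Gemini model IDs.
MODEL_IDS = {
    "pro": "gemini-3-pro-image-preview",
    "flash": "gemini-2.5-flash-image",
}

def resolve_model(name: str) -> str:
    """Accept 'pro', 'flash', or a full model ID; return the full model ID."""
    return MODEL_IDS.get(name, name)
```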

Script Reference

scripts/generate.py

Main image generation script.

Usage: generate.py [OPTIONS] PROMPT

Arguments: PROMPT Text prompt for image generation

Options:
  -o, --output PATH    Output file path (default: ~/Downloads/nanobanana_<timestamp>.png)
  -i, --input PATH...  Input image(s) for editing / reference (up to 14)
  -m, --model MODEL    Model: 'pro' (default), 'flash', or full model ID
  -r, --ratio RATIO    Aspect ratio (1:1, 16:9, 9:16, 21:9, etc.)
  -s, --size SIZE      Image size: 1K, 2K, or 4K (default: standard)
  --search             Enable Google Search grounding for accuracy
  --retries N          Max retries on rate limit (default: 3)
  -v, --verbose        Show detailed output

Supported aspect ratios:

  • 1:1 — Square (default)

  • 2:3, 3:2 — Portrait/Landscape

  • 3:4, 4:3 — Standard

  • 4:5, 5:4 — Photo

  • 9:16, 16:9 — Widescreen

  • 21:9 — Ultra-wide/Cinematic

Image sizes:

  • 1K — Fast, lower detail

  • 2K — Enhanced detail (2048px)

  • 4K — Maximum quality (3840px), best for text rendering

scripts/batch_generate.py

Generate multiple images with sequential naming.

Usage: batch_generate.py [OPTIONS] PROMPT

Arguments: PROMPT Text prompt for image generation

Options:
  -n, --count N       Number of images to generate (default: 10)
  -d, --dir PATH      Output directory (default: ~/Downloads)
  -p, --prefix STR    Filename prefix (default: "image")
  -m, --model MODEL   Model: 'pro' (default), 'flash', or full model ID
  -r, --ratio RATIO   Aspect ratio
  -s, --size SIZE     Image size (1K/2K/4K)
  --search            Enable Google Search grounding
  --retries N         Max retries per image on rate limit (default: 3)
  --delay SECONDS     Delay between generations (default: 3)
  --parallel N        Concurrent requests (default: 1, max recommended: 5)
  -q, --quiet         Suppress progress output

Example:

uv run ${CLAUDE_SKILL_DIR}/scripts/batch_generate.py "pixel art logo" -n 20 --model flash -d ./logos -p logo

Python API

Direct import (from another skill's script):

Note: When importing as a Python module, google-genai must be available in the calling script's environment. If using uv run, add a PEP 723 dependencies block to your own script (see example in Pattern 2 below).

import sys
from pathlib import Path

sys.path.insert(0, str(Path("${CLAUDE_SKILL_DIR}/scripts")))
from generate import generate_image, edit_image, batch_generate

Generate image

result = generate_image(
    prompt="a futuristic city at night",
    output_path="city.png",
    aspect_ratio="16:9",
    image_size="4K",
    model="pro",
)

Edit existing image

result = edit_image(
    prompt="add flying cars to the sky",
    input_path="city.png",
    output_path="city_edited.png",
)

Multi-image reference

result = generate_image(
    prompt="combine the color palette of the first with the composition of the second",
    input_paths=["palette_ref.png", "composition_ref.png"],
    output_path="combined.png",
)

Return structure (always present):

{
    "success": True,                 # or False
    "path": "/path/to/output.png",   # or None on failure
    "error": None,                   # or error message string
    "metadata": {
        "model": "gemini-3-pro-image-preview",
        "prompt": "...",
        "aspect_ratio": "16:9",
        "image_size": "4K",
        "use_search": False,
        "input_images": None,        # or list of paths
        "text_response": "...",      # optional text from model
        "thinking": "...",           # Pro model reasoning (when available)
        "timestamp": "2025-01-26T...",
    },
}
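A caller can branch on this structure. A minimal sketch, assuming only the fields shown above (check_result is an illustrative name, not part of the scripts):

```python
def check_result(result: dict) -> str:
    """Return the saved image path on success; raise with the reported error otherwise."""
    if not result["success"]:
        raise RuntimeError(f"Generation failed: {result['error']}")
    return result["path"]
```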

Downstream Skill Integration Guide

Pattern 1: CLI wrapper (recommended for simple use)

In your skill's script:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py "{prompt}" --model flash --ratio 16:9 -o output.png

Pattern 2: Python import with custom defaults

# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "google-genai>=1.0.0",
# ]
# ///

import sys
from pathlib import Path

NANOBANANA_DIR = Path("${CLAUDE_SKILL_DIR}/scripts")
sys.path.insert(0, str(NANOBANANA_DIR))
from generate import generate_image

def generate_thumbnail(prompt: str, output_path: str) -> dict:
    """Generate a YouTube thumbnail with project defaults."""
    return generate_image(
        prompt=prompt,
        output_path=output_path,
        aspect_ratio="16:9",
        image_size="2K",
        model="flash",
        max_retries=3,
    )

Pattern 3: Batch with progress tracking

from batch_generate import batch_generate

def on_progress(completed, total, result):
    print(f"Progress: {completed}/{total}")

results = batch_generate(
    prompt="logo concept",
    count=20,
    output_dir="./logos",
    prefix="logo",
    model="flash",
    aspect_ratio="1:1",
    on_progress=on_progress,
)

successful = [r for r in results if r["success"]]

Pattern 4: Sequential generation for series

When a downstream skill needs multiple consistently styled images (e.g., newsletter visuals, thumbnail A/B variants), use the anchor-and-reference pattern:

from generate import generate_image

Step 1: Generate the style anchor

anchor = generate_image(
    prompt="warm illustration style, earth tones, soft gradients, clean lines",
    output_path="anchor.png",
    model="pro",
)

Step 2: Generate each image in the series, referencing the anchor

subjects = ["laptop on desk with coffee", "person reading a book", "sunrise over mountains"]
series_paths = [anchor["path"]]

for i, subject in enumerate(subjects):
    result = generate_image(
        prompt=f"{subject}, matching the visual style and color palette of the reference image exactly",
        input_paths=[anchor["path"]],  # always include the anchor
        output_path=f"series_{i+1:02d}.png",
        model="pro",
    )
    if result["success"]:
        series_paths.append(result["path"])

The full sequential generation patterns are documented in the Sequential Generation section below.

Environment Variables

| Variable | Description | Default |
|---|---|---|
| GEMINI_API_KEY | Google Gemini API key | Required |
| IMAGE_OUTPUT_DIR | Default output directory | ~/Downloads |
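A script might resolve these variables as follows. This is a sketch assuming the documented ~/Downloads fallback; resolve_config is an illustrative name:

```python
import os
from pathlib import Path

def resolve_config() -> tuple[str, Path]:
    """Read GEMINI_API_KEY (required) and IMAGE_OUTPUT_DIR (optional)."""
    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        raise SystemExit("GEMINI_API_KEY environment variable not set")
    out_dir = Path(os.environ.get("IMAGE_OUTPUT_DIR", "~/Downloads")).expanduser()
    return api_key, out_dir
```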

Features

Text-to-Image Generation

Create images from text descriptions. Both models excel at:

  • Photorealistic images

  • Artistic styles (pixel art, illustration, etc.)

  • Product photography

  • Landscapes and scenes

Image Editing

Transform existing images with natural language:

  • Style transfer

  • Object addition/removal

  • Background changes

  • Color adjustments

Multi-Image Reference

Provide up to 14 reference images for:

  • Style consistency across a series

  • Subject consistency (same character, different poses)

  • Brand-consistent generation

  • Style + subject combination

High-Resolution Output

  • 1K — Fast generation, good for drafts

  • 2K — Enhanced detail (2048px)

  • 4K — Maximum quality (3840px), best for text rendering

Google Search Grounding

Enable --search for factually accurate images involving:

  • Real people, places, landmarks

  • Current events

  • Specific products or brands

Automatic Retry

Rate limit errors are automatically retried with exponential backoff (default: 3 retries). No action needed from callers.
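The retry behavior can be sketched generically as follows (the real implementation lives in scripts/generate.py and may differ in which exception type it catches):

```python
import time

def with_retries(fn, max_retries: int = 3, base_delay: float = 1.0):
    """Call fn(), retrying rate-limit failures with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RuntimeError:  # stand-in for the API's rate-limit error type
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```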

SynthID Watermark Notice

All images generated by Gemini contain an invisible SynthID digital watermark. This is automatic, cannot be disabled, and survives common transformations (resize, crop, compression). Be aware of this for any use case requiring watermark-free output.

Sequential Generation

Use sequential generation to maintain visual consistency across a series of images. The core technique: generate an anchor image first, then pass it as a reference (-i) for every subsequent image in the series.

Pattern 1: Style-Board Anchoring

Generate a single anchor image that establishes the visual identity for a series. Reference it for all subsequent images.

When to use: Newsletter visual series, A/B thumbnail variants, brand-consistent image batches.

Workflow:

  1. Generate the anchor image with a prompt emphasizing style, palette, and mood:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "modern flat illustration style, warm earth tones, soft gradients, clean lines, minimal detail, cozy atmosphere" \
  --model pro -o anchor.png

  2. Generate each subsequent image referencing the anchor:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "a laptop on a desk with coffee, matching the visual style, color palette, and lighting of the reference image exactly" \
  -i anchor.png --model pro -o image_01.png

  3. Repeat step 2 for each image in the series, always referencing the same anchor.

Tip: Use Flash to draft the anchor quickly, then regenerate with Pro once you find a style you like.

Pattern 2: Subject Consistency

Keep the same character or subject looking consistent across different scenes and poses.

When to use: Mascot in multiple contexts, product photography series, recurring character.

Workflow:

  1. Generate the initial subject with a clear, detailed appearance description:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "a friendly robot mascot with round blue body, orange antenna, large expressive eyes, simple geometric design, standing front-facing on white background" \
  --model pro -o subject_front.png

  2. Generate new scenes referencing the subject:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "the same robot character from the reference image, now sitting at a desk typing, same proportions and colors, office background" \
  -i subject_front.png --model pro -o subject_office.png

  3. For stronger consistency, reference 2-3 of the best previous outputs:

uv run ${CLAUDE_SKILL_DIR}/scripts/generate.py \
  "the same robot character from the reference images, now outdoors in a park, same proportions and colors, waving at the viewer" \
  -i subject_front.png subject_office.png --model pro -o subject_park.png

Pattern 3: Progressive Accumulation

Build a reference pool over a long series, adding each successful output as a reference for the next.

When to use: Series of 5+ images where consistency must compound across the full set.

Workflow:

  1. Generate the anchor (same as Pattern 1, step 1).

  2. Generate image 2 referencing the anchor.

  3. Generate image 3 referencing anchor + image 2.

  4. Continue, keeping the 3-4 strongest references in the -i list. Drop weaker outputs.

Why cap at 3-4 references: More references dilute the style signal. The model averages across all inputs — too many and the result loses coherence. Keep only the images that best represent the target style.

Reference ordering matters: Place the style anchor first in the -i list. The model weights earlier references slightly more.
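The reference-pool bookkeeping for this pattern can be sketched as follows (build_reference_list is an illustrative helper, not part of the scripts):

```python
MAX_REFS = 4  # cap from the guidance above: more references dilute the style signal

def build_reference_list(anchor_path: str, outputs: list[str]) -> list[str]:
    """Anchor first (weighted slightly more), then the most recent outputs, capped at MAX_REFS."""
    return [anchor_path] + outputs[-(MAX_REFS - 1):]
```

Pass the returned list as input_paths (Python) or -i (CLI) for the next generation in the series.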

Best Practices

Prompt Writing

Good prompts include:

  • Subject description

  • Style/aesthetic

  • Lighting and mood

  • Composition details

  • Color palette

See references/prompts.md for detailed prompt templates by category and model-specific tips.

Batch Generation Tips

  • Use --model flash for exploration batches (faster, cheaper)

  • Generate 10-20 variations to explore options

  • Default 3-second delay between sequential requests avoids rate limits

  • Review results and iterate on best candidates with Pro model

Rate Limits

  • Gemini API has usage quotas (~10 RPM free tier)

  • Automatic retry with exponential backoff handles transient rate limits

  • For large batches, use --delay 5 or --parallel with modest concurrency

  • Check your quota at Google AI Studio

Troubleshooting

"uv: command not found"

  • Install uv (see https://docs.astral.sh/uv/) or use the pip fallback below

"Error: google-genai package not installed"

  • Use uv run instead of python3 to auto-install dependencies

  • Or install manually: pip install -r ${CLAUDE_SKILL_DIR}/requirements.txt

"GEMINI_API_KEY environment variable not set"

  • Set GEMINI_API_KEY in your environment before running

"No image in response"

  • Prompt may have triggered safety filters

  • Try rephrasing to avoid sensitive content

"Rate limit exceeded after N retries"

  • Wait 30-60 seconds and try again

  • Reduce batch parallelism or add longer delays

  • Check your API quota

Import errors in batch_generate.py

  • The script handles its own path setup; run from any directory

Future Capabilities

Multi-turn conversational editing — The Gemini API supports stateful chat sessions for iterative image editing (e.g., "make it bluer" → "now add a hat" → "zoom out"). This requires fundamentally different stateful architecture and is not currently implemented. No downstream skill currently needs this.

References

  • references/prompts.md — Prompt examples, model-specific tips, multi-reference patterns

  • references/gemini-api.md — Curated API reference for agent context
