nano-banana-2

Gemini image generation, editing, and search-grounded image creation via gemini-3.1-flash-image-preview (Nano Banana 2).

USE FOR:

- Generating images from text prompts (text-to-image)
- Editing or transforming an existing image with text instructions
- Generating images grounded in live web/image search results

Requires the GEMINI_API_KEY environment variable. See rules/setup.md for configuration and rules/security.md for output handling guidelines.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "nano-banana-2" with this command: npx skills add frank-bot07/nano-banana-2-gemini


Gemini image generation and editing via gemini-3.1-flash-image-preview. All output images are written to .nano-banana/ in the current project directory.

Prerequisites

GEMINI_API_KEY must be set in the environment. Verify with:

echo $GEMINI_API_KEY

If empty, see rules/setup.md. For output handling and security guidelines, see rules/security.md.

Workflow

Follow this escalation pattern:

  1. Generate - Create a new image from a text prompt only.
  2. Edit - Modify an existing local image with a text instruction.
  3. Search-Grounded - Generate informed by live web/image search results (use when current visual references, styles, or real-world accuracy matter).
| Goal | Operation | When |
|---|---|---|
| Create image from scratch | generate | No source image; prompt is self-contained |
| Modify or extend an existing image | edit | Have a local PNG/JPEG to transform |
| Ground output in current web data | search-grounded | Need up-to-date styles or real-world references |

Output & Organization

All images are saved to .nano-banana/ in the current working directory. Add .nano-banana/ to .gitignore to prevent generated assets from being committed.

mkdir -p .nano-banana
# Append only if the entry is not already present, so re-running does not duplicate it
grep -qxF ".nano-banana/" .gitignore 2>/dev/null || echo ".nano-banana/" >> .gitignore

Naming conventions:

.nano-banana/gen-{slug}-{timestamp}.png
.nano-banana/edit-{slug}-{timestamp}.png
.nano-banana/search-{slug}-{timestamp}.png

Where {slug} is a short kebab-case label from the first 4-5 words of the prompt, and {timestamp} is YYYYMMDD-HHMMSS.
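A small helper sketch that builds filenames following this convention (the function name `output_path` is illustrative, not part of the skill's scripts):

```python
import datetime
import re

def output_path(op: str, prompt: str) -> str:
    """Build a .nano-banana output path following op-{slug}-{timestamp}.png."""
    # Slug: kebab-case label from the first 4-5 words of the prompt.
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:5]
    slug = "-".join(words)
    # Timestamp: YYYYMMDD-HHMMSS, as described above.
    ts = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    return f".nano-banana/{op}-{slug}-{ts}.png"

print(output_path("gen", "A majestic mountain at sunrise, photorealistic"))
```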

After saving, open to confirm the result (`open` is macOS; use `xdg-open` on Linux):

open "$(ls -t .nano-banana/*.png | head -1)"

API Reference

| Property | Value |
|---|---|
| Model | gemini-3.1-flash-image-preview |
| Endpoint | https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-flash-image-preview:generateContent |
| Auth header | x-goog-api-key: $GEMINI_API_KEY |
| Image output | candidates[0].content.parts[].inlineData.data (base64 PNG) |

Resolution options (imageConfig.imageSize)

| Value | Resolution |
|---|---|
| 512 | 0.5K (fastest) |
| 1024 | 1K (default) |
| 2048 | 2K |
| 4096 | 4K |

Aspect ratio options (imageConfig.aspectRatio)

1:1, 16:9, 9:16, 1:4, 4:1, 1:8, 8:1, 2:3, 3:2

Thinking mode (generationConfig.thinkingConfig.thinkingBudget)

An integer token budget. Set inside generationConfig, not at the top level.

| Value | Behaviour |
|---|---|
| 0 | Thinking off; fastest, lowest cost (default) |
| 1024 | Light thinking |
| 8192 | Deep thinking; recommended for grounded tasks |
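For example, the placement described above looks like this (values illustrative):

```json
{
  "generationConfig": {
    "responseModalities": ["TEXT", "IMAGE"],
    "imageConfig": {"imageSize": "1024", "aspectRatio": "16:9"},
    "thinkingConfig": {"thinkingBudget": 8192}
  }
}
```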

Operations

1. Generate (text-to-image)

python3 - <<'PYEOF'
import os, base64, json, urllib.request, datetime

api_key = os.environ["GEMINI_API_KEY"]
prompt  = "a majestic mountain at sunrise, photorealistic"
slug    = "mountain-sunrise"
size    = "1024"    # 512 | 1024 | 2048 | 4096
aspect  = "16:9"   # 1:1 | 16:9 | 9:16 | 1:4 | 4:1 | 1:8 | 8:1 | 2:3 | 3:2
thinking = 0       # 0 = off, 1024 = light, 8192 = deep

payload = {
    "contents": [{"parts": [{"text": prompt}]}],
    "generationConfig": {
        "responseModalities": ["TEXT", "IMAGE"],
        "imageConfig": {"imageSize": size, "aspectRatio": aspect},
        "thinkingConfig": {"thinkingBudget": thinking}
    }
}

url = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-3.1-flash-image-preview:generateContent"
)
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
    method="POST"
)
with urllib.request.urlopen(req) as resp:
    data = json.load(resp)

ts  = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
out = f".nano-banana/gen-{slug}-{ts}.png"
os.makedirs(".nano-banana", exist_ok=True)

for part in data["candidates"][0]["content"]["parts"]:
    if part.get("inlineData", {}).get("mimeType", "").startswith("image/"):
        with open(out, "wb") as f:
            f.write(base64.b64decode(part["inlineData"]["data"]))
        print(f"Saved: {out}")
        break
    elif part.get("text"):
        print("Model:", part["text"])
PYEOF

2. Edit (image-to-image)

The source image is base64-encoded and sent alongside the instruction text; PNG and JPEG inputs are supported.

python3 - <<'PYEOF'
import os, base64, json, urllib.request, datetime

api_key     = os.environ["GEMINI_API_KEY"]
source_img  = "path/to/source.png"          # change to actual path
instruction = "Make the sky purple and add stars"
slug        = "purple-sky-stars"
size        = "1024"
aspect      = "1:1"
thinking    = 0   # 0 = off, 1024 = light, 8192 = deep

with open(source_img, "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

ext  = source_img.rsplit(".", 1)[-1].lower()
mime = "image/jpeg" if ext in ("jpg", "jpeg") else "image/png"

payload = {
    "contents": [{
        "parts": [
            {"text": instruction},
            {"inlineData": {"mimeType": mime, "data": img_b64}}
        ]
    }],
    "generationConfig": {
        "responseModalities": ["TEXT", "IMAGE"],
        "imageConfig": {"imageSize": size, "aspectRatio": aspect},
        "thinkingConfig": {"thinkingBudget": thinking}
    }
}

url = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-3.1-flash-image-preview:generateContent"
)
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
    method="POST"
)
with urllib.request.urlopen(req) as resp:
    data = json.load(resp)

ts  = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
out = f".nano-banana/edit-{slug}-{ts}.png"
os.makedirs(".nano-banana", exist_ok=True)

for part in data["candidates"][0]["content"]["parts"]:
    if part.get("inlineData", {}).get("mimeType", "").startswith("image/"):
        with open(out, "wb") as f:
            f.write(base64.b64decode(part["inlineData"]["data"]))
        print(f"Saved: {out}")
        break
    elif part.get("text"):
        print("Model:", part["text"])
PYEOF

3. Search-Grounded Generation

Adds googleSearch with both webSearch and imageSearch types to ground the output in live web data. Use when the prompt references real-world subjects, current styles, recent events, or factual visual accuracy.

python3 - <<'PYEOF'
import os, base64, json, urllib.request, datetime

api_key  = os.environ["GEMINI_API_KEY"]
prompt   = "Generate a product photo of the latest iPhone model"
slug     = "latest-iphone-product"
size     = "1024"
aspect   = "1:1"
thinking = 8192   # deeper thinking recommended for grounded generation

payload = {
    "contents": [{"parts": [{"text": prompt}]}],
    "tools": [{
        "googleSearch": {
            "searchTypes": ["webSearch", "imageSearch"]
        }
    }],
    "generationConfig": {
        "responseModalities": ["TEXT", "IMAGE"],
        "imageConfig": {"imageSize": size, "aspectRatio": aspect},
        "thinkingConfig": {"thinkingBudget": thinking}
    }
}

url = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-3.1-flash-image-preview:generateContent"
)
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
    method="POST"
)
with urllib.request.urlopen(req) as resp:
    data = json.load(resp)

ts  = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
out = f".nano-banana/search-{slug}-{ts}.png"
os.makedirs(".nano-banana", exist_ok=True)

for part in data["candidates"][0]["content"]["parts"]:
    if part.get("inlineData", {}).get("mimeType", "").startswith("image/"):
        with open(out, "wb") as f:
            f.write(base64.b64decode(part["inlineData"]["data"]))
        print(f"Saved: {out}")
        break
    elif part.get("text"):
        print("Model:", part["text"])
PYEOF

Working with Results

# List all generated images
ls -lh .nano-banana/

# Open the most recent
open "$(ls -t .nano-banana/*.png | head -1)"

# Open all images generated today
open .nano-banana/*$(date +%Y%m%d)*.png
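Where `open`/`ls -t` are unavailable or a script needs the path, a portable Python equivalent (the helper name `newest_image` is illustrative):

```python
import pathlib

def newest_image(directory=".nano-banana"):
    """Return the most recently modified PNG in the output directory, or None."""
    pngs = sorted(pathlib.Path(directory).glob("*.png"),
                  key=lambda p: p.stat().st_mtime, reverse=True)
    return pngs[0] if pngs else None

# Example: print the latest result path, if any images exist yet.
latest = newest_image()
if latest:
    print(latest)
```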

Error Handling

If the API returns an error, the response body contains an error key. Pipe the raw response JSON into this snippet to inspect it:

python3 -c "
import json, sys
d = json.loads(sys.stdin.read())
if 'error' in d:
    print('API Error:', json.dumps(d['error'], indent=2))
elif not d.get('candidates'):
    print('No candidates:', json.dumps(d, indent=2))
"
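The same triage can live in a reusable function, e.g. when wrapping the scripts above (the function name `triage` is illustrative):

```python
import json

def triage(response: dict) -> str:
    """Classify a generateContent response dict before parsing parts."""
    if "error" in response:
        # API-level failure: bad key, quota, invalid argument, etc.
        return "error: " + json.dumps(response["error"])
    if not response.get("candidates"):
        # No candidates usually means the safety filter blocked the request.
        return "no candidates (possibly safety-filtered)"
    return "ok"
```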

Common errors:

| Error code | Cause |
|---|---|
| API_KEY_INVALID | GEMINI_API_KEY not set or incorrect |
| RESOURCE_EXHAUSTED | Quota exceeded; check billing or wait |
| INVALID_ARGUMENT | Bad imageSize or aspectRatio value |
| Empty candidates | Safety filter blocked the prompt or source image |
| 404 Not Found | Model not yet available on your API key; see rules/setup.md |
