Gemini Image Generator

A general-purpose AI image generation skill built on Gemini NanoBanana.

When to Use This Skill

Trigger when the user:

Asks to generate/create images with AI
Mentions "Gemini image", "generate picture", "create artwork"
Requests visual content from text descriptions
Wants to produce illustrations or graphics
Wants to create images matching a reference image's style (NEW!)

For specific use cases, use specialized skills:

Landing page / sales letter images → gemini-lp-generator
Webinar slides → gemini-slide-generator

Quick Start

cd /path/to/gemini-image-generator

1. Check authentication

python scripts/run.py auth_manager.py status

2. Authenticate (if needed)

python scripts/run.py auth_manager.py setup

3. Generate image (basic)

python scripts/run.py image_generator.py \
  --prompt "sunset over mountains, watercolor style" \
  --output output/my_image.png

4. Generate with reference image (NEW!)

python scripts/run.py image_generator.py \
  --prompt "draw a dog" \
  --reference-image "/path/to/reference.png" \
  --output output/styled_dog.png

How It Works

Basic Mode (text only)

Navigate to gemini.google.com
Click "ツール" (Tools) button
Select "画像を作成" (Create Image) - Activates NanoBanana
Enter prompt and generate
Download generated image

Reference Image Mode (with a reference image) - NEW!

Upload reference image to Gemini
AI analyzes visual elements (style, colors, lighting, etc.)
Extract the analysis in YAML format
Generate optimized meta-prompt
Create new image with matching style

┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│ 📷 Reference   │ → │ 📋 YAML        │ → │ 📝 Optimized   │
│    Image       │   │    Analysis    │   │    Prompt      │
└────────────────┘   └────────────────┘   └────────────────┘
                                                  │
                                                  ▼
                                          ┌────────────────┐
                                          │ 🖼️ Generated   │
                                          │    Image       │
                                          └────────────────┘
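
The step from YAML analysis to optimized prompt can be pictured with a small sketch. The field names (`style`, `colors`, `lighting`), the function name `build_meta_prompt`, and the prompt wording are illustrative assumptions, not the schema or template that `prompt_extractor.py` and `meta_prompt.py` actually use. A plain dict stands in for the parsed YAML.

```python
# Illustrative only: the real YAML schema and prompt template are not shown
# in this document, so every key below is a hypothetical placeholder.

def build_meta_prompt(analysis: dict, request: str) -> str:
    """Combine a user request with style attributes from a reference analysis."""
    parts = [request.strip()]
    # Hypothetical keys; only those present (and non-empty) are appended.
    for key in ("style", "colors", "lighting"):
        value = analysis.get(key)
        if not value:
            continue
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        parts.append(f"{key}: {value}")
    return "; ".join(parts)


analysis = {
    "style": "watercolor",
    "colors": ["soft blue", "warm orange"],
    "lighting": "diffuse daylight",
}
prompt = build_meta_prompt(analysis, "draw a dog")
print(prompt)
```

The point of the intermediate analysis is that the user's request stays short while the style details come from the reference image.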

Parameters

| Parameter | Required | Default | Description |
|---|---|---|---|
| --prompt | Yes | | Image generation prompt |
| --output | No | output/generated_image.png | Output file path |
| --reference-image | No | | Reference image for style extraction |
| --yaml-output | No | | Save YAML analysis to file |
| --show-browser | No | False | Show browser for debugging |
| --timeout | No | 180 | Max wait time in seconds |
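
To make the table concrete, here is a minimal `argparse` sketch that mirrors it. This is not the parser from `image_generator.py`, whose source is not shown here. It assumes `--show-browser` is a plain on/off switch (the examples pass it without a value) and that `--timeout` is an integer.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser that mirrors the parameter table; not the script's actual code."""
    parser = argparse.ArgumentParser(prog="image_generator.py")
    parser.add_argument("--prompt", required=True,
                        help="Image generation prompt")
    parser.add_argument("--output", default="output/generated_image.png",
                        help="Output file path")
    parser.add_argument("--reference-image", default=None,
                        help="Reference image for style extraction")
    parser.add_argument("--yaml-output", default=None,
                        help="Save YAML analysis to file")
    # Assumption: a boolean switch that defaults to False.
    parser.add_argument("--show-browser", action="store_true",
                        help="Show browser for debugging")
    # Assumption: whole seconds.
    parser.add_argument("--timeout", type=int, default=180,
                        help="Max wait time in seconds")
    return parser


args = build_parser().parse_args(
    ["--prompt", "sunset over mountains", "--timeout", "300"]
)
print(args.output, args.show_browser, args.timeout)
```

Only `--prompt` is mandatory; every other option falls back to the default listed in the table.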

Prompt Examples

Basic Examples (text only)

Landscape

python scripts/run.py image_generator.py \
  --prompt "serene sunset over snow-capped mountains, warm orange sky, photorealistic"

Art style

python scripts/run.py image_generator.py \
  --prompt "watercolor painting of a cat sitting by window, soft colors"

Product photo

python scripts/run.py image_generator.py \
  --prompt "professional product photography, white background, soft lighting"

Reference Image Examples (with a reference image) - NEW!

Match style of reference image

python scripts/run.py image_generator.py \
  --prompt "draw a dog" \
  --reference-image "examples/watercolor_cat.png" \
  --output output/watercolor_dog.png

Save YAML analysis for review

python scripts/run.py image_generator.py \
  --prompt "forest landscape" \
  --reference-image "examples/sunset.jpg" \
  --yaml-output output/analysis.yaml \
  --output output/forest.png

Debug mode with browser visible

python scripts/run.py image_generator.py \
  --prompt "cafe interior" \
  --reference-image "examples/cozy_room.png" \
  --show-browser \
  --output output/cafe.png
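
When several images should share one reference style, the same commands can be assembled from Python. The sketch below only builds argument lists from the flags documented above. The helper names `build_command` and `run_all` are hypothetical, and it uses `sys.executable` in place of the bare `python` shown in the examples.

```python
import shlex
import subprocess
import sys


def build_command(prompt, output=None, reference_image=None,
                  yaml_output=None, show_browser=False, timeout=None):
    """Return the argv list for one image_generator.py call."""
    cmd = [sys.executable, "scripts/run.py", "image_generator.py",
           "--prompt", prompt]
    if output:
        cmd += ["--output", output]
    if reference_image:
        cmd += ["--reference-image", reference_image]
    if yaml_output:
        cmd += ["--yaml-output", yaml_output]
    if show_browser:
        cmd.append("--show-browser")
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    return cmd


def run_all(commands, cwd="."):
    """Run each command from the skill directory; returns the exit codes."""
    return [subprocess.run(cmd, cwd=cwd).returncode for cmd in commands]


# Two subjects rendered in the style of the same reference image.
subjects = {"dog": "draw a dog", "forest": "forest landscape"}
commands = [
    build_command(prompt, output=f"output/{name}.png",
                  reference_image="examples/watercolor_cat.png")
    for name, prompt in subjects.items()
]
for cmd in commands:
    print(shlex.join(cmd[1:]))  # skip the interpreter path for readability
```

Calling `run_all(commands, cwd="/path/to/gemini-image-generator")` would then execute them one after another. Passing a list rather than a shell string avoids quoting problems with prompts that contain spaces or quotes.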

Standalone Tools

Extract YAML only (without generating image)

python scripts/run.py prompt_extractor.py \
  --image "examples/reference.png" \
  --output analysis.yaml

Generate prompt from YAML

python scripts/run.py meta_prompt.py \
  --yaml analysis.yaml \
  --request "draw a cat"

Authentication

This skill manages browser authentication for all Gemini-based skills:

gemini-slide-generator (shares browser profile)
gemini-lp-generator (shares browser profile)

Check status

python scripts/run.py auth_manager.py status

Setup (opens browser for Google login)

python scripts/run.py auth_manager.py setup

Clear session

python scripts/run.py auth_manager.py clear

Troubleshooting

| Problem | Solution |
|---|---|
| Not authenticated | Run auth_manager.py setup |
| Timeout | Increase with --timeout 300 |
| UI not found | Use --show-browser to debug |
| Generation refused | Modify prompt (avoid restricted content) |
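
The timeout advice can be automated by retrying with a longer limit. The sketch below assumes that the script exits with a non-zero status when generation fails or times out, which this document does not confirm. `generate_with_retry`, `run_cli`, and `fake_run` are hypothetical names.

```python
import subprocess
import sys
from typing import Callable, Sequence


def generate_with_retry(run: Callable[[int], int],
                        timeouts: Sequence[int] = (180, 300)) -> int:
    """Call run(timeout) with each timeout in turn until it returns 0.

    Returns the timeout that succeeded; raises RuntimeError if none did.
    """
    for timeout in timeouts:
        if run(timeout) == 0:
            return timeout
    raise RuntimeError(f"generation failed for timeouts {list(timeouts)}")


def run_cli(timeout: int) -> int:
    """Real runner: one CLI call, exit code returned (assumed 0 on success)."""
    cmd = [sys.executable, "scripts/run.py", "image_generator.py",
           "--prompt", "sunset over mountains", "--timeout", str(timeout)]
    return subprocess.run(cmd).returncode


def fake_run(timeout: int) -> int:
    """Stand-in for demonstration: succeeds only with at least 300 seconds."""
    return 0 if timeout >= 300 else 1


print(generate_with_retry(fake_run))
```

In practice `run_cli` would be passed instead of `fake_run`. A retry will not help with the other rows of the table, such as a refused prompt or a changed UI.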

Data Storage

data/browser_profile/   # Browser session (shared with other Gemini skills)
data/state.json         # Authentication state
output/                 # Generated images

Architecture

scripts/
├── config.py            # Centralized settings
├── browser_utils.py     # BrowserFactory and StealthUtils
├── auth_manager.py      # Authentication management
├── image_generator.py   # Image generation (with reference image support)
├── prompt_extractor.py  # Extract visual elements as YAML (NEW!)
├── meta_prompt.py       # Generate optimized prompts from YAML (NEW!)
└── run.py               # Wrapper script for venv

docs/
└── UPGRADE_SPEC.md      # Feature specification with diagrams

Notes

The first generation takes longer (browser startup)
Subsequent generations are faster (session reuse)
Authentication persists for about 7 days
UI selectors may break when the Gemini web UI is updated
