image-gen

Generate images using Google's Nano Banana 2 (Gemini 3.1 Flash Image) with workflow-based prompting

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "image-gen" with this command: npx skills add krishagel/geoffrey/krishagel-geoffrey-image-gen

Image Generation Skill

Generate professional images, infographics, and diagrams using Google's Nano Banana 2 model (gemini-3.1-flash-image-preview).

Model Capabilities

Nano Banana 2 (released February 26, 2026):

  • Text rendering - Accurate, legible text in images
  • Google Search grounding - Real-time data (weather, stocks, etc.)
  • Subject consistency - Up to 5 characters maintained across generations
  • Multi-turn conversation - Iterative refinement
  • Up to 14 reference images - For composition and style transfer
  • Resolutions: 1K, 2K, 4K
  • Aspect ratios: 1:1, 2:3, 3:2, 4:3, 16:9, 21:9

Scripts

All scripts use Python via uv run with inline dependencies.

generate.py - Text to Image

uv run scripts/generate.py "prompt" output.png [aspect_ratio] [size]

Examples:

# Basic image
uv run scripts/generate.py "A cozy coffee shop in autumn" coffee.png

# Infographic with specific aspect ratio
uv run scripts/generate.py "Infographic explaining how neural networks work" nn.png 16:9 2K

# 4K professional image
uv run scripts/generate.py "Professional headshot, studio lighting" headshot.png 3:2 4K
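
When calling generate.py from other Python code, it can help to validate the optional arguments before spawning the process. The following is a minimal sketch, not part of the skill: build_generate_command and run_generate are hypothetical helper names, the allowed values are copied from the Model Capabilities list, and the positional order is taken from the usage line above. Whether generate.py performs its own validation is not stated here.

```python
import subprocess

# Values copied from the Model Capabilities list in this document.
ASPECT_RATIOS = {"1:1", "2:3", "3:2", "4:3", "16:9", "21:9"}
SIZES = {"1K", "2K", "4K"}


def build_generate_command(prompt, output, aspect_ratio=None, size=None):
    """Return the argv list for scripts/generate.py after checking optional args."""
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    if size is not None and size not in SIZES:
        raise ValueError(f"unsupported size: {size}")
    if size is not None and aspect_ratio is None:
        # Both arguments are positional, so size cannot appear without aspect_ratio.
        raise ValueError("size requires aspect_ratio because arguments are positional")
    cmd = ["uv", "run", "scripts/generate.py", prompt, output]
    if aspect_ratio is not None:
        cmd.append(aspect_ratio)
    if size is not None:
        cmd.append(size)
    return cmd


def run_generate(prompt, output, aspect_ratio=None, size=None):
    """Hypothetical wrapper: run the script and return the output path for the user."""
    subprocess.run(build_generate_command(prompt, output, aspect_ratio, size), check=True)
    return output
```

Passing the command as a list avoids shell quoting problems with prompts that contain quotes or spaces.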

edit.py - Image Editing

uv run scripts/edit.py input.png "edit instructions" output.png

Examples:

# Edit existing image
uv run scripts/edit.py photo.png "Change the background to a beach sunset" edited.png

compose.py - Multi-Image Composition

uv run scripts/compose.py "prompt" output.png --refs image1.png image2.png

Examples:

# Combine styles from multiple images
uv run scripts/compose.py "Combine these styles into a logo" logo.png --refs style1.png style2.png
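
The 14-image reference limit can be checked before compose.py is invoked. This is a small sketch under the assumption that the --refs flag accepts a space-separated list, as the usage line suggests; build_compose_command is a hypothetical name, not something the skill provides.

```python
MAX_REFS = 14  # limit stated under Model Capabilities and Limitations


def build_compose_command(prompt, output, refs):
    """Return the argv list for scripts/compose.py, enforcing the reference limit."""
    refs = list(refs)
    if not refs:
        raise ValueError("compose needs at least one reference image")
    if len(refs) > MAX_REFS:
        raise ValueError(f"at most {MAX_REFS} reference images are supported, got {len(refs)}")
    return ["uv", "run", "scripts/compose.py", prompt, output, "--refs", *refs]
```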

Workflows

Workflows provide structured approaches for specific visual types. Each workflow follows the PAI 6-step editorial process:

  1. Extract narrative - Understand the complete story/concept
  2. Derive visual concept - Single metaphor with 2-3 physical objects
  3. Apply aesthetic - Define style, colors, mood
  4. Construct prompt - Build detailed generation instructions
  5. Generate - Execute via script
  6. Validate - Check against criteria, regenerate if needed
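
One way to picture the six steps is as a small data structure plus a retry loop. The sketch below is illustrative only: VisualBrief, to_prompt, and generate_with_validation are hypothetical names, the prompt wording is an assumption, and the generate and validate callables stand in for the script call and whatever criteria the chosen workflow file defines.

```python
from dataclasses import dataclass


@dataclass
class VisualBrief:
    narrative: str   # step 1: the complete story or concept
    metaphor: str    # step 2: the single visual metaphor
    objects: list    # step 2: the 2-3 physical objects that carry it
    aesthetic: str   # step 3: style, colors, mood

    def to_prompt(self):
        """Step 4: assemble the generation prompt (wording is an assumption)."""
        if not 2 <= len(self.objects) <= 3:
            raise ValueError("a visual concept should use 2-3 physical objects")
        return (
            f"{self.metaphor}, featuring {', '.join(self.objects)}. "
            f"Style: {self.aesthetic}. Concept: {self.narrative}"
        )


def generate_with_validation(brief, generate, validate, max_attempts=3):
    """Steps 5-6: generate, check against criteria, regenerate if needed."""
    prompt = brief.to_prompt()
    for attempt in range(1, max_attempts + 1):
        path = generate(prompt)   # e.g. a wrapper around scripts/generate.py
        if validate(path):        # caller-supplied acceptance check
            return path, attempt
    raise RuntimeError(f"no acceptable image after {max_attempts} attempts")
```

Capping the number of attempts keeps a failing prompt from silently consuming paid generations.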

Available Workflows

  • infographic.md - Data visualization, statistics, explainers
  • diagram.md - Technical diagrams, flowcharts, architecture

Workflow Usage

When generating images, follow the appropriate workflow:

For Infographics

1. What data/concept needs visualization?
2. What's the key insight or takeaway?
3. Aspect ratio: 16:9 (landscape) recommended
4. Include: clear hierarchy, minimal text, supporting icons
5. Generate at 2K minimum for text clarity

For Diagrams

1. What system/process is being illustrated?
2. What are the key components and relationships?
3. Style: flat colors, clean lines, minimal detail
4. Generate at 2K for label clarity
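
The two checklists above can be captured as presets so the recommended settings are applied consistently. This is a sketch under stated assumptions: WORKFLOW_PRESETS and plan_generation are hypothetical names, the prompt template is invented for illustration, and no aspect ratio is set for diagrams because the checklist does not name one.

```python
WORKFLOW_PRESETS = {
    "infographic": {
        "aspect_ratio": "16:9",  # landscape, as recommended above
        "size": "2K",            # minimum suggested for text clarity
        "style": "clear hierarchy, minimal text, supporting icons",
    },
    "diagram": {
        "aspect_ratio": None,    # the diagram checklist names no ratio
        "size": "2K",            # suggested for label clarity
        "style": "flat colors, clean lines, minimal detail",
    },
}


def plan_generation(workflow, subject, key_point):
    """Combine the answers to the checklist questions with the workflow preset."""
    preset = WORKFLOW_PRESETS[workflow]  # raises KeyError for unknown workflows
    prompt = (
        f"{workflow.capitalize()} of {subject}. "
        f"Key point: {key_point}. Style: {preset['style']}."
    )
    return {
        "prompt": prompt,
        "aspect_ratio": preset["aspect_ratio"],
        "size": preset["size"],
    }
```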

Environment Setup

Requires the GEMINI_API_KEY environment variable. Load it from Geoffrey's secrets file before running any script:

source ~/Library/Mobile\ Documents/com~apple~CloudDocs/Geoffrey/secrets/.env
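
A script or wrapper can fail early with a clear message when the key is missing. The check below is a minimal sketch; require_gemini_api_key is a hypothetical helper, and how the skill's own scripts read the variable is not shown in this document.

```python
import os


def require_gemini_api_key(environ=None):
    """Return GEMINI_API_KEY, or raise with a hint if it is missing or blank."""
    env = os.environ if environ is None else environ
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; source the secrets .env file first"
        )
    return key
```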

Best Practices

Infographics

  • Use simple, direct prompts: "Infographic explaining how X works"
  • Model auto-includes relevant icons/logos
  • 16:9 aspect ratio works best
  • Generate at 2K+ for readable text

General

  • Multi-turn refinement: generate, then ask for specific changes
  • Reference images improve consistency
  • Be specific about style, mood, lighting
  • SynthID watermark is automatic (Google provenance)

Output Location

By default, save images to /tmp/ or user-specified paths. For persistent storage, use:

~/Library/Mobile Documents/com~apple~CloudDocs/Geoffrey/images/
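
The choice between the temporary and persistent locations can be expressed as a small path helper. This is a sketch only: resolve_output_path and its parameters are hypothetical, and the directory is assumed to exist already.

```python
from pathlib import Path

# Persistent location quoted from this document.
PERSISTENT_DIR = "~/Library/Mobile Documents/com~apple~CloudDocs/Geoffrey/images"


def resolve_output_path(filename, persistent=False, user_dir=None):
    """Pick the output directory: user-specified, persistent, or /tmp by default."""
    if user_dir is not None:
        base = Path(user_dir).expanduser()
    elif persistent:
        base = Path(PERSISTENT_DIR).expanduser()
    else:
        base = Path("/tmp")
    # Keep only the file name so a stray directory prefix cannot redirect the output.
    return base / Path(filename).name
```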

⚠️ CRITICAL: Never Read Generated Images

DO NOT use the Read tool on generated images.

Why:

  • Generated images are for the user to view, not for the assistant to analyze
  • This applies even though 4K (3840x2160) and 2K (2560x1440) images are within the 8000px limit
  • For edits, use the edit.py script, not the Read tool

Workflow:

  1. Generate image with script
  2. Return file path to user
  3. User views the high-quality output

Limitations

  • No photorealistic humans (safety filter)
  • No copyrighted characters
  • Maximum 14 reference images for composition
  • 4K only available with Nano Banana 2 and Nano Banana Pro

Pricing

Size    Cost per Image
1K      Free tier / $0.04
2K      $0.134
4K      $0.24
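
For budgeting a batch, the per-image prices above can be summed directly. The sketch below uses Decimal to avoid floating-point rounding and assumes every 1K image is billed at the paid rate, since the free-tier allowance is not quantified here; estimate_cost is a hypothetical helper.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the pricing table in this document.
# Assumption: 1K images are billed at $0.04 (free tier ignored).
PRICE_PER_IMAGE = {
    "1K": Decimal("0.04"),
    "2K": Decimal("0.134"),
    "4K": Decimal("0.24"),
}


def estimate_cost(counts):
    """Return the total USD cost for a mapping of size -> number of images."""
    total = Decimal("0")
    for size, n in counts.items():
        if size not in PRICE_PER_IMAGE:
            raise ValueError(f"unknown size: {size}")
        if n < 0:
            raise ValueError("image counts must be non-negative")
        total += PRICE_PER_IMAGE[size] * n
    return total
```

For example, ten 2K images and two 4K images come to 10 x 0.134 + 2 x 0.24 = 1.82 USD.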

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
