Nanobanana Image Generation
Overview
This skill now supports two modes:
- image mode: Gemini or Nanobanana generation and editing through the official generateContent flow
- plot mode: exact Python or matplotlib rendering of publication-style figures from numeric data
Use image mode for mechanism figures, graphical abstracts, device schematics, style-matched redraws, and diagram-first work.
Use plot mode for exact bar charts, trend curves, heatmaps, scatter plots, and multi-panel figures that must preserve numeric truth.
When the user is working in Codex and describes a plot in natural language, do not require them to hand-write a JSON spec. Codex should translate the request into an internal plot request or spec and run the plotting scripts.
For image mode, follow Google's official examples and replace:
- API key with the provider key
- base URL with the chosen Google-compatible Gemini endpoint
Do not use OpenAI-style /images/generations or /images/edits routes for this skill.
Attachment-Only Inputs
If the image exists only as a chat attachment and the platform does not expose a local file path, do not claim the script can upload it directly.
Use this rule:
- If the user needs an exact edit of the original uploaded pixels, ask for the local file path first.
- If the user accepts a close recreation, analyze the attached image visually and generate a new image that preserves the original composition and style as closely as possible.
For requests like "replace the English text in this attached image with Chinese", the fallback recreation workflow is acceptable when exact pixel-preserving edit is impossible.
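The attachment rule above can be read as a small decision function. The following is only an illustrative sketch: the function name and the returned labels are hypothetical and are not part of the bundled scripts.

```python
def attachment_strategy(has_local_path: bool, needs_exact_edit: bool) -> str:
    """Pick a workflow for an input image (hypothetical helper, labels are illustrative)."""
    if has_local_path:
        # The script can read the real pixels, so a true edit is possible.
        return "edit"
    if needs_exact_edit:
        # Attachment-only input cannot be uploaded; ask the user for a path first.
        return "ask_for_path"
    # The user accepts a close recreation of the attached image.
    return "recreate"
```

For the "replace the English text with Chinese" case with no file path and no strict pixel requirement, this sketch returns "recreate".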
Quick Start
Set environment variables:
export NANOBANANA_API_KEY="your-provider-key"
export NANOBANANA_BASE_URL="https://generativelanguage.googleapis.com"
export NANOBANANA_MODEL="gemini-3.1-flash-image-preview"
Optional third-party provider:
export NANOBANANA_BASE_URL="https://api.zhizengzeng.com/google"
export NANOBANANA_ALLOW_THIRD_PARTY=1
If you do not want the API key to appear in the command line, store it in a file and use:
export NANOBANANA_API_KEY_FILE="$PWD/.secrets/nanobanana_api_key"
Generate an image:
python3 scripts/generate_image.py "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme"
Edit an image:
python3 scripts/generate_image.py "Using the provided image, change only the blue sofa to a vintage brown leather Chesterfield sofa. Keep everything else exactly the same." --input-image ./living-room.png
Recreate an attached diagram with translated labels:
python3 scripts/generate_image.py "Recreate the attached pastel technical diagram with the same layout, icons, arrows, and hand-drawn style. Replace all visible English labels with natural Simplified Chinese. Keep the composition unchanged." --aspect-ratio 16:9 --image-size 2K
Safety note:
- scripts/build_materials_figure_prompt.py and --print-prompt are local-only and do not send data over the network.
- Actual prompt text, API keys, and user-provided input images are sent only when you run the generation scripts against the configured provider.
- Non-official Gemini-compatible endpoints require explicit confirmation via --allow-third-party or NANOBANANA_ALLOW_THIRD_PARTY=1.
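One way to picture the third-party confirmation is a host check before any request is built. This is a minimal sketch under assumptions: the function name is hypothetical, and the bundled CLIs may decide what counts as "official" differently. Only the official host name comes from this document.

```python
from urllib.parse import urlparse

# Official host as listed in the Quick Start section.
OFFICIAL_HOST = "generativelanguage.googleapis.com"


def check_endpoint(base_url: str, allow_third_party: bool = False) -> bool:
    """Return True for the official endpoint; raise unless third-party use is confirmed.

    Hypothetical helper, not the scripts' actual implementation.
    """
    host = (urlparse(base_url).hostname or "").lower()
    if host == OFFICIAL_HOST:
        return True
    if not allow_third_party:
        raise PermissionError(
            f"{host or base_url!r} is not the official Gemini endpoint; "
            "pass --allow-third-party or set NANOBANANA_ALLOW_THIRD_PARTY=1 to confirm."
        )
    # Third-party endpoint, explicitly confirmed by the caller.
    return False
```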
Workflow
Choose a mode first:
- If the user supplied numeric data and needs exact plotting, use plot mode. Read references/publication-plot-api.md and run scripts/plot_publication_figure.py. For natural-language requests, also read references/natural-language-plot-workflow.md.
- If the user needs a schematic, graphical abstract, or image editing workflow, use image mode. Follow the Gemini generateContent flow below.
For image mode:
- Keep the official Gemini request shape. Use POST /v1beta/models/{model}:generateContent with X-goog-api-key.
- Put prompt text and image inputs into contents[].parts. Text-only generation uses one text part. Image editing appends one or more inline image parts.
- Put image options in generationConfig.imageConfig. Prefer --aspect-ratio and --image-size, matching the official docs.
- For materials-science figures, prefer building the final prompt first. Use python3 scripts/build_materials_figure_prompt.py --materials-figure ... when you want to inspect or refine the prompt before sending any API request.
- For publication-style research figures, load the bundled design guides as needed. Read references/publication-figure-design.md for house style, palette semantics, typography, and panel logic.
- If the figure contains chart-like panels, read references/publication-chart-patterns.md. Use those patterns to specify grouped bars, heatmaps, trend layouts, dedicated legends, and wide comparison panels.
- Save image outputs from candidates[0].content.parts[].inlineData. Save text parts too when returned.
- If the source image is attachment-only, choose between exact edit and recreation. Ask for a local path for exact editing. Use recreation if the user wants the result and accepts a visually matched redraw.
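The request shape described in the image-mode steps can be sketched as follows. This builds the URL, headers, and JSON body without sending anything. The function name is hypothetical, and the aspectRatio and imageSize field names inside imageConfig are assumptions inferred from the --aspect-ratio and --image-size options; check references/api-reference.md and the official docs before relying on them.

```python
import base64


def build_request(base_url, model, api_key, prompt,
                  images=(), aspect_ratio=None, image_size=None):
    """Return (url, headers, body) for a generateContent call without sending it.

    images is an iterable of (mime_type, raw_bytes) pairs. Hypothetical helper.
    """
    url = f"{base_url.rstrip('/')}/v1beta/models/{model}:generateContent"
    headers = {"X-goog-api-key": api_key, "Content-Type": "application/json"}

    # Text-only generation uses one text part; editing appends inline image parts.
    parts = [{"text": prompt}]
    for mime_type, raw in images:
        parts.append({
            "inlineData": {
                "mimeType": mime_type,
                "data": base64.b64encode(raw).decode("ascii"),
            }
        })

    body = {"contents": [{"parts": parts}]}

    # Assumed field names for the image options (see lead-in).
    image_config = {}
    if aspect_ratio:
        image_config["aspectRatio"] = aspect_ratio
    if image_size:
        image_config["imageSize"] = image_size
    if image_config:
        body["generationConfig"] = {"imageConfig": image_config}

    return url, headers, body
```

Keeping the builder separate from the network call makes it easy to print and inspect the payload before any key or image leaves the machine.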
For plot mode:
- Read references/publication-plot-api.md.
- If the user is speaking naturally, infer the plotting intent and data structure. Do not ask the user to author the internal spec unless they explicitly want low-level control.
- For concise internal translation, optionally create a request JSON and expand it with scripts/build_plot_spec.py.
- Build or generate a JSON spec with top-level style, layout, and panels.
- Use bar, trend, heatmap, scatter, legend, or empty panels.
- Render with:
python3 skills/nanobanana-image-generation/scripts/plot_publication_figure.py spec.json
- Export exact PNG, PDF, or SVG outputs.
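Before rendering, a quick sanity check of the spec outline can catch obvious mistakes. The sketch below relies only on the three top-level keys and the six panel kinds named above. It assumes that panels is a list of objects and that each panel carries its kind under a "type" key; both are hypothetical here, and references/publication-plot-api.md is the authority on the real schema.

```python
ALLOWED_PANEL_TYPES = {"bar", "trend", "heatmap", "scatter", "legend", "empty"}
REQUIRED_TOP_LEVEL = ("style", "layout", "panels")


def validate_spec(spec: dict) -> list:
    """Return a list of problems; an empty list means the outline looks sane."""
    problems = [
        f"missing top-level key: {key}"
        for key in REQUIRED_TOP_LEVEL
        if key not in spec
    ]
    for index, panel in enumerate(spec.get("panels", [])):
        kind = panel.get("type")  # hypothetical key name, see lead-in
        if kind not in ALLOWED_PANEL_TYPES:
            problems.append(f"panel {index}: unknown type {kind!r}")
    return problems
```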
Environment
Required:
- NANOBANANA_API_KEY
- NANOBANANA_BASE_URL: Must be set explicitly. Official Google endpoint: https://generativelanguage.googleapis.com
Optional:
- NANOBANANA_MODEL: Default: gemini-3.1-flash-image-preview
- NANOBANANA_TIMEOUT: Default: 120
- NANOBANANA_API_KEY_FILE: Path to a file containing the API key. Prefer this when you do not want the key shown in command history or command logs.
- NANOBANANA_ALLOW_THIRD_PARTY: Set to 1 only when you intentionally want to send API keys and user-provided files to a non-official Gemini-compatible provider.
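The variables above could be resolved along these lines. This is a sketch, not the scripts' actual logic: the function name is hypothetical, and giving NANOBANANA_API_KEY precedence over NANOBANANA_API_KEY_FILE is an assumption. The defaults are the ones listed in this section.

```python
import os
from pathlib import Path

DEFAULT_MODEL = "gemini-3.1-flash-image-preview"
DEFAULT_TIMEOUT = 120


def load_settings(env=None):
    """Resolve settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Assumed precedence: inline key first, then the key file.
    api_key = env.get("NANOBANANA_API_KEY", "").strip()
    key_file = env.get("NANOBANANA_API_KEY_FILE", "").strip()
    if not api_key and key_file:
        api_key = Path(key_file).read_text(encoding="utf-8").strip()
    if not api_key:
        raise ValueError("set NANOBANANA_API_KEY or NANOBANANA_API_KEY_FILE")

    # The base URL has no default and must be set explicitly.
    base_url = env.get("NANOBANANA_BASE_URL", "").strip()
    if not base_url:
        raise ValueError("NANOBANANA_BASE_URL must be set explicitly")

    return {
        "api_key": api_key,
        "base_url": base_url,
        "model": env.get("NANOBANANA_MODEL") or DEFAULT_MODEL,
        "timeout": float(env.get("NANOBANANA_TIMEOUT") or DEFAULT_TIMEOUT),
        "allow_third_party": env.get("NANOBANANA_ALLOW_THIRD_PARTY") == "1",
    }
```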
Scripts
- scripts/generate_image.py: Python CLI that follows the official Gemini generateContent request shape.
- scripts/generate_image.js: Node.js CLI with the same request format.
- scripts/plot_publication_figure.py: Python CLI for exact publication-style plotting from JSON specs.
- scripts/build_plot_spec.py: Python CLI that expands a concise request JSON into a full plotting spec.
Common options:
- --input-image ./source.png
- --prompt-file ./background.md
- --aspect-ratio 16:9
- --image-size 2K
- --text-only
- --thinking-level high
- --include-thoughts
- --materials-figure mechanism-figure
- --lang zh
- --style-note "Nature Energy style"
- --print-prompt
- --allow-third-party
- --api-key-file ./.secrets/nanobanana_api_key
Default output location:
- ./output/nanobanana/ relative to the current Codex working directory
- Override only when the user explicitly wants another folder
Deterministic plotting:
python3 skills/nanobanana-image-generation/scripts/plot_publication_figure.py ./spec.json \
--out-path ./output/plots/result \
--formats png pdf svg \
--dpi 300
Natural-language-friendly internal workflow:
python3 skills/nanobanana-image-generation/scripts/build_plot_spec.py ./request.json --out ./spec.json
python3 skills/nanobanana-image-generation/scripts/plot_publication_figure.py ./spec.json
Official Mapping
Official Google examples:
api_key="GEMINI_API_KEY"
base_url="https://generativelanguage.googleapis.com"
Third-party provider replacements:
api_key="your_provider_api_key"
base_url="your_google_compatible_endpoint"
allow_third_party=true
Optional Zhizengzeng example:
api_key="your_zzz_api_key"
base_url="https://api.zhizengzeng.com/google"
allow_third_party=true
Everything else should stay aligned with the official Gemini documentation.
Prompting Rules
- For generation, describe the scene instead of dumping keywords.
- For editing, explicitly say what must stay unchanged.
- For multi-image workflows, describe the role of each reference image.
- Prefer English or zh-CN prompts when image fidelity matters.
- For attachment-only translation tasks, list each label that must be rewritten so the regenerated image does not miss text.
- If layout fidelity matters, explicitly say to preserve icon positions, arrows, spacing, hierarchy, and reading order.
- For publication figures, specify semantic color roles, panel order, arrow logic, and which elements should stay neutral.
- Keep figure text short. Prefer concise labels and legend entries over paragraph-like annotations baked into the image.
- If the figure resembles a plot, say whether it is a conceptual chart, a style-matched redraw, or an exact quantitative reproduction.
Materials Science Figure Shortcut
If the user asks for a materials-science paper figure, journal-style scientific schematic, graphical abstract, mechanism diagram, synthesis workflow figure, microstructure-property diagram, device architecture figure, or characterization-plan figure, use the bundled materials-science templates instead of writing the prompt from scratch.
Workflow:
- Read references/materials-science-figure-template.md.
- Pick the closest subtype: graphical-abstract, mechanism-figure, device-architecture, or processing-workflow.
- Choose the output language: en or zh.
- Insert the user's scientific content into the Scientific Background slot, or use the script shortcut directly.
- Preserve the template's constraints about causality, palette, typography, layout, and avoiding unsupported claims.
- If the user did not provide exact numbers, keep labels qualitative or explicitly use placeholders rather than fabricating data.
- If the user wants a specific journal style, append that preference after the template rather than rewriting the template.
- If the scientific background is long, put it in a markdown file and use --prompt-file or scripts/build_materials_figure_prompt.py --background-file ... instead of squeezing it into one shell argument.
- For prompt refinement, consult:
Research Figure Design Integration
This skill includes a distilled publication-figure playbook adapted from the figures4papers project. Use it to make Nanobanana outputs look like journal figures rather than generic AI art.
Read the reference files only as needed:
- references/publication-figure-design.md Use for overall figure art direction: typography, palette semantics, panel hierarchy, white-background policy, legend handling, and print-safe simplification.
- references/publication-chart-patterns.md Use when the figure contains bars, trend lines, heatmaps, comparison matrices, or dedicated legend panels.
Apply these rules when prompting:
- Keep the overall composition minimal, high-contrast, and panel-driven.
- Use blue for the primary mechanism or proposed method, green for improvements, red for contrasts, and neutral gray for scaffolds/background categories.
- Ask for short professional labels, frameless legends, and uncluttered white backgrounds.
- Preserve consistent visual encoding across panels so the same color always means the same phase, state, or method.
- For chart-like figures, ask the model to mimic publication layout and styling, but do not imply exact quantitative correctness unless the figure is being recreated from provided source data or reference images.
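The prompting rules above can be captured as a reusable style clause that is appended to a figure prompt. This is only an illustrative sketch: the helper name and the exact wording of the clause are hypothetical, and only the color roles are taken from the rules.

```python
# Color roles as listed in the rules above.
COLOR_ROLES = {
    "blue": "the primary mechanism or proposed method",
    "green": "improvements",
    "red": "contrasts",
    "neutral gray": "scaffolds and background categories",
}


def style_clause(roles=COLOR_ROLES) -> str:
    """Build a short style instruction to append to a figure prompt."""
    color_text = "; ".join(f"{color} for {role}" for color, role in roles.items())
    return (
        "Minimal, high-contrast, panel-driven composition on an uncluttered "
        "white background. "
        f"Color semantics, consistent across panels: {color_text}. "
        "Use short professional labels and frameless legends."
    )
```

Reusing one clause across prompts helps keep the same color meaning from panel to panel and from figure to figure.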
Quantitative Boundary
This skill is strong for:
- graphical abstracts
- mechanism figures
- device schematics
- processing workflows
- chart-like conceptual panels
- style-matched redraws of existing paper figures
This skill is not a guarantee of exact quantitative plotting. If the user needs exact bar heights, exact heatmap values, or faithful axis tick math from raw numbers, treat Nanobanana as a layout or visual-direction tool unless the request is explicitly a redraw from a trusted reference image.
For exact plotting, switch to plot mode and use references/publication-plot-api.md plus scripts/plot_publication_figure.py.
Python shortcut:
python3 scripts/generate_image.py "paste the scientific background here" \
--materials-figure mechanism-figure \
--lang en \
--style-note "Benchmark the figure against Nature Materials aesthetics." \
--aspect-ratio 4:3 \
--image-size 2K
JavaScript shortcut:
node scripts/generate_image.js "paste the scientific background here" \
--materials-figure graphical-abstract \
--lang zh \
--aspect-ratio 4:3 \
--image-size 2K
Prompt-only preflight:
python3 scripts/build_materials_figure_prompt.py \
--materials-figure mechanism-figure \
--lang en \
--background-file ./background.md \
--style-note "Nature Materials aesthetic with concise panel labels."
Failure Handling
- If the API returns 401 or 403, verify NANOBANANA_API_KEY.
- If the CLI says the base URL is missing, set NANOBANANA_BASE_URL or pass --base-url.
- If the CLI refuses a non-official endpoint, add --allow-third-party or set NANOBANANA_ALLOW_THIRD_PARTY=1 only if that provider is intentional.
- If the API returns 404, verify that the request is going to /v1beta/models/{model}:generateContent.
- If the provider says the model does not exist, verify the exact model name in the official docs and the provider's supported model list.
- If no image is returned, inspect candidates[0].content.parts and check whether the request asked for image output.
- If the user supplied only a chat attachment and no file path, do not describe the result as an exact edit unless the platform actually exposed the attachment bytes.
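When debugging a response with no image, it can help to split the parts of the first candidate into images and text. The following sketch assumes the camelCase inlineData, mimeType, and data fields in the response JSON; the function name is hypothetical and the bundled scripts may handle this differently.

```python
import base64


def split_parts(response: dict):
    """Return (images, texts) from candidates[0].content.parts.

    images is a list of (mime_type, raw_bytes); texts is a list of strings.
    """
    candidates = response.get("candidates") or []
    if not candidates:
        return [], []

    parts = candidates[0].get("content", {}).get("parts", [])
    images, texts = [], []
    for part in parts:
        inline = part.get("inlineData")
        if inline and inline.get("data"):
            # Image payloads arrive base64-encoded.
            mime_type = inline.get("mimeType", "application/octet-stream")
            images.append((mime_type, base64.b64decode(inline["data"])))
        elif "text" in part:
            texts.append(part["text"])
    return images, texts
```

An empty images list together with non-empty texts usually means the model answered in text only, which is the case the "no image is returned" item points at.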
References
- Read references/api-reference.md for the official request shape.
- Read references/prompt-templates.md for generation and editing prompt scaffolds.
- Read references/materials-science-figure-template.md when generating materials-science paper figures.
- Read references/publication-figure-design.md for publication-style research figure rules adapted from figures4papers.
- Read references/publication-chart-patterns.md for chart and multi-panel layout patterns.
- Read references/publication-plot-api.md for exact plotting from numeric data.
- Read references/natural-language-plot-workflow.md when the user describes an exact plot in natural language and Codex needs to translate it into an internal plotting request.