gpt-image-2-generation
Generate images from natural-language prompts using the gpt-image-2 model hosted at WellAPI. The skill sends a POST request to https://wellapi.ai/v1/images/generations, base64-decodes the b64_json field of the response, and writes the image to disk.
When to use
Trigger this skill when the user asks for things like:
- "Generate an image of a sunset over the ocean"
- "Draw a cat wearing a top hat"
- "Create a 1024x1024 picture of …"
- "Make an illustration / poster / artwork of …"
- Any other request to produce a visual from a textual description.
This skill covers text-to-image generation only. If the user asks for image editing (in-painting, variations, etc.), tell them it is not supported here.
Prerequisites
- API key: the user must supply a WellAPI key.
  - If the environment variable WELLAPI_API_KEY is set, it is used.
  - Otherwise the skill looks for a local config file:
    - Linux/macOS: ~/.config/gpt-image-2-generation/config.json
    - Windows: %USERPROFILE%\.config\gpt-image-2-generation\config.json
  - If neither is present, the skill prompts the user to register and enter a key (see First-run onboarding below).
- Python 3.7+ with the standard library only (no third-party packages required).
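The lookup order above can be sketched as follows. This is a minimal illustration and not the bundled scripts/api_key.py: the function names and the "api_key" field inside config.json are assumptions, since the skill does not document the file's layout.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def default_config_path() -> Path:
    # ~/.config/gpt-image-2-generation/config.json on Linux/macOS; on Windows
    # Path.home() normally resolves to %USERPROFILE%.
    return Path.home() / ".config" / "gpt-image-2-generation" / "config.json"


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    config_path: Optional[Path] = None,
) -> Optional[str]:
    """Return the key from the environment, else the config file, else None."""
    env = os.environ if env is None else env
    key = env.get("WELLAPI_API_KEY", "").strip()
    if key:
        return key

    path = default_config_path() if config_path is None else config_path
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None  # missing or unreadable config: the caller starts onboarding
    # "api_key" is an assumed field name, not confirmed by the skill.
    key = str(data.get("api_key", "")).strip() if isinstance(data, dict) else ""
    return key or None
```

A return value of None corresponds to the "neither is present" case, where the skill moves on to first-run onboarding.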
First-run onboarding (no API key found)
When no key can be located, before making any API call tell the user verbatim (translate to the user's language if appropriate):
To use this skill you need a free WellAPI API key. Please register at https://wellapi.ai/register?aff=DTDH to obtain one, then paste it below.
Then call the helper script to securely store the key:
python3 scripts/setup_api_key.py
The script will prompt the user for the key on stdin and persist it to the per-user config file with 0600 permissions. Never echo or log the full key after it is captured.
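One way to persist a key with 0600 permissions is sketched below. It is an illustration under assumptions, not the contents of scripts/setup_api_key.py: save_api_key and mask_key are hypothetical names, and the "api_key" JSON field is assumed.

```python
import json
import os
from pathlib import Path


def save_api_key(key: str, config_path: Path) -> None:
    """Write the key to config_path, readable and writable by the owner only."""
    config_path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with mode 0600 from the start so the key is never
    # briefly readable by other users. The mode is only honoured on POSIX.
    fd = os.open(str(config_path), os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"api_key": key}, fh)  # "api_key" is an assumed field name
    os.chmod(str(config_path), 0o600)  # tighten a file that already existed


def mask_key(key: str) -> str:
    """Return a printable form that reveals at most the last four characters."""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

If a confirmation message has to mention the key at all, printing mask_key(key) in place of the key itself keeps the full value out of terminals and logs.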
How to generate an image
Use the bundled script scripts/generate_image.py. It accepts CLI arguments, builds the request, sends it with Authorization: Bearer <key>, decodes the base64 image, and writes the file.
Required argument
| Flag | Meaning |
|---|---|
| --prompt | The text description of the image to generate. |
Optional arguments (defaults match the WellAPI example)
| Flag | Default | Allowed values |
|---|---|---|
| --n | 1 | Integer, number of images |
| --size | 1024x1024 | e.g. 512x512, 1024x1024, 1024x1536, 1536x1024 |
| --quality | low | low, medium, high |
| --format | jpeg | jpeg, png, webp |
| --model | gpt-image-2 | Model name |
| --output | ./gpt-image-2_<timestamp>.<format> | Output file path. When --n > 1, an index suffix is added. |
| --api-key | (auto) | Overrides env / config file |
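The flags in the two tables map naturally onto argparse. The parser below is a sketch of how such a command line could be declared, assuming nothing about the real generate_image.py beyond the flag names, defaults, and allowed values listed above; build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(description="Generate images with gpt-image-2.")
    parser.add_argument("--prompt", required=True, help="text description of the image")
    parser.add_argument("--n", type=int, default=1, help="number of images")
    parser.add_argument("--size", default="1024x1024", help="e.g. 512x512 or 1024x1536")
    parser.add_argument("--quality", default="low", choices=["low", "medium", "high"])
    parser.add_argument("--format", default="jpeg", choices=["jpeg", "png", "webp"])
    parser.add_argument("--model", default="gpt-image-2", help="model name")
    # None means "derive ./gpt-image-2_<timestamp>.<format> later".
    parser.add_argument("--output", default=None, help="output file path")
    # None means "fall back to the environment variable or config file".
    parser.add_argument("--api-key", default=None, help="overrides env / config file")
    return parser
```

Using choices for --quality and --format makes argparse reject unsupported values before any request is sent.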
Example invocations
# Minimal (the prompt "大海" means "the sea")
python3 scripts/generate_image.py --prompt "大海"

# Custom size + format + output path
python3 scripts/generate_image.py \
  --prompt "A futuristic city skyline at dusk, cyberpunk style" \
  --size 1024x1024 \
  --quality high \
  --format png \
  --output ./city.png
The script prints the absolute path(s) of the saved image(s) on success and exits non-zero on failure.
Request / response contract
Request body sent to https://wellapi.ai/v1/images/generations:
{
  "model": "gpt-image-2",
  "prompt": "大海",
  "n": 1,
  "size": "1024x1024",
  "quality": "low",
  "format": "jpeg"
}
Headers
Authorization: Bearer <WELLAPI_API_KEY>
Content-Type: application/json
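With only the standard library, the request body and headers above could be assembled with urllib.request as in this sketch. build_request and its parameter names are illustrative assumptions; the URL, field names, and headers come from the contract above.

```python
import json
import urllib.request

API_URL = "https://wellapi.ai/v1/images/generations"


def build_request(api_key, prompt, n=1, size="1024x1024", quality="low",
                  image_format="jpeg", model="gpt-image-2"):
    """Build (but do not send) the POST request described in the contract."""
    payload = {
        "model": model,
        "prompt": prompt,
        "n": n,
        "size": size,
        "quality": quality,
        "format": image_format,
    }
    # ensure_ascii=False keeps non-ASCII prompts readable in the UTF-8 body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {
        "Authorization": "Bearer " + api_key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would send it; keeping construction separate from sending makes the payload easy to inspect without network access.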
Response (the image is in data[i].b64_json):
{
  "created": 1778236581,
  "data": [{ "b64_json": "iVBORw0KGg..." }],
  "output_format": "png",
  "quality": "low",
  "size": "1024x1024",
  "usage": { "input_tokens": 8, "output_tokens": 196, "total_tokens": 204 }
}
The skill base64-decodes each b64_json entry and writes the bytes to disk using output_format (or the requested --format) as the file extension.
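The decode-and-write step might look like the sketch below. save_images is a hypothetical helper, and the "_1", "_2" suffix scheme for multiple images is an assumption: the skill only states that an index suffix is added.

```python
import base64
from pathlib import Path
from typing import List


def save_images(response: dict, output: Path, requested_format: str = "jpeg") -> List[Path]:
    """Decode every data[i].b64_json entry and write it next to `output`."""
    # Prefer the format the server reports; fall back to the requested one.
    ext = response.get("output_format") or requested_format
    base = output.with_suffix("." + ext)
    entries = response.get("data", [])
    saved = []
    for i, entry in enumerate(entries, start=1):
        if len(entries) == 1:
            path = base
        else:
            # Assumed naming scheme: name_1.ext, name_2.ext, ...
            path = base.with_name(f"{base.stem}_{i}{base.suffix}")
        path.write_bytes(base64.b64decode(entry["b64_json"]))
        saved.append(path.resolve())
    return saved
```

Because the extension follows output_format, a response like the one above would be saved as .png even if jpeg was requested.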
Workflow for the agent
- Parse the user's image request → extract prompt, and any explicit size, quality, format, n.
- Resolve the API key (env → config file → prompt user via scripts/setup_api_key.py).
- Run scripts/generate_image.py with the parsed arguments.
- Report the saved file path(s) to the user. If running in an environment that can render images, also display the result.
- On HTTP errors, surface the upstream error message verbatim and suggest checking the API key, quota, or prompt content.
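For the last step, a small helper can turn an HTTP failure into a user-facing message. This is a sketch under assumptions: describe_http_error is a hypothetical name, and the {"error": {"message": ...}} body shape is a guess at a common convention, not something WellAPI is documented here to return. Any other body is passed through unchanged.

```python
import json


def describe_http_error(status: int, body: bytes) -> str:
    """Return the upstream message verbatim, followed by a short hint."""
    text = body.decode("utf-8", errors="replace").strip()
    message = text  # default: show the raw body exactly as received
    try:
        parsed = json.loads(text)
    except ValueError:
        parsed = None
    if isinstance(parsed, dict):
        err = parsed.get("error")
        # Assumed shapes: {"error": {"message": "..."}} or {"error": "..."}.
        if isinstance(err, dict) and isinstance(err.get("message"), str):
            message = err["message"]
        elif isinstance(err, str):
            message = err
    hint = "Check the API key, the account quota, and the prompt content."
    return f"HTTP {status}: {message}\n{hint}"
```

In a caller built on urllib, this would typically be used inside an `except urllib.error.HTTPError as exc:` handler as describe_http_error(exc.code, exc.read()).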
Files in this skill
- SKILL.md: this file (metadata + instructions)
- scripts/generate_image.py: performs the generation
- scripts/setup_api_key.py: interactive helper to store the API key
- scripts/api_key.py: shared helpers for locating/loading the key
- README.md: marketplace listing
- LICENSE: MIT license
Security notes
- The API key is stored locally in the user's home directory with 0600 permissions and is never committed, logged, or echoed.
- All network traffic goes only to https://wellapi.ai.
- The skill does not execute or evaluate any data returned by the API beyond base64-decoding the image bytes.