Media Gen 🎬

Generate images and videos with a single AIsa API key.

This skill covers AIsa's media-generation routes: three image endpoints and one asynchronous video endpoint. The bundled client in scripts/media_gen_client.py picks the correct request shape for each supported model, including the schema differences between the Wan video variants.

Use when

You want one neutral skill for AIsa image and video generation
You need to switch between Gemini image, Wan image, Seedream, and Wan video models without rewriting requests
You want a simple CLI for creating images, submitting async video jobs, polling task status, and downloading finished video output

Compatibility

Works with any agentskills.io-compatible harness, including:

Claude Code and Claude
OpenAI Codex
Cursor
Gemini CLI
OpenCode, Goose, OpenClaw, Hermes
and other tools that implement the Agent Skills specification

Requires Python 3, a POSIX shell, and AISA_API_KEY from aisa.one.

What you can do

Image — Gemini (base64 inline)

"Generate a cyberpunk-style city nightscape, neon lights, rainy night, cinematic feel"

Image — Wan 2.7 (URL in chat response)

"Generate an ultra-detailed product shot of a red panda, studio lighting, sharp focus"

Image — Seedream (OpenAI-compatible, large format)

"Generate a 2048×2048 magazine cover: neo-noir detective portrait, film grain"

Video — text-to-video (Wan t2v)

"Sweeping establishing shot of a neon cyberpunk skyline at dusk, 5 seconds"

Video — image-to-video (Wan i2v)

"Starting from this reference image, gentle camera push-in with parallax"

Supported models

Image generation — 4 models, 3 endpoints

Model	Developer	Endpoint	Notes
`gemini-3-pro-image-preview`	Google	`POST /v1/models/{model}:generateContent`	Images returned as base64 in `candidates[].parts[].inline_data`
`wan2.7-image`	Alibaba	`POST /v1/chat/completions`	Images returned as URL parts in `choices[].message.content[]` (`type=image`)
`wan2.7-image-pro`	Alibaba	`POST /v1/chat/completions`	Higher fidelity
`seedream-4-5-251128`	ByteDance	`POST /v1/images/generations`	OpenAI-compatible; minimum 3,686,400 pixels

Video generation — 4 Wan variants, 1 endpoint

Model	Kind	Image field	Output SR
`wan2.6-t2v`	text-to-video	none	1080
`wan2.6-i2v`	image-to-video	`input.img_url` (string)	720
`wan2.7-t2v`	text-to-video	none	720
`wan2.7-i2v`	image-to-video	`input.media` (array)	720

Important: wan2.7-i2v expects the reference image in input.media as an array of URLs, not input.img_url like wan2.6-i2v. The bundled client handles this automatically when you pass --img-url.
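
The schema difference described above can be sketched in a few lines of Python. This is an illustration only, not the bundled client's actual code: `build_video_input` and `IMAGE_FIELD` are hypothetical names, and the field layout is taken from the table above.

```python
from typing import Optional

# Which `input` field carries the reference image, per the table above.
IMAGE_FIELD = {
    "wan2.6-i2v": "img_url",  # plain string
    "wan2.7-i2v": "media",    # array of URLs
}


def build_video_input(model: str, prompt: str, img_url: Optional[str] = None) -> dict:
    """Return the `input` object for a Wan video request (hypothetical helper)."""
    payload = {"prompt": prompt}
    field = IMAGE_FIELD.get(model)
    if field is None:
        # Text-to-video models take no image field.
        if img_url:
            raise ValueError(f"{model} does not accept a reference image")
        return payload
    if not img_url:
        raise ValueError(f"{model} requires a reference image URL")
    # wan2.7-i2v wraps the URL in a list; wan2.6-i2v passes it as a string.
    payload[field] = [img_url] if field == "media" else img_url
    return payload
```

Calling `build_video_input("wan2.7-i2v", "gentle zoom", "https://example.com/reference.jpg")` yields an `input` object with a one-element `media` array, while the same call with `wan2.6-i2v` yields a plain `img_url` string.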

Quick start

export AISA_API_KEY="your-key"

# Any image model — the client routes to the right endpoint
python3 scripts/media_gen_client.py image \
  --model gemini-3-pro-image-preview \
  --prompt "A cute red panda, cinematic lighting" \
  --out out.png

python3 scripts/media_gen_client.py image \
  --model wan2.7-image-pro \
  --prompt "Ultra-detailed product shot of a red panda" \
  --out out.png

python3 scripts/media_gen_client.py image \
  --model seedream-4-5-251128 \
  --prompt "Neo-noir detective portrait, film grain" \
  --size 2048x2048 \
  --out out.png

# Video — text-to-video
python3 scripts/media_gen_client.py video-create \
  --model wan2.7-t2v \
  --prompt "Sweeping shot of a neon cyberpunk skyline"

# Video — image-to-video on wan2.7-i2v
python3 scripts/media_gen_client.py video-create \
  --model wan2.7-i2v \
  --prompt "gentle zoom with parallax" \
  --img-url "https://example.com/reference.jpg" \
  --duration 5

# Wait and download
python3 scripts/media_gen_client.py video-wait \
  --task-id YOUR_TASK_ID --download --out out.mp4

Image generation — endpoint reference

Gemini family → `POST /v1/models/{model}:generateContent`

Documentation: Google Gemini Chat.

curl -X POST "https://api.aisa.one/v1/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents":[
      {"role":"user","parts":[{"text":"A cute red panda, cinematic lighting"}]}
    ]
  }'

The response contains candidates[].parts[].inline_data with {mime_type, data}, where data is a base64-encoded PNG.
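
Decoding that payload might look like the sketch below. The field path is copied from the description above and has not been checked against a live response, so adjust it if the actual JSON nests differently. `extract_inline_images` is a hypothetical helper, not part of the bundled client.

```python
import base64
from typing import List, Tuple


def extract_inline_images(response: dict) -> List[Tuple[str, bytes]]:
    """Collect (mime_type, raw_bytes) pairs from a generateContent-style response."""
    images = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("parts", []):
            inline = part.get("inline_data")
            if inline:  # text parts carry no inline_data and are skipped
                images.append((inline["mime_type"], base64.b64decode(inline["data"])))
    return images
```

Each returned byte string can be written straight to disk, for example with `open("out.png", "wb").write(raw_bytes)`.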

Wan 2.7 family → `POST /v1/chat/completions`

Documentation: Image Generation via Chat.

Critical rule: messages[].content must be an array of typed parts. A plain string returns HTTP 400 invalid_parameter_error.

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan2.7-image",
    "messages": [
      {"role":"user","content":[
        {"type":"text","text":"A cute red panda, ultra-detailed, cinematic lighting"}
      ]}
    ],
    "n": 1
  }'

Images come back as {type: "image", image: "<url>"} parts inside choices[].message.content[].
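
A minimal Python sketch of both halves of this exchange follows: building a request body whose content is an array of typed parts, and pulling image URLs out of the reply. The helper names are hypothetical, and the response layout is assumed to match the description above.

```python
from typing import List


def wan_image_request(prompt: str, model: str = "wan2.7-image", n: int = 1) -> dict:
    """Build a chat-completions body whose content is a list of typed parts."""
    return {
        "model": model,
        "messages": [
            # A bare string here would be rejected with HTTP 400.
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
        "n": n,
    }


def extract_image_urls(response: dict) -> List[str]:
    """Return every image URL found in choices[].message.content[]."""
    urls = []
    for choice in response.get("choices", []):
        content = choice.get("message", {}).get("content", [])
        if isinstance(content, str):
            continue  # plain-text reply, no image parts
        for part in content:
            if isinstance(part, dict) and part.get("type") == "image":
                urls.append(part["image"])
    return urls
```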

Seedream → `POST /v1/images/generations`

Documentation: OpenAI-Compatible Image Generations.

curl -X POST "https://api.aisa.one/v1/images/generations" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedream-4-5-251128",
    "prompt": "A cute red panda, ultra-detailed, cinematic lighting",
    "n": 1,
    "size": "2048x2048"
  }'

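The response carries each image in data[].url or data[].b64_json. Upstream enforces a minimum of 3,686,400 total pixels, so 1024×1024 (1,048,576 pixels) and 1536×1536 (2,359,296 pixels) are rejected, while 2048×2048 (4,194,304 pixels) is accepted. Any aspect ratio works as long as width × height ≥ 3,686,400.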
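
The pixel floor is easy to check before sending a request. The sketch below assumes the `WIDTHxHEIGHT` string form used in the examples above; `meets_seedream_minimum` is a hypothetical helper name.

```python
from typing import Tuple

MIN_PIXELS = 3_686_400  # minimum stated above


def parse_size(size: str) -> Tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string such as '2048x2048' into integers."""
    width, height = size.lower().split("x")
    return int(width), int(height)


def meets_seedream_minimum(size: str) -> bool:
    """True when width * height reaches the minimum pixel count."""
    width, height = parse_size(size)
    return width * height >= MIN_PIXELS
```

For example, `meets_seedream_minimum("2560x1440")` is true because 2560 × 1440 is exactly 3,686,400, while `meets_seedream_minimum("1536x1536")` is false.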

Video generation — endpoint reference

Create task → `POST /apis/v1/services/aigc/video-generation/video-synthesis`

Documentation: Create video generation task. Header X-DashScope-Async: enable is required.

# wan2.6-t2v — text-to-video
curl -X POST "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-DashScope-Async: enable" \
  -d '{
    "model":"wan2.6-t2v",
    "input":{"prompt":"cinematic close-up, slow push-in"},
    "parameters":{"resolution":"720P","duration":5}
  }'

# wan2.7-i2v — image-to-video (input.media, not input.img_url)
curl -X POST "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-DashScope-Async: enable" \
  -d '{
    "model":"wan2.7-i2v",
    "input":{
      "prompt":"gentle zoom with parallax",
      "media":["https://example.com/reference.jpg"]
    },
    "parameters":{"resolution":"720P","duration":5}
  }'
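
For callers who prefer Python to curl, the same request can be assembled with the standard library. This is a sketch under the assumptions shown in the curl examples above (same URL, same three headers); `build_create_request` is a hypothetical name, and the function only builds the request without sending it.

```python
import json
import urllib.request

CREATE_URL = "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis"


def build_create_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Assemble the POST that creates an async video task (not sent here)."""
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Required: marks the call as an asynchronous task submission.
            "X-DashScope-Async": "enable",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the task. The shape of the JSON that comes back is not described in this document, so parsing it is left out.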

Poll task → `GET /apis/v1/services/aigc/tasks/{task_id}`

Documentation: Get video generation task result.

task_id is a path parameter. The query-string form ?task_id=... returns HTTP 500 unsupported uri.

curl "https://api.aisa.one/apis/v1/services/aigc/tasks/YOUR_TASK_ID" \
  -H "Authorization: Bearer $AISA_API_KEY"
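
A polling loop around this endpoint could be sketched as follows. Two things here are assumptions and are not stated in this document: that the status lives at `output.task_status`, and that the terminal values are `SUCCEEDED`, `FAILED`, `CANCELED`, and `UNKNOWN` (the DashScope convention). The `fetch` argument is a caller-supplied function that performs the authenticated GET and returns parsed JSON; all names are hypothetical.

```python
import time
import urllib.parse
from typing import Callable

TASK_URL = "https://api.aisa.one/apis/v1/services/aigc/tasks/{task_id}"

# Assumption: DashScope-style terminal statuses; not confirmed by this document.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def task_url(task_id: str) -> str:
    """Place the task ID in the URL path; the query-string form is rejected."""
    return TASK_URL.format(task_id=urllib.parse.quote(task_id, safe=""))


def wait_for_task(
    task_id: str,
    fetch: Callable[[str], dict],
    poll: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch(task_url(task_id))
        # Assumption: status is reported at output.task_status.
        status = body.get("output", {}).get("task_status")
        if status in TERMINAL_STATUSES:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(poll)
```

Injecting `fetch` and `sleep` keeps the loop testable without network access. The defaults of 10 and 600 seconds mirror the `--poll` and `--timeout` values used in the client examples.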

Python client

The bundled client at scripts/media_gen_client.py auto-routes each image model to the correct endpoint and normalizes the response to a saved file.

# Image — model selects the endpoint
python3 scripts/media_gen_client.py image \
  --model <gemini-3-pro-image-preview | wan2.7-image | wan2.7-image-pro | seedream-4-5-251128> \
  --prompt "..." \
  --out out.png

# Video — create task
python3 scripts/media_gen_client.py video-create \
  --model <wan2.6-t2v | wan2.6-i2v | wan2.7-t2v | wan2.7-i2v> \
  --prompt "..." \
  [--img-url https://... (required for -i2v models)] \
  [--duration 5|10] \
  [--resolution 720P|1080P]

# Video — poll / wait / download
python3 scripts/media_gen_client.py video-status --task-id <id>
python3 scripts/media_gen_client.py video-wait --task-id <id> --poll 10 --timeout 600
python3 scripts/media_gen_client.py video-wait --task-id <id> --download --out out.mp4
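
The client's source is not reproduced here, but the model-to-endpoint routing it performs could look roughly like the sketch below. The mapping is taken from the supported-models table; `IMAGE_ROUTES` and `image_endpoint` are hypothetical names rather than the client's real internals.

```python
BASE_URL = "https://api.aisa.one"

# Model name -> endpoint path, as listed in the supported-models table.
IMAGE_ROUTES = {
    "gemini-3-pro-image-preview": "/v1/models/{model}:generateContent",
    "wan2.7-image": "/v1/chat/completions",
    "wan2.7-image-pro": "/v1/chat/completions",
    "seedream-4-5-251128": "/v1/images/generations",
}


def image_endpoint(model: str) -> str:
    """Resolve the full URL for an image model, or fail on unknown names."""
    try:
        template = IMAGE_ROUTES[model]
    except KeyError:
        raise ValueError(f"unsupported image model: {model}") from None
    # Only the Gemini path embeds the model name; the others ignore it.
    return BASE_URL + template.format(model=model)
```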

API reference

This skill calls the following AIsa endpoints directly:

Google Gemini Chat — generateContent — Gemini image models
Image Generation via Chat — Wan 2.7 image family
OpenAI-Compatible Image Generations — Seedream
Create video generation task — all four Wan video variants
Get video generation task result — async polling

See the full AIsa API Reference for the complete catalog.

License

MIT — see LICENSE at the repo root.
