openclaw-media-gen

Generate images and videos with AIsa. Four image models (Google Gemini 3 Pro Image, Alibaba Wan 2.7 image + image-pro, ByteDance Seedream) and four Wan video variants (wan2.6/2.7 × t2v/i2v). One API key; the client routes each model to the correct endpoint automatically. Use when: the user needs AI image or video generation workflows.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "openclaw-media-gen" with this command: npx skills add baofeng-tech/openclaw-media-gen-aisa

Media Gen 🎬

Generate images and videos with a single AIsa API key. Full support for every image and video model AIsa routes through its Unified LLM Gateway, across three different endpoint paths.

Compatibility

Works with any agentskills.io-compatible harness, including:

  • Claude Code and Claude (Anthropic)
  • OpenAI Codex
  • Cursor
  • Gemini CLI (Google)
  • OpenCode, Goose, OpenClaw, Hermes
  • and any other harness that implements the Agent Skills specification

Requires Python 3, a POSIX shell, and AISA_API_KEY (get one at aisa.one).

🔥 What You Can Do

Image — Gemini (base64 inline)

"Generate a cyberpunk-style city nightscape, neon lights, rainy night, cinematic feel"

Image — Wan 2.7 (URL in chat response)

"Generate an ultra-detailed product shot of a red panda, studio lighting, sharp focus"

Image — Seedream (OpenAI-compatible, large format)

"Generate a 2048×2048 magazine cover: neo-noir detective portrait, film grain"

Video — text-to-video (Wan t2v)

"Sweeping establishing shot of a neon cyberpunk skyline at dusk, 5 seconds"

Video — image-to-video (Wan i2v)

"Starting from this reference image, gentle camera push-in with parallax"

Supported Models

Image generation — 4 models, 3 endpoints

ModelDeveloperEndpointNotes
gemini-3-pro-image-previewGooglePOST /v1/models/{model}:generateContentImages returned as base64 in candidates[].parts[].inline_data
wan2.7-imageAlibabaPOST /v1/chat/completionsImages returned as URL parts in choices[].message.content[] (type=image). $0.030/image
wan2.7-image-proAlibabaPOST /v1/chat/completionsHigher fidelity. $0.075/image
seedream-4-5-251128ByteDancePOST /v1/images/generationsOpenAI-compatible. Minimum 3,686,400 pixels (e.g. 1920×1920). $0.040/image

Video generation — 4 Wan variants, 1 endpoint

ModelKindImage fieldOutput SR
wan2.6-t2vtext-to-videonone1080
wan2.6-i2vimage-to-videoinput.img_url (string)720
wan2.7-t2vtext-to-videonone720
wan2.7-i2vimage-to-videoinput.media (array) ⚠720

Schema trap on wan2.7-i2v. It takes the reference image in input.media (array of URLs), not input.img_url like wan2.6-i2v. Submissions without media return HTTP 200 with a task_id, then fail downstream with InvalidParameter: Field required: input.media. The bundled client routes this automatically — just pass --img-url and pick the model.

Quick Start

export AISA_API_KEY="your-key"

# Any image model — client routes to the right endpoint
python3 scripts/media_gen_client.py image \
  --model gemini-3-pro-image-preview \
  --prompt "A cute red panda, cinematic lighting" \
  --out out.png

python3 scripts/media_gen_client.py image \
  --model wan2.7-image-pro \
  --prompt "Ultra-detailed product shot of a red panda" \
  --out out.png

python3 scripts/media_gen_client.py image \
  --model seedream-4-5-251128 \
  --prompt "Neo-noir detective portrait, film grain" \
  --size 2048x2048 \
  --out out.png

# Video — text-to-video (no image needed)
python3 scripts/media_gen_client.py video-create \
  --model wan2.7-t2v \
  --prompt "Sweeping shot of a neon cyberpunk skyline"

# Video — image-to-video on wan2.7-i2v (client routes to input.media[])
python3 scripts/media_gen_client.py video-create \
  --model wan2.7-i2v \
  --prompt "gentle zoom with parallax" \
  --img-url "https://example.com/reference.jpg" \
  --duration 5

# Wait and download
python3 scripts/media_gen_client.py video-wait \
  --task-id <task_id> --download --out out.mp4

🖼️ Image Generation — endpoint reference

Gemini family → POST /v1/models/{model}:generateContent

Documentation: Google Gemini Chat.

curl -X POST "https://api.aisa.one/v1/models/gemini-3-pro-image-preview:generateContent" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents":[
      {"role":"user","parts":[{"text":"A cute red panda, cinematic lighting"}]}
    ]
  }'

Response contains candidates[].parts[].inline_data with {mime_type, data} where data is a base64 PNG.

Wan 2.7 family → POST /v1/chat/completions

Documentation: Image Generation via Chat.

Critical rule: messages[].content must be an array of typed parts. A plain string returns HTTP 400 invalid_parameter_error.

curl -X POST "https://api.aisa.one/v1/chat/completions" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan2.7-image",
    "messages": [
      {"role":"user","content":[
        {"type":"text","text":"A cute red panda, ultra-detailed, cinematic lighting"}
      ]}
    ],
    "n": 1
  }'

Images come back as {type: "image", image: "<url>"} parts inside choices[].message.content[].

Seedream → POST /v1/images/generations

Documentation: OpenAI-Compatible Image Generations.

curl -X POST "https://api.aisa.one/v1/images/generations" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "seedream-4-5-251128",
    "prompt": "A cute red panda, ultra-detailed, cinematic lighting",
    "n": 1,
    "size": "2048x2048"
  }'

Response: data[].url or data[].b64_json. Upstream enforces a minimum of 3,686,400 pixels. 1024×1024 and 1536×1536 get rejected. Any aspect ratio works as long as width × height ≥ 3,686,400.


🎞️ Video Generation — endpoint reference

Create task → POST /apis/v1/services/aigc/video-generation/video-synthesis

Documentation: Create video generation task. Header X-DashScope-Async: enable is required.

# wan2.6-t2v — text-to-video
curl -X POST "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-DashScope-Async: enable" \
  -d '{
    "model":"wan2.6-t2v",
    "input":{"prompt":"cinematic close-up, slow push-in"},
    "parameters":{"resolution":"720P","duration":5}
  }'

# wan2.7-i2v — image-to-video (⚠ input.media not input.img_url)
curl -X POST "https://api.aisa.one/apis/v1/services/aigc/video-generation/video-synthesis" \
  -H "Authorization: Bearer $AISA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-DashScope-Async: enable" \
  -d '{
    "model":"wan2.7-i2v",
    "input":{
      "prompt":"gentle zoom with parallax",
      "media":["https://example.com/reference.jpg"]
    },
    "parameters":{"resolution":"720P","duration":5}
  }'

Poll task → GET /apis/v1/services/aigc/tasks/{task_id}

Documentation: Get video generation task result.

task_id is a path parameter. The query-string form ?task_id=... returns HTTP 500 unsupported uri.

curl "https://api.aisa.one/apis/v1/services/aigc/tasks/YOUR_TASK_ID" \
  -H "Authorization: Bearer $AISA_API_KEY"

Python Client

The bundled client at scripts/media_gen_client.py auto-routes each image model to the correct endpoint and normalizes the response to a saved file.

# Image — model picks the endpoint
python3 scripts/media_gen_client.py image \
  --model <gemini-3-pro-image-preview | wan2.7-image | wan2.7-image-pro | seedream-4-5-251128> \
  --prompt "..." \
  --out out.png

# Video — create task
python3 scripts/media_gen_client.py video-create \
  --model <wan2.6-t2v | wan2.6-i2v | wan2.7-t2v | wan2.7-i2v> \
  --prompt "..." \
  [--img-url https://... (required for -i2v models)] \
  [--duration 5|10] \
  [--resolution 720P|1080P]

# Video — poll / wait / download
python3 scripts/media_gen_client.py video-status --task-id <id>
python3 scripts/media_gen_client.py video-wait --task-id <id> --poll 10 --timeout 600
python3 scripts/media_gen_client.py video-wait --task-id <id> --download --out out.mp4

API Reference

This skill calls the following AIsa endpoints directly:

See the full AIsa API Reference for the complete catalog.

License

MIT — see LICENSE at the repo root.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Helsinki Finland

赫尔辛基是芬兰首都,拥有顶尖教育体系、全球设计中心及蓬勃的科技创业生态,人口约65万,GDP约400亿欧元。

Registry SourceRecently Updated
General

Hasbro Games

Provides insights into Hasbro's brand empire, IP management, business model, and digital transformation in the global toy and entertainment industry.

Registry SourceRecently Updated
General

Harpercollins

全球知名出版商,拥有200年历史,旗下20+品牌,出版多种类图书并积极推进数字出版与版权保护。

Registry SourceRecently Updated
General

Household Inventory System

Track non-food household supplies, set minimum/maximum stock levels, design restock triggers, and organize shopping lists and storage locations efficiently.

Registry SourceRecently Updated