avatar-video

Avatar Video

Create AI avatar videos with full control over avatars, voices, scripts, and backgrounds using POST /v3/videos. The request supports two creation modes, selected by a discriminated union on type:

| Mode | Key field | Purpose |
| --- | --- | --- |
| "type": "avatar" | avatar_id | Use a HeyGen avatar from the library |
| "type": "image" | image (AssetInput) | Animate any photo via Avatar IV |
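
The two modes can be sketched as small builders plus a check that the discriminator and its required field agree. Only type, avatar_id, and image come from the text above. The URL-style shape used for AssetInput and all helper names are assumptions made for illustration; references/video-generation.md holds the actual schema.

```python
from typing import Any


def avatar_input(avatar_id: str) -> dict[str, Any]:
    """Mode 1: use a HeyGen library avatar, selected by its id."""
    return {"type": "avatar", "avatar_id": avatar_id}


def image_input(image_url: str) -> dict[str, Any]:
    """Mode 2: animate a photo via Avatar IV.

    ASSUMPTION: AssetInput is shown as a URL reference. The real schema
    may differ (for example, an uploaded asset id).
    """
    return {"type": "image", "image": {"type": "url", "url": image_url}}


def validate_mode(body: dict[str, Any]) -> str:
    """Return the mode name, or raise if `type` and its field disagree."""
    required = {"avatar": "avatar_id", "image": "image"}
    mode = body.get("type")
    if mode not in required:
        raise ValueError(f"unknown type: {mode!r}")
    if required[mode] not in body:
        raise ValueError(f"type {mode!r} requires field {required[mode]!r}")
    return mode
```

Validating locally before sending is one way to catch a mismatched mode early, rather than waiting for the API to reject the request.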

Authentication

All requests require the X-Api-Key header. Set the HEYGEN_API_KEY environment variable.

curl -X GET "https://api.heygen.com/v3/avatars" \
  -H "X-Api-Key: $HEYGEN_API_KEY"
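
A rough Python equivalent of the curl call, using only the standard library. The helper names auth_headers and build_request are hypothetical; the header name, environment variable, and URL are taken from the text above. The request is prepared but not sent, so a missing key fails before any network traffic.

```python
import os
import urllib.request

API_BASE = "https://api.heygen.com"


def auth_headers(env=None) -> dict[str, str]:
    """Build the X-Api-Key header from HEYGEN_API_KEY, failing early if unset."""
    env = os.environ if env is None else env
    key = env.get("HEYGEN_API_KEY")
    if not key:
        raise RuntimeError("HEYGEN_API_KEY is not set")
    return {"X-Api-Key": key}


def build_request(path: str, env=None) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated GET request."""
    return urllib.request.Request(
        API_BASE + path, headers=auth_headers(env), method="GET"
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(build_request("/v3/avatars")) as resp:
#       body = resp.read()
```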

Tool Selection

If HeyGen MCP tools are available (mcp__heygen__*), prefer them over direct HTTP API calls: they handle authentication and request formatting automatically.

| Task | MCP Tool | Fallback (Direct API) |
| --- | --- | --- |
| Check video status / get URL | mcp__heygen__get_video | GET /v3/videos/{video_id} |
| List account videos | mcp__heygen__list_videos | GET /v3/videos |
| Delete a video | mcp__heygen__delete_video | DELETE /v3/videos/{video_id} |

Video generation (POST /v3/videos) and avatar/voice listing are done via direct API calls; see the reference files below.
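
The preference rule above can be sketched as a lookup that returns the MCP tool when it is available and the direct API call otherwise. The tool names, methods, and paths are copied from the mapping above; the task keys and the route function are hypothetical names chosen for this example.

```python
# Task -> (MCP tool, HTTP method, path template), copied from the mapping above.
ROUTES = {
    "get_video": ("mcp__heygen__get_video", "GET", "/v3/videos/{video_id}"),
    "list_videos": ("mcp__heygen__list_videos", "GET", "/v3/videos"),
    "delete_video": ("mcp__heygen__delete_video", "DELETE", "/v3/videos/{video_id}"),
}


def route(task: str, available_tools: set[str], **params: str) -> tuple[str, ...]:
    """Prefer the MCP tool when present; otherwise return the direct API call."""
    tool, method, template = ROUTES[task]
    if tool in available_tools:
        return ("mcp", tool)
    return ("http", method, template.format(**params))
```

For instance, route("delete_video", set(), video_id="v1") falls back to the DELETE call on /v3/videos/v1 because no MCP tools are available.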

Default Workflow

1. List avatar looks: GET /v3/avatars/looks → pick a look, note its id (this is the avatar_id) and default_voice_id. See avatars.md
2. List voices (if needed): GET /v3/voices → pick a voice matching the avatar's gender/language. See voices.md
3. Write the script: structure scenes with one concept each. See scripts.md
4. Generate the video: POST /v3/videos with avatar_id, voice_id, script, and optional background per scene. See video-generation.md
5. Poll for completion: GET /v3/videos/{video_id} until status is completed. See video-status.md
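
The final polling step might look like the sketch below. To avoid guessing the response schema, the caller supplies fetch_status, a hypothetical callable that performs GET /v3/videos/{video_id} and returns the status string. Only completed is confirmed by the text above; treating failed as a terminal status is an assumption, as are the default timeout and interval.

```python
import time
from typing import Callable


def wait_for_video(
    fetch_status: Callable[[], str],
    timeout_s: float = 1800.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status is "completed", or raise on failure or timeout."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status == "completed":
            return status
        if status == "failed":  # ASSUMPTION: terminal failure status name
            raise RuntimeError("video generation failed")
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

Passing sleep and clock as parameters keeps the loop testable without real waiting. A 30-minute default leaves headroom, since generation can take a long time.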

Routing: This Skill vs Create Video

Use this skill when you need precise control (a specific avatar, an exact script, a custom background). Use create-video for prompt-based generation ("make me a video about X"), where the AI handles the rest.

Reference Files

Read these as needed — they contain endpoint details, request/response schemas, and code examples (curl, TypeScript, Python).

Core workflow:

references/video-generation.md — POST /v3/videos request fields, avatar input modes, voice settings, backgrounds
references/avatars.md — GET /v3/avatars (groups) and GET /v3/avatars/looks (looks → avatar_id)
references/voices.md — GET /v3/voices with filtering by language, gender, engine
references/video-status.md — GET /v3/videos/{id} polling patterns and download

Customization:

references/scripts.md — Script writing, SSML break tags, pacing
references/backgrounds.md — Solid color and image backgrounds
references/captions.md — Auto-generated captions/subtitles
references/text-overlays.md — Text overlays with fonts and positioning

Advanced:

references/photo-avatars.md — Animate photos via type: "image" (Avatar IV), AI-generated avatars
references/templates.md — Template listing and variable replacement
references/remotion-integration.md — Using HeyGen avatars in Remotion compositions
references/webhooks.md — Webhook endpoints and events
references/assets.md — Uploading images, videos, audio
references/dimensions.md — Resolution and aspect ratios
references/quota.md — Credit system and usage limits

Best Practices

Preview avatars before generating — Use GET /v3/avatars/looks and download preview_image_url so the user can see the avatar before committing
Use avatar's default voice — Most avatars have a default_voice_id pre-matched for natural results
Fallback: match gender manually — If no default voice, ensure avatar and voice genders match
Use test mode for development — Set test: true to avoid consuming credits (output will be watermarked)
Set generous timeouts — Video generation often takes 5-15 minutes, sometimes longer
Validate inputs — Check avatar and voice IDs exist before generating
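
The voice-selection advice above (default voice first, gender match as fallback) can be expressed as a small helper. default_voice_id comes from the text; the gender and voice_id field names on the look and voice records are assumptions about the listing responses, so verify them against references/avatars.md and references/voices.md.

```python
from typing import Any, Optional


def choose_voice_id(
    look: dict[str, Any], voices: list[dict[str, Any]]
) -> Optional[str]:
    """Pick the look's default voice, else the first voice of matching gender."""
    default = look.get("default_voice_id")
    if default:
        return default
    # ASSUMPTION: both records expose a `gender` string, and voices a `voice_id`.
    gender = (look.get("gender") or "").lower()
    for voice in voices:
        if gender and (voice.get("gender") or "").lower() == gender:
            return voice.get("voice_id")
    return None  # No safe match; let the caller ask the user.
```

Returning None rather than an arbitrary voice keeps a mismatched pairing from slipping through silently.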

Related Skills

heygen

text-to-speech

video-translate

video-understand