human-avatar

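Generates talking-head videos with the Alibaba Cloud DashScope/LingMou (灵眸) APIs. Supports three modes: EMO (portrait + audio driven talking head, two-step flow), AA/Animate Anyone (full-body animation), and LingMou (template-based digital-human talking-head video). This skill is triggered when the user needs to make a talking-head video, a digital-human video, an EMO/AA face animation, or a VideoRetalk character replacement.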

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "human-avatar" with this command: npx skills add davideuler/human-avatar

Human Avatar: Alibaba Cloud Talking-Head Video Generation

Three Modes

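| Mode | Interface | Auth | Region | Description |
|------|-----------|------|--------|-------------|
| EMO | DashScope | DASHSCOPE_API_KEY | cn-beijing | Portrait + audio → talking head; requires a detect step first |
| AA (Animate Anyone) | DashScope | DASHSCOPE_API_KEY | cn-beijing | Portrait + motion video → full-body animation |
| LingMou (灵眸) | Standalone product SDK | AK/SK | cn-beijing | Template-based digital-human talking head |
| VideoRetalk (character replacement) | DashScope | DASHSCOPE_API_KEY | cn-beijing | Replaces the character in a video |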

⚠️ The region is fixed to cn-beijing. The API key must be activated in the Beijing region and cannot be mixed with a Singapore key.

Prerequisites

pip install requests dashscope oss2
# Additionally required for LingMou:
pip install alibabacloud-lingmou20250527 alibabacloud-tea-openapi

Environment variables:

export DASHSCOPE_API_KEY=sk-xxxx          # DashScope API key (from the Bailian console)
export ALIBABA_CLOUD_ACCESS_KEY_ID=xxx    # used by LingMou
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=xxx
export OSS_BUCKET=xxx                     # used for uploading local files
export OSS_ENDPOINT=oss-cn-beijing.aliyuncs.com
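
Before running any mode, it can help to confirm that the variables listed above are set. The following is a minimal sketch, not part of the skill's scripts. The variable names come from the list above, but the grouping by mode and the helper name `missing_env` are assumptions made for illustration.

```python
import os

# Variable names are taken from the list above; the per-mode grouping is an assumption.
REQUIRED_ENV = {
    "emo": ["DASHSCOPE_API_KEY"],
    "aa": ["DASHSCOPE_API_KEY"],
    "lingmou": ["ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET"],
}
# Only needed when a local file has to be uploaded first.
UPLOAD_ENV = ["OSS_BUCKET", "OSS_ENDPOINT"]


def missing_env(mode, needs_upload=False, environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    names = list(REQUIRED_ENV[mode])
    if needs_upload:
        names += UPLOAD_ENV
    return [name for name in names if not env.get(name)]
```

Calling `missing_env("emo", needs_upload=True)` returns an empty list when everything needed for an EMO run with local files is present, and otherwise the names that still need to be exported.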

EMO Workflow (Two Steps)

Step 1: emo-detect-v1 detects the face → returns face_bbox, ext_bbox
  ↓
Step 2: emo-v1 submits the generation job → task_id
  ↓
Poll GET /api/v1/tasks/{task_id} → SUCCEEDED → video_url

python scripts/portrait_animate.py \
  --image-url "https://example.com/portrait.jpg" \
  --audio-url "https://example.com/speech.mp3" \
  --download
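
The hand-off between the two steps and the polling loop can be sketched as below. This is an illustration only, not the code in scripts/portrait_animate.py. The names `face_bbox`, `ext_bbox`, `task_id`, `video_url`, and the `/api/v1/tasks/{task_id}` path come from the flow above. The host `dashscope.aliyuncs.com`, the `output.task_status` and `output.results.video_url` response shape, the failure status names, and the helper names are assumptions that should be checked against the DashScope documentation.

```python
import json
import os
import time
import urllib.request

# The path is from the workflow above; the host is an assumption.
TASK_URL = "https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}"
TERMINAL_FAILURES = {"FAILED", "CANCELED", "UNKNOWN"}  # assumed status names


def build_emo_input(image_url, audio_url, detect_output):
    """Carry face_bbox and ext_bbox from step 1 into the step 2 input."""
    return {
        "image_url": image_url,
        "audio_url": audio_url,
        "face_bbox": detect_output["face_bbox"],
        "ext_bbox": detect_output["ext_bbox"],
    }


def fetch_task(task_id):
    """GET the task record, authenticating with DASHSCOPE_API_KEY."""
    request = urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        headers={"Authorization": "Bearer " + os.environ["DASHSCOPE_API_KEY"]},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_video(task_id, fetch=fetch_task, interval=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the task reports SUCCEEDED, then return its video URL."""
    for _ in range(max_polls):
        output = fetch(task_id).get("output", {})
        status = output.get("task_status")
        if status == "SUCCEEDED":
            return output["results"]["video_url"]  # assumed response shape
        if status in TERMINAL_FAILURES:
            raise RuntimeError(f"task {task_id} ended with status {status}")
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without network access. In real use the defaults apply and only `task_id` is passed.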

LingMou Workflow (Template-Based)

1. List templates → templateId (already saved in digital_human_template.json)
2. CreateBroadcastVideoFromTemplate (variables replace text_content)
3. Poll ListBroadcastVideosById → SUCCESS → videoURL

python scripts/avatar_video.py \
  --template-id "BS1b2WNnRMu4ouRzT4clY9Jhg" \
  --text "大家好,欢迎收看今天的科技新闻。" \
  --download

One-Command Demo Pipeline

# EMO
python scripts/demo_pipeline.py --mode emo --image ./face.jpg --audio ./speech.mp3 --download

# AA
python scripts/demo_pipeline.py --mode aa --model <AA_MODEL_NAME> --image-url https://... --video-url https://... --download

# LingMou
python scripts/demo_pipeline.py --mode lingmou --template-id BSxxxx --text "大家好" --download
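
The EMO command above passes local files (`./face.jpg`, `./speech.mp3`), while the API only accepts public URLs, so the files presumably go through the OSS bucket first. How scripts/demo_pipeline.py actually does this is not shown here. The sketch below is one possible approach using the `oss2` package from the prerequisites. The helper names, the `human-avatar` key prefix, and the reuse of the `ALIBABA_CLOUD_ACCESS_KEY_ID`/`ALIBABA_CLOUD_ACCESS_KEY_SECRET` pair for OSS are assumptions.

```python
import os
import uuid
from pathlib import Path


def object_key(local_path, prefix="human-avatar"):
    """Build a collision-resistant object key that keeps the file extension."""
    suffix = Path(local_path).suffix.lower()
    return f"{prefix}/{uuid.uuid4().hex}{suffix}"


def upload_and_sign(local_path, expires_seconds=3600):
    """Upload a local file to OSS and return a time-limited download URL."""
    import oss2  # imported lazily so object_key() works without oss2 installed

    auth = oss2.Auth(
        os.environ["ALIBABA_CLOUD_ACCESS_KEY_ID"],  # assumption: same AK/SK as LingMou
        os.environ["ALIBABA_CLOUD_ACCESS_KEY_SECRET"],
    )
    bucket = oss2.Bucket(auth, os.environ["OSS_ENDPOINT"], os.environ["OSS_BUCKET"])
    key = object_key(local_path)
    bucket.put_object_from_file(key, str(local_path))
    return bucket.sign_url("GET", key, expires_seconds)
```

A signed URL keeps the bucket private while still giving the service a reachable address. The expiry should comfortably exceed the expected job duration.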

API Reference

Image/Audio Requirements

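| Parameter | EMO requirement |
|-----------|-----------------|
| Image format | jpg/jpeg/png/bmp/webp |
| Image resolution | Shortest side ≥ 400 px, longest side ≤ 7000 px |
| Single person, front-facing | Required; the face must be complete and unobstructed |
| Audio format | mp3/wav |
| Audio size | ≤ 15 MB |
| Audio duration | ≤ 60 s; clear human voice required (remove background noise) |
| URL type | Must be public HTTP/HTTPS (file:// local paths are not supported) |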
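
The mechanical limits above can be checked before submitting a job, which avoids a wasted API call. The following is a minimal sketch with a hypothetical helper, `check_emo_inputs`. It assumes the caller already knows the image dimensions, audio size, and audio duration, and that "15 MB" means 15 MiB. The single-person, front-facing requirement cannot be verified this way and is left to the detect step.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}
AUDIO_EXTS = {".mp3", ".wav"}
MAX_AUDIO_BYTES = 15 * 1024 * 1024  # assumption: "15 MB" is read as 15 MiB
MAX_AUDIO_SECONDS = 60


def _ext(url):
    """Lower-cased file extension of the URL path, ignoring any query string."""
    return PurePosixPath(urlparse(url).path).suffix.lower()


def check_emo_inputs(image_url, audio_url, image_size, audio_bytes, audio_seconds):
    """Return a list of problems; an empty list means the inputs look acceptable."""
    problems = []
    for label, url in (("image", image_url), ("audio", audio_url)):
        if urlparse(url).scheme not in ("http", "https"):
            problems.append(f"{label} URL must be public HTTP/HTTPS")
    if _ext(image_url) not in IMAGE_EXTS:
        problems.append("image format must be jpg/jpeg/png/bmp/webp")
    if _ext(audio_url) not in AUDIO_EXTS:
        problems.append("audio format must be mp3/wav")
    width, height = image_size
    if min(width, height) < 400 or max(width, height) > 7000:
        problems.append("image sides must be between 400 and 7000 px")
    if audio_bytes > MAX_AUDIO_BYTES:
        problems.append("audio file must be at most 15 MB")
    if audio_seconds > MAX_AUDIO_SECONDS:
        problems.append("audio must be at most 60 s long")
    return problems
```

For example, `check_emo_inputs("https://example.com/portrait.jpg", "https://example.com/speech.mp3", (800, 1200), 1_000_000, 30)` returns an empty list, while a `file://` image URL adds a URL-type problem.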

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
