skill-ugc-pipeline

End-to-end AI UGC video pipeline. Product info → GPT-4o-mini script → ElevenLabs voiceover → Aurora talking head (fal-ai/creatify/aurora) → Kling 2.6 Pro product B-roll → Whisper-synced captions → UGC post-processing filter (grain + handheld shake on avatar, clean product shot) → final MP4. Full pipeline ~$1.75/video.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "skill-ugc-pipeline" with this command: npx skills add zero2ai-hub/skill-ugc-pipeline

skill-ugc-pipeline v1.2.0

Build your own MakeUGC pipeline. Direct API access — no $300/mo enterprise tier.

Default model: Aurora (fal-ai/creatify/aurora) — locked after A/B test.
Aurora produces significantly more realistic lip sync and narration vs alternatives.

What's new in v1.2.0

  • B-roll splicing — Kling 2.6 Pro image-to-video generates a cinematic product shot, spliced into the avatar video at a configurable timecode
  • UGC filter — grain + handheld shake applied to avatar segments ONLY; product B-roll stays clean and cinematic
  • Continuous audio across splice points (no audio gap)

Architecture

product info → [GPT-4o-mini] → script
                             → [ElevenLabs] → audio.mp3
avatar image + audio         → [fal.ai Aurora] → avatar.mp4
product image                → [Kling 2.6 Pro] → broll.mp4
avatar + broll + ugc-filter  → [ffmpeg] → final.mp4
audio.mp3                    → [OpenAI Whisper] → captions overlay

Full Pipeline (6 steps)

1. Script       GPT-4o-mini → spoken script
2. Voice        ElevenLabs → audio.mp3
3. Avatar       fal-ai/creatify/aurora → talking head MP4
4. B-roll       Kling 2.6 Pro image-to-video → product shot
5. Splice       ffmpeg: avatar(hook) + broll + avatar(resume) + continuous audio
6. Captions     Whisper word-level → overlay.py → final MP4
7. UGC filter   grain + handheld shake on avatar ONLY (product shot stays clean)

Quick Start — Full Pipeline

cd skill-ugc-pipeline
npm install

# Step 1-3: Script + voice + avatar
node scripts/generate.js \
  --product "Rain Cloud Humidifier" \
  --product-desc "USB cool mist humidifier. 300ml tank, LED glow, silent mode." \
  --avatar avatars/my_avatar.png \
  --output output/ad_raincloud.mp4

# Step 4-5: Add B-roll + UGC filter
node scripts/broll.js \
  --avatar-video output/ad_raincloud_aurora.mp4 \
  --audio output/ad_raincloud_audio.mp3 \
  --product-image https://example.com/product.jpg \
  --product-name "Rain Cloud Humidifier" \
  --splice-at 4.5 \
  --broll-duration 5 \
  --ugc-filter \
  --output output/final.mp4

# Step 6: Whisper captions
node scripts/transcribe_captions.js \
  --audio output/ad_raincloud_audio.mp3 \
  --video output/final.mp4 \
  --output output/final_captioned.mp4

Scripts

ScriptDescription
generate.jsMain pipeline: script → voice → Aurora talking head
broll.jsB-roll splice + optional UGC filter (grain + shake on avatar)
transcribe_captions.jsWhisper word-level caption overlay
aurora_only.jsGenerate Aurora talking head only (skip script/voice)
batch.jsRun pipeline for multiple products
product_in_hand.jsGenerate product-in-hand composite image

broll.js Options

FlagDefaultDescription
--avatar-videorequiredPath to Aurora talking head MP4
--audiorequiredOriginal voiceover MP3 (plays continuously)
--product-imagerequiredURL or local path to product image
--product-namerequiredProduct name (used in Kling prompt)
--splice-at4.5Seconds into avatar video where B-roll inserts
--broll-duration5B-roll duration: 5 or 10 seconds
--ugc-filteroffAdd grain + handheld shake to avatar (product stays clean)
--outputrequiredOutput MP4 path

Cost Estimate

StepModelCost/video
ScriptGPT-4o-mini~$0.01
VoiceElevenLabs~$0.05
AvatarAurora (fal.ai)~$1.00
B-rollKling 2.6 Pro (fal.ai)~$0.40
CaptionsWhisper API~$0.01
Total~$1.47–1.75

Requirements

  • FAL_KEY — fal.ai API key (Aurora + Kling)
  • ELEVENLABS_API_KEY — ElevenLabs API key
  • OPENAI_API_KEY — OpenAI API key (GPT-4o-mini + Whisper)
  • ffmpeg — for video splicing and UGC filter
  • uv — for Python caption overlay (via skill-tiktok-ads-video)

Avatar Requirements

  • Portrait photo, face visible (no heavy makeup or accessories that obscure mouth)
  • Resolution: 512×512 minimum, 1024×1024 recommended
  • File format: PNG or JPG

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Cclaw

Open-source comedy AI + video editing + poster generation. Create standup/sketch/manzai/scripts, edit videos via FFmpeg, and generate comedy posters via canv...

Registry SourceRecently Updated
General

Bird Recognition Tool | 鸟类识别工具

Identifies bird species in images/videos of target areas. Supports recognition of no less than 500 common bird species, supports customized model training, s...

Registry SourceRecently Updated
General

Image Amazon Product Image Suite

A professional product image generation skill purpose-built for the Amazon e-commerce platform. Outputs comply with Amazon's image guidelines while optimizin...

Registry SourceRecently Updated
General

SearchOnlineAssets

Online asset search tool: queries public stock libraries (Pixabay) for high-quality photos, illustrations, vectors and videos, returning result metadata and...

Registry SourceRecently Updated