ugc-manual

Generate lip-sync video from an image + the user's own audio recording.

✅ USE WHEN:

  - User provides their OWN audio file (voice recording)
  - Want to sync an image to specific audio/voice
  - User recorded the script themselves
  - Need exact audio timing preserved

❌ DON'T USE WHEN:

  - User provides a text script (not audio) → use veed-ugc
  - Need AI to generate the voice → use veed-ugc
  - Don't have an audio file yet → use veed-ugc with a script

INPUT: Image + audio file (user's recording)
OUTPUT: MP4 video lip-synced to the provided audio

KEY DIFFERENCE:

  - veed-ugc: script → AI voice → video
  - ugc-manual: user audio → video (no voice generation)

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


Install skill "ugc-manual" with this command: npx skills add PauldeLavallaz/ugc-manual

UGC-Manual

Generate lip-sync videos by combining an image with a custom audio file using ComfyDeploy's UGC-MANUAL workflow.

Overview

UGC-Manual takes:

  1. An image (person/character with visible face)
  2. An audio file (user's voice recording)

And produces a video where the person in the image lip-syncs to the audio.

API Details

Endpoint: https://api.comfydeploy.com/api/run/deployment/queue
Deployment ID: 075ce7d3-81a6-4e3e-ab0e-7a25edf601b5

Required Inputs

| Input | Description | Formats |
|---|---|---|
| `image` | Image with a visible face | JPG, PNG |
| `input_audio` | Audio file to lip-sync | MP3, WAV, OGG |
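Queuing a run against this deployment presumably amounts to POSTing the two inputs to the endpoint above. A minimal sketch, assuming a Bearer-token header and a `deployment_id`/`inputs` JSON shape — the exact field names and auth scheme are assumptions, not confirmed by this document:

```python
import json
import urllib.request

API_URL = "https://api.comfydeploy.com/api/run/deployment/queue"
DEPLOYMENT_ID = "075ce7d3-81a6-4e3e-ab0e-7a25edf601b5"

def build_queue_payload(image_url, audio_url):
    # Input keys mirror the Required Inputs table above; the wrapping
    # JSON shape is an assumption about ComfyDeploy's queue API.
    return {
        "deployment_id": DEPLOYMENT_ID,
        "inputs": {"image": image_url, "input_audio": audio_url},
    }

def queue_run(api_key, image_url, audio_url):
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_queue_payload(image_url, audio_url)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # The response is expected to contain a run id to poll for the MP4.
        return json.load(resp)
```

In practice `generate.py` wraps this call and the subsequent polling/download for you; the sketch only shows the shape of the request.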

Usage

```shell
uv run ~/.clawdbot/skills/ugc-manual/scripts/generate.py \
  --image "path/to/image.jpg" \
  --audio "path/to/audio.mp3" \
  --output "output-video.mp4"
```

With URLs:

```shell
uv run ~/.clawdbot/skills/ugc-manual/scripts/generate.py \
  --image "https://example.com/image.jpg" \
  --audio "https://example.com/audio.mp3" \
  --output "result.mp4"
```
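Since `--image` and `--audio` accept both local paths and URLs, the script needs to tell the two modes apart. One plausible way to do that — a hypothetical helper, not taken from `generate.py` itself:

```python
from urllib.parse import urlparse

def is_remote(source: str) -> bool:
    # Treat http(s) inputs as already-hosted URLs that can be passed to
    # the API directly; anything else is a local path that would need
    # uploading first. This split is an assumption about the script.
    return urlparse(source).scheme in ("http", "https")
```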

Workflow Integration

Typical Use Cases

  1. Custom voice recordings - User records their own audio via Telegram/WhatsApp
  2. Pre-generated TTS - Audio generated externally (ElevenLabs, etc.)
  3. Music/sound sync - Sync mouth movements to any audio

Example Pipeline

```shell
# 1. Convert Telegram voice message to MP3 (if needed)
ffmpeg -i voice.ogg -acodec libmp3lame -q:a 2 voice.mp3

# 2. Generate lip-sync video
uv run ugc-manual... --image face.jpg --audio voice.mp3 --output video.mp4
```

Difference from VEED-UGC

| Feature | UGC-Manual | VEED-UGC |
|---|---|---|
| Audio source | User provides | Generated from brief |
| Script | N/A | Auto-generated |
| Voice | User's recording | ElevenLabs TTS |
| Use case | Custom audio | Automated content |

Notes

  • Image should have a clearly visible face (frontal or 3/4 view)
  • Audio quality affects output quality
  • Processing time: ~2-5 minutes depending on audio length
  • Audio auto-conversion: The script automatically converts any audio format (MP3, OGG, M4A, etc.) to WAV PCM 16-bit mono 48kHz before sending to FabricLipsync
  • Requires ffmpeg installed on the system
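The auto-conversion described above can be sketched as a single ffmpeg invocation targeting WAV PCM 16-bit / mono / 48 kHz. `to_lipsync_wav` is a hypothetical helper mirroring that step, not the script's actual code:

```python
import subprocess

def to_lipsync_wav(src: str, dst: str = "converted.wav") -> list[str]:
    # Build the ffmpeg command for the format FabricLipsync expects,
    # per the note above: WAV PCM 16-bit, mono, 48 kHz.
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-acodec", "pcm_s16le",  # 16-bit PCM
        "-ac", "1",              # mono
        "-ar", "48000",          # 48 kHz sample rate
        "-vn",                   # drop any video/cover-art stream
        dst,
    ]

def convert(src: str, dst: str = "converted.wav") -> None:
    # Requires ffmpeg on PATH; raises CalledProcessError on failure.
    subprocess.run(to_lipsync_wav(src, dst), check=True)
```

Because the skill performs this conversion itself, you can hand it OGG/M4A voice notes directly; pre-converting to MP3 is only needed for tools that don't.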

