AI Video Generation Skill

Generate complete videos from text descriptions using AI.

Capabilities

Image Generation - DALL-E 3, Stable Diffusion, Flux
Video Generation - LumaAI, Runway, Replicate models
Voice-over - OpenAI TTS, ElevenLabs
Video Editing - FFmpeg assembly, transitions, overlays

Quick Start

# Generate a complete video
python skills/ai-video-gen/generate_video.py --prompt "A sunset over mountains" --output sunset.mp4

# Convert existing images to a video
python skills/ai-video-gen/images_to_video.py --images img1.png img2.png --output result.mp4

# Add voiceover
python skills/ai-video-gen/add_voiceover.py --video input.mp4 --text "Your narration" --output final.mp4

Setup

Required API Keys

Add to your environment or .env file:

# Image Generation (pick one)
OPENAI_API_KEY=sk-...              # DALL-E 3
REPLICATE_API_TOKEN=r8_...         # Stable Diffusion, Flux

# Video Generation (pick one)
LUMAAI_API_KEY=luma_...           # LumaAI Dream Machine
RUNWAY_API_KEY=...                # Runway ML
REPLICATE_API_TOKEN=r8_...        # Multiple models (same token as above)

# Voice (optional)
OPENAI_API_KEY=sk-...             # OpenAI TTS (same key as above)
ELEVENLABS_API_KEY=...            # ElevenLabs

# Or use FREE local options (no API needed)
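A quick way to see which of the providers above are usable is to check which keys are set. The sketch below is an illustration only, not part of the skill's scripts: `PROVIDER_KEYS` and `available_providers` are hypothetical names, and the variable names are assumed to match the list above. With python-dotenv installed, calling `load_dotenv()` first would load values from a .env file into `os.environ`.

```python
import os

# Hypothetical table: pipeline stage -> provider -> environment variable,
# copied from the key list above.
PROVIDER_KEYS = {
    "image": {"DALL-E 3": "OPENAI_API_KEY", "Replicate": "REPLICATE_API_TOKEN"},
    "video": {
        "LumaAI": "LUMAAI_API_KEY",
        "Runway": "RUNWAY_API_KEY",
        "Replicate": "REPLICATE_API_TOKEN",
    },
    "voice": {"OpenAI TTS": "OPENAI_API_KEY", "ElevenLabs": "ELEVENLABS_API_KEY"},
}


def available_providers(env=None):
    """Return, per stage, the providers whose key is set and non-empty."""
    env = os.environ if env is None else env
    return {
        stage: [name for name, var in providers.items() if env.get(var)]
        for stage, providers in PROVIDER_KEYS.items()
    }


if __name__ == "__main__":
    for stage, names in available_providers().items():
        print(f"{stage}: {', '.join(names) if names else 'no key set'}")
```

A stage that reports no key would need one of the local options instead.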

Install Dependencies

pip install openai requests pillow replicate python-dotenv

FFmpeg

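FFmpeg must be available on your PATH for the assembly and editing steps. In this setup it is already installed via winget; run ffmpeg -version to confirm.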

Usage Examples

1. Text to Video (Full Pipeline)

python skills/ai-video-gen/generate_video.py \
  --prompt "A futuristic city at night with flying cars" \
  --duration 5 \
  --voiceover "Welcome to the future" \
  --output future_city.mp4

2. Multiple Scenes

python skills/ai-video-gen/multi_scene.py \
  --scenes "Morning sunrise" "Busy city street" "Peaceful night" \
  --duration 3 \
  --output day_in_life.mp4

3. Image Sequence to Video

python skills/ai-video-gen/images_to_video.py \
  --images frame1.png frame2.png frame3.png \
  --fps 24 \
  --output animation.mp4
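For readers curious how an image sequence can become a video with FFmpeg, here is a minimal sketch using FFmpeg's concat demuxer. It is not taken from images_to_video.py, whose internals are not shown here; `build_concat_list` and `build_ffmpeg_command` are hypothetical helpers, and the sketch assumes file paths contain no single quotes.

```python
def build_concat_list(images, fps):
    """Build the text of a concat-demuxer list showing each image for one frame."""
    frame_time = 1.0 / fps
    lines = []
    for path in images:
        lines.append(f"file '{path}'")
        lines.append(f"duration {frame_time:.6f}")
    # The concat demuxer tends to ignore the final duration entry, so the
    # last image is commonly listed once more.
    if images:
        lines.append(f"file '{images[-1]}'")
    return "\n".join(lines) + "\n"


def build_ffmpeg_command(list_path, output, fps):
    """Return the FFmpeg argument list that renders the concat list to a video."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", list_path,
        "-vf", f"fps={fps}",
        "-pix_fmt", "yuv420p",  # widely compatible pixel format
        output,
    ]


if __name__ == "__main__":
    print(build_concat_list(["frame1.png", "frame2.png", "frame3.png"], 24))
    print(" ".join(build_ffmpeg_command("frames.txt", "animation.mp4", 24)))
```

Writing the list text to a file such as frames.txt and passing the command to `subprocess.run` would perform the actual render.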

Workflow Options

Budget Mode (FREE)

Image: Stable Diffusion (local or free API)
Video: Open source models
Voice: free TTS, or OpenAI TTS (low cost, not free)
Edit: FFmpeg

Quality Mode (Paid)

Image: DALL-E 3 or Midjourney
Video: Runway Gen-3 or LumaAI
Voice: ElevenLabs
Edit: FFmpeg + effects

Scripts Reference

generate_video.py - Main end-to-end generator
images_to_video.py - Convert image sequence to video
add_voiceover.py - Add narration to existing video
multi_scene.py - Create multi-scene videos
edit_video.py - Apply effects, transitions, overlays
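As an illustration of the narration step, the sketch below muxes an already generated audio file onto a video with FFmpeg. It is an assumption about how such a step could work, not the implementation of add_voiceover.py; `build_mux_command` and `add_narration` are hypothetical names, and the audio file is assumed to exist already (for example, produced by a TTS provider).

```python
import subprocess


def build_mux_command(video, audio, output):
    """Return FFmpeg arguments that pair the video stream with the narration."""
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-i", audio,
        "-map", "0:v:0",   # video from the first input
        "-map", "1:a:0",   # audio from the second input
        "-c:v", "copy",    # keep the video stream as-is, no re-encode
        "-c:a", "aac",
        "-shortest",       # stop at the end of the shorter stream
        output,
    ]


def add_narration(video, audio, output):
    """Run FFmpeg and raise CalledProcessError if it fails."""
    subprocess.run(build_mux_command(video, audio, output), check=True)


if __name__ == "__main__":
    print(" ".join(build_mux_command("input.mp4", "narration.mp3", "final.mp4")))
```

Copying the video stream avoids a quality loss from re-encoding; `-shortest` trims the output if the narration and video lengths differ.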

API Cost Estimates

DALL-E 3: ~$0.04-0.08 per image
Replicate: ~$0.01-0.10 per generation
LumaAI: $0-0.50 per 5-second clip (free tier available)
Runway: ~$0.05 per second
OpenAI TTS: ~$0.015 per 1K characters
ElevenLabs: ~$0.30 per 1K characters (better quality)
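To make these figures concrete, the sketch below totals a rough cost for the 5-second "futuristic city" example. It assumes one DALL-E 3 image, Runway for video, and OpenAI TTS for narration; that provider mix, the one-image assumption, and the `estimate_cost` helper are illustrative only, and the rates are the approximate values listed above.

```python
# Approximate rates copied from the list above (USD).
DALLE3_PER_IMAGE = (0.04, 0.08)   # low and high estimate per image
RUNWAY_PER_SECOND = 0.05
OPENAI_TTS_PER_1K_CHARS = 0.015


def estimate_cost(seconds, images, narration):
    """Return a (low, high) cost estimate for one generated video."""
    video = seconds * RUNWAY_PER_SECOND
    voice = len(narration) / 1000 * OPENAI_TTS_PER_1K_CHARS
    low = images * DALLE3_PER_IMAGE[0] + video + voice
    high = images * DALLE3_PER_IMAGE[1] + video + voice
    return low, high


if __name__ == "__main__":
    low, high = estimate_cost(5, 1, "Welcome to the future")
    print(f"${low:.2f} to ${high:.2f}")  # prints "$0.29 to $0.33"
```

At these rates the video seconds dominate the total, and a short narration adds well under a cent.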

Examples

See examples/ folder for sample outputs and prompts.
