video-agent

AI content generation suite with 35+ models: image generation, video creation, and audio processing via FAL AI, Google Vertex AI, and ElevenLabs, plus pipeline orchestration and cost management.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "video-agent" with this command: npx skills add founderjourney/claude-skills/founderjourney-claude-skills-video-agent

Video Agent - AI Content Generation Suite

A comprehensive AI content generation package providing a unified interface across 35+ models for image, video, and audio creation.

When to Use This Skill

  • Text-to-image generation
  • Image-to-image transformations
  • Text-to-video creation
  • Image-to-video animation
  • Professional text-to-speech
  • Multi-step content pipelines
  • Batch content generation

Supported Providers

FAL AI

  • FLUX models (text-to-image)
  • Image transformations
  • Fast inference

Google Vertex AI

  • Imagen 4 (text-to-image)
  • Veo (text-to-video)
  • High-quality outputs

ElevenLabs

  • 20+ voice options
  • Professional TTS
  • Multiple languages

OpenRouter

  • Access to various LLMs
  • Text generation
  • Content writing

Core Capabilities

Image Generation

Generate image:
Prompt: "A serene Japanese garden at sunset"
Model: flux-pro
Size: 1024x1024
Style: photorealistic

Available Models:

  • FLUX Pro/Dev (FAL)
  • Imagen 4 (Google)
  • Stable Diffusion variants

Video Creation

Generate video:
Prompt: "Ocean waves crashing on rocky shore"
Model: veo
Duration: 5 seconds
Resolution: 1080p

Available Models:

  • Google Veo
  • MiniMax Hailuo
  • Kling

Image-to-Video

Animate image:
Source: /path/to/image.png
Motion: "gentle zoom out with particle effects"
Duration: 4 seconds

Text-to-Speech

Generate audio:
Text: "Welcome to our product demo..."
Voice: professional-female-1
Speed: 1.0
Output: welcome.mp3

Voice Options:

  • Professional male/female
  • Casual conversational
  • Narrator styles
  • Multiple accents

Pipeline Orchestration

YAML Configuration

pipeline: product-demo
steps:
  - name: generate-logo
    type: image
    model: flux-pro
    prompt: "Modern tech logo for AI startup"

  - name: create-intro
    type: video
    model: veo
    prompt: "Logo animation reveal"

  - name: add-voiceover
    type: audio
    model: elevenlabs
    text: "Introducing the future of AI..."
    voice: professional-male

  - name: combine
    type: merge
    inputs: [create-intro, add-voiceover]
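The `inputs` field implies an ordering: `combine` can only run once `create-intro` and `add-voiceover` have finished. A minimal sketch of how that ordering could be resolved, assuming `inputs` lists the names of prerequisite steps. The `execution_order` helper is hypothetical and not part of video-agent, and the steps are written as plain Python data so no YAML parser is needed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# The pipeline above as plain Python data (prompts omitted for brevity).
steps = [
    {"name": "generate-logo", "type": "image"},
    {"name": "create-intro", "type": "video"},
    {"name": "add-voiceover", "type": "audio"},
    {"name": "combine", "type": "merge",
     "inputs": ["create-intro", "add-voiceover"]},
]


def execution_order(steps):
    """Return step names ordered so every step runs after its inputs.

    Hypothetical helper: assumes `inputs` names prerequisite steps.
    """
    names = {step["name"] for step in steps}
    graph = {}
    for step in steps:
        deps = step.get("inputs", [])
        missing = [dep for dep in deps if dep not in names]
        if missing:
            raise ValueError(
                f"step {step['name']!r} references unknown inputs: {missing}"
            )
        graph[step["name"]] = set(deps)
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
```

Steps without `inputs` have no ordering constraint between them, so a runner built this way would be free to execute them concurrently.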

JSON Configuration

{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {
      "type": "image",
      "variants": 4,
      "prompt": "Product hero shot"
    }
  ]
}
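To make the `variants` field concrete, here is a small sketch that loads the JSON above and expands each step into one job per variant. How video-agent itself interprets `variants` and `parallel` is not documented here, so the `expand_jobs` helper and the job dictionary layout are assumptions for illustration only.

```python
import json

config_text = """
{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {"type": "image", "variants": 4, "prompt": "Product hero shot"}
  ]
}
"""


def expand_jobs(config):
    """Turn each step into one job per requested variant (default: 1).

    Hypothetical helper; the job layout is an assumption.
    """
    jobs = []
    for step_index, step in enumerate(config["steps"]):
        for variant in range(step.get("variants", 1)):
            jobs.append({
                "step": step_index,
                "variant": variant,
                "type": step["type"],
                "prompt": step["prompt"],
            })
    return jobs


config = json.loads(config_text)
jobs = expand_jobs(config)
```

With this configuration the single image step becomes four independent jobs, which is the shape of work that a `parallel: true` setting could plausibly fan out.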

Cost Management

Real-time Estimation

Estimate cost for:
- 10 images (1024x1024)
- 2 videos (5 seconds)
- 1 audio (60 seconds)

Estimated: $2.45
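An estimate like this is a sum of quantity times unit rate per media type. The sketch below shows the arithmetic under stated assumptions: the rates are placeholders picked only so that the total reproduces the $2.45 figure above. They are not real provider prices, and `estimate_cost` is a hypothetical helper, not the tool's API.

```python
from decimal import Decimal

# PLACEHOLDER rates for illustration only; NOT real provider pricing.
RATES = {
    "image": Decimal("0.05"),           # per image
    "video_second": Decimal("0.15"),    # per second of video
    "audio_second": Decimal("0.0075"),  # per second of audio
}


def estimate_cost(images=0, video_seconds=0, audio_seconds=0, rates=RATES):
    """Sum quantity * unit rate for each media type, rounded to cents."""
    total = (
        rates["image"] * images
        + rates["video_second"] * video_seconds
        + rates["audio_second"] * audio_seconds
    )
    return total.quantize(Decimal("0.01"))


# 10 images, 2 videos of 5 seconds each, 1 audio clip of 60 seconds.
estimate = estimate_cost(images=10, video_seconds=2 * 5, audio_seconds=60)
```

`Decimal` is used instead of `float` so that currency arithmetic stays exact.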

Budget Limits

budget:
  max_per_job: $5.00
  max_daily: $50.00
  alert_threshold: 80%
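One way these three limits could interact is sketched below. It assumes the values load as strings (`"$5.00"`, `"80%"`) and that `alert_threshold` is a fraction of `max_daily`. Neither is confirmed by the upstream documentation, and `check_budget` is a hypothetical helper.

```python
from decimal import Decimal

# The budget block above, as it might look once loaded (assumed strings).
budget = {
    "max_per_job": "$5.00",
    "max_daily": "$50.00",
    "alert_threshold": "80%",
}


def parse_money(value):
    """'$5.00' -> Decimal('5.00')"""
    return Decimal(value.lstrip("$"))


def parse_percent(value):
    """'80%' -> Decimal('0.8')"""
    return Decimal(value.rstrip("%")) / 100


def check_budget(job_cost, spent_today, budget):
    """Return 'reject', 'alert', or 'ok' for a job about to run."""
    job_cost = Decimal(str(job_cost))
    spent_today = Decimal(str(spent_today))
    max_job = parse_money(budget["max_per_job"])
    max_daily = parse_money(budget["max_daily"])
    threshold = parse_percent(budget["alert_threshold"])

    projected = spent_today + job_cost
    if job_cost > max_job or projected > max_daily:
        return "reject"
    # Assumption: the alert fires at a fraction of the daily limit.
    if projected >= max_daily * threshold:
        return "alert"
    return "ok"
```

For example, a $4.00 job after $37.00 of spend projects to $41.00, which passes both hard limits but crosses the 80% mark of the $50.00 daily cap, so it would return `"alert"`.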

Performance Features

Parallel Execution

Generate 10 image variants in parallel
Threads: 4
Expected speedup: 2-3x
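The pattern behind these numbers can be sketched with the standard library's thread pool. The `generate_variant` function below is a stand-in that returns a file name instead of calling a provider, and both function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_variant(index):
    """Stand-in for one image request; a real call would hit a provider API."""
    return f"variant-{index}.png"


def generate_variants(count, threads=4):
    """Run `count` generation calls on a pool of `threads` workers."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order, whatever order calls finish in.
        return list(pool.map(generate_variant, range(count)))


files = generate_variants(10, threads=4)
```

If every request took the same time, 10 jobs on 4 threads would complete in three waves (4 + 4 + 2), an upper bound of roughly 3.3x over sequential execution, which is consistent with the 2-3x expectation once overhead and uneven latencies are accounted for.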

Caching

  • Automatic prompt caching
  • Reuse similar generations
  • Reduce redundant API calls
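A minimal sketch of prompt caching, assuming the cache key is derived from the model, a normalized prompt, and the generation parameters. The `GenerationCache` class is hypothetical, and "similar" here only covers prompts that differ in case or whitespace; how video-agent actually matches similar generations is not specified.

```python
import hashlib
import json


class GenerationCache:
    """Hypothetical in-memory cache keyed by model, prompt, and parameters."""

    def __init__(self, generate):
        self._generate = generate  # callable(model, prompt, **params)
        self._store = {}
        self.misses = 0

    @staticmethod
    def key(model, prompt, **params):
        # Normalize case and whitespace so trivially different prompts match.
        normalized = " ".join(prompt.lower().split())
        payload = json.dumps(
            {"model": model, "prompt": normalized, "params": params},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, model, prompt, **params):
        cache_key = self.key(model, prompt, **params)
        if cache_key not in self._store:
            self.misses += 1  # only a miss triggers a real generation call
            self._store[cache_key] = self._generate(model, prompt, **params)
        return self._store[cache_key]


def fake_generate(model, prompt, **params):
    """Stand-in for a provider call."""
    return f"{model}:{len(prompt)}"


cache = GenerationCache(fake_generate)
first = cache.get("flux-pro", "Sunset over mountains", size=1024)
second = cache.get("flux-pro", "  sunset over   MOUNTAINS ", size=1024)
```

The second call normalizes to the same key as the first, so it is served from the cache and `fake_generate` runs only once.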

CLI Commands

# Image generation
video-agent image "prompt" --model flux-pro --size 1024

# Video generation
video-agent video "prompt" --model veo --duration 5

# Audio generation
video-agent audio "text" --voice professional-female

# Pipeline execution
video-agent pipeline config.yaml

# Cost check
video-agent cost --estimate

Python API

from video_agent import ImageGenerator, VideoGenerator

# Generate image
img = ImageGenerator(model="flux-pro")
image_result = img.generate("sunset over mountains")

# Generate video (stored separately so the image result is not overwritten)
vid = VideoGenerator(model="veo")
video_result = vid.generate("timelapse of clouds")

Setup

1. Install Package

pip install video-agent-claude-skill

2. Configure API Keys

export FAL_API_KEY="your-key"
export GOOGLE_VERTEX_KEY="your-key"
export ELEVENLABS_API_KEY="your-key"
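Before running anything, it can help to confirm that the variables above are actually visible to the process. A small sketch using only the standard library; `missing_keys` is a hypothetical helper and is independent of whatever checks the tool itself performs.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_KEYS = ("FAL_API_KEY", "GOOGLE_VERTEX_KEY", "ELEVENLABS_API_KEY")


def missing_keys(env=os.environ):
    """Return the names of required keys that are unset or blank."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.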

3. Verify Setup

video-agent status

Use Cases

  • Marketing: Product images, promo videos
  • Social Media: Content at scale
  • Education: Explainer videos, voiceovers
  • Prototyping: Visual concepts, mockups
  • Automation: Batch content pipelines

Credits

Created by donghaozhang. Licensed under MIT.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
