Agent Media
Agent Media is an agent-first media toolkit that provides CLI-accessible commands for image, video, and audio processing. All commands produce deterministic, machine-readable JSON output.
Available Commands
Image Commands
- npx agent-media@latest image resize - Resize an image
- npx agent-media@latest image convert - Convert image format
- npx agent-media@latest image generate - Generate image from text
- npx agent-media@latest image edit - Edit one or more images with a text prompt
- npx agent-media@latest image remove-background - Remove image background
- npx agent-media@latest image upscale - Upscale image with AI super-resolution
- npx agent-media@latest image extend - Extend image canvas with padding
- npx agent-media@latest image crop - Crop image to dimensions around a focal point
Audio Commands
- npx agent-media@latest audio extract - Extract audio from video
- npx agent-media@latest audio transcribe - Transcribe audio to text
Video Commands
- npx agent-media@latest video generate - Generate video from text or image
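Every command follows the same `npx agent-media@latest <media type> <action>` shape, so an agent can assemble invocations programmatically. The sketch below is one possible wrapper in Python, not part of the toolkit: `build_command` and `run_command` are hypothetical helper names, per-command flags are passed through untouched in `extra_args` because they are not listed here, and appending `--provider` after the action is an assumption about argument order.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(
    media_type: str,
    action: str,
    extra_args: Sequence[str] = (),
    provider: Optional[str] = None,
) -> List[str]:
    """Assemble the argv for one agent-media invocation (hypothetical helper)."""
    argv = ["npx", "agent-media@latest", media_type, action, *extra_args]
    if provider is not None:
        # Assumption: --provider may follow the action and its other flags.
        argv += ["--provider", provider]
    return argv


def run_command(argv: Sequence[str]) -> str:
    """Run the command and return whatever it printed to stdout."""
    # The exit code is not checked here because this document does not say
    # how it relates to the JSON "ok" field.
    completed = subprocess.run(list(argv), capture_output=True, text=True)
    return completed.stdout
```

Keeping command construction separate from execution lets the argv be logged or inspected before anything is run.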
Output Format
All commands return JSON to stdout:
{
"ok": true,
"media_type": "image",
"action": "resize",
"provider": "local",
"output_path": "output_123.webp",
"mime": "image/webp",
"bytes": 12345
}
On error:
{
"ok": false,
"error": {
"code": "INVALID_INPUT",
"message": "input file not found"
}
}
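Because both shapes share the `ok` field, a caller can branch on it before reading anything else. The following is a minimal Python sketch of that check, assuming the JSON has already been captured from stdout; `parse_result` and `AgentMediaError` are hypothetical names, and the `"UNKNOWN"` default code is an assumption for malformed error payloads.

```python
import json


class AgentMediaError(Exception):
    """Raised when a command reports ok: false (hypothetical helper class)."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_result(stdout: str) -> dict:
    """Parse the JSON a command printed to stdout; raise on a reported error."""
    result = json.loads(stdout)
    if result.get("ok"):
        return result
    error = result.get("error", {})
    # "UNKNOWN" is an assumed placeholder when no error code is present.
    raise AgentMediaError(error.get("code", "UNKNOWN"), error.get("message", ""))
```

On success the returned dictionary exposes fields such as `output_path`, `mime`, and `bytes`; on failure the exception carries the `code` and `message` from the error object.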
Providers
- local - Default provider using Sharp (resize, convert, extend, crop) and Transformers.js (remove-background, upscale, transcribe)
- fal - fal.ai provider (generate, edit, remove-background, upscale, transcribe, video)
- replicate - Replicate API (generate, edit, remove-background, upscale, transcribe, video)
- runpod - Runpod API (generate, edit, video)
- ai-gateway - Vercel AI Gateway (generate, edit)
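The provider list can also be read as a lookup table from action to candidate providers. The Python sketch below simply transcribes the list above; `PROVIDER_ACTIONS` and `providers_for` are hypothetical names, and `"video"` is assumed to stand for video generation.

```python
from typing import Dict, List

# Capabilities transcribed from the provider list; nothing here is queried
# from the toolkit itself.
PROVIDER_ACTIONS: Dict[str, List[str]] = {
    "local": ["resize", "convert", "extend", "crop",
              "remove-background", "upscale", "transcribe"],
    "fal": ["generate", "edit", "remove-background",
            "upscale", "transcribe", "video"],
    "replicate": ["generate", "edit", "remove-background",
                  "upscale", "transcribe", "video"],
    "runpod": ["generate", "edit", "video"],
    "ai-gateway": ["generate", "edit"],
}


def providers_for(action: str) -> List[str]:
    """Return the providers listed as supporting the given action."""
    return [name for name, actions in PROVIDER_ACTIONS.items() if action in actions]
```

For example, `providers_for("resize")` yields only the local provider, while `providers_for("edit")` yields every remote provider.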
Provider Selection
- Explicit: --provider <name>
- Auto-detect from environment variables
- Fallback to local provider
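Read as a priority order, the selection rules can be sketched in a few lines of Python. This illustrates the logic and is not the toolkit's implementation: `select_provider` is a hypothetical name, the variable names are taken from the Environment Variables section, and the order in which the variables are checked is an assumption, since no precedence among them is stated. The sketch also ignores whether the chosen provider supports the requested action.

```python
import os
from typing import Mapping, Optional

# Assumed check order; only the variable-to-provider pairing is documented.
ENV_TO_PROVIDER = [
    ("FAL_API_KEY", "fal"),
    ("REPLICATE_API_TOKEN", "replicate"),
    ("RUNPOD_API_KEY", "runpod"),
    ("AI_GATEWAY_API_KEY", "ai-gateway"),
]


def select_provider(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a provider: explicit choice, then environment, then local."""
    if explicit:
        return explicit
    env = os.environ if env is None else env
    for variable, provider in ENV_TO_PROVIDER:
        if env.get(variable):
            return provider
    return "local"
```

Passing `env` explicitly keeps the function testable without touching the real process environment.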
Environment Variables
- AGENT_MEDIA_DIR - Custom output directory
- FAL_API_KEY - Enable fal provider
- REPLICATE_API_TOKEN - Enable replicate provider
- RUNPOD_API_KEY - Enable runpod provider
- AI_GATEWAY_API_KEY - Enable ai-gateway provider