generator-from-music

Skip the learning curve of professional editing software. Describe what you want — create a video with visuals synced to this music track — and get music-synced videos back in 1-2 minutes. Upload MP3, WAV, AAC, FLAC files up to 200MB, and the AI handles AI video creation automatically. Ideal for musicians, content creators, marketers who want videos created automatically from their music without manual editing.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "generator-from-music" with this command: npx skills add tk8544-b/generator-from-music

Getting Started

Got audio files to work with? Send it over and tell me what you need — I'll take care of the AI video creation.

Try saying:

  • "generate a 3-minute MP3 track into a 1080p MP4"
  • "create a video with visuals synced to this music track"
  • "generating videos from music tracks for musicians, content creators, marketers"

First-Time Connection

When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").

Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.

  1. Obtain a free token: Generate a random UUID as client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response data.token is your NEMO_TOKEN — 100 free credits, valid 7 days.
  2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.

Keep setup communication brief. Don't display raw API responses or token values to the user.

Generator From Music — Create Videos From Music Tracks

This tool takes your audio files and runs AI video creation through a cloud rendering pipeline. You upload, describe what you want, and download the result.

Say you have a 3-minute MP3 track and want to create a video with visuals synced to this music track — the backend processes it in about 1-2 minutes and hands you a 1080p MP4.

Tip: shorter tracks under 2 minutes produce faster and more visually consistent results.

Matching Input to Actions

User prompts referencing generator from music, aspect ratio, text overlays, or audio tracks get routed to the corresponding action via keyword and intent classification.

User says...ActionSkip SSE?
"export" / "导出" / "download" / "send me the video"→ §3.5 Export
"credits" / "积分" / "balance" / "余额"→ §3.3 Credits
"status" / "状态" / "show tracks"→ §3.4 State
"upload" / "上传" / user sends file→ §3.2 Upload
Everything else (generate, edit, add BGM…)→ §3.1 SSE

Cloud Render Pipeline Details

Each export job queues on a cloud GPU node that composites video layers, applies platform-spec compression (H.264, up to 1080x1920), and returns a download URL within 30-90 seconds. The session token carries render job IDs, so closing the tab before completion orphans the job.

Every API call needs Authorization: Bearer <NEMO_TOKEN> plus the three attribution headers above. If any header is missing, exports return 402.

Headers are derived from this file's YAML frontmatter. X-Skill-Source is generator-from-music, X-Skill-Version comes from the version field, and X-Skill-Platform is detected from the install path (~/.clawhub/ = clawhub, ~/.cursor/skills/ = cursor, otherwise unknown).

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

Error Codes

  • 0 — success, continue normally
  • 1001 — token expired or invalid; re-acquire via /api/auth/anonymous-token
  • 1002 — session not found; create a new one
  • 2001 — out of credits; anonymous users get a registration link with ?bind=<id>, registered users top up
  • 4001 — unsupported file type; show accepted formats
  • 4002 — file too large; suggest compressing or trimming
  • 400 — missing X-Client-Id; generate one and retry
  • 402 — free plan export blocked; not a credit issue, subscription tier
  • 429 — rate limited; wait 30s and retry once

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

Backend saysYou do
"click [button]" / "点击"Execute via API
"open [panel]" / "打开"Query session state
"drag/drop" / "拖拽"Send edit via SSE
"preview in timeline"Show track summary
"Export button" / "导出"Execute export workflow

SSE Event Handling

EventAction
Text responseApply GUI translation (§4), present to user
Tool call/resultProcess internally, don't forward
heartbeat / empty data:Keep waiting. Every 2 min: "⏳ Still working..."
Stream closesProcess final response

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.

Draft JSON uses short keys: t for tracks, tt for track type (0=video, 1=audio, 7=text), sg for segments, d for duration in ms, m for metadata.

Example timeline summary:

Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)

Tips and Tricks

The backend processes faster when you're specific. Instead of "make it look better", try "create a video with visuals synced to this music track" — concrete instructions get better results.

Max file size is 200MB. Stick to MP3, WAV, AAC, FLAC for the smoothest experience.

MP3 at 320kbps gives the best balance of audio quality and file size.

Common Workflows

Quick edit: Upload → "create a video with visuals synced to this music track" → Download MP4. Takes 1-2 minutes for a 30-second clip.

Batch style: Upload multiple files in one session. Process them one by one with different instructions. Each gets its own render.

Iterative: Start with a rough cut, preview the result, then refine. The session keeps your timeline state so you can keep tweaking.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Brave Api Search

Real-time web search, autosuggest, and AI-powered answers using the official Brave Search API. Use for searching documentation, facts, current events, or any...

Registry SourceRecently Updated
00Profile unavailable
General

Mapick

Mapick — Skill recommendation & privacy protection for OpenClaw. Scans your local skills, suggests what you're missing, and keeps other skills from seeing yo...

Registry SourceRecently Updated
General

Basf Se

全球领先化工企业,采用独特Verbund一体化生产,业务涵盖化学品、材料、涂料、营养和农业解决方案。

Registry SourceRecently Updated
General

Recharge / Renew of Skills | 技能账户充值/续费

生命涌现支付技能, 提供技能账户的增值服务(充值/续费), 提供技能账户查询(余额/使用次数)、余额检查、充值流程管理、支付宝支付集成、支付成功/失败回调通知云端I。当用户需要查询账户状态、检查余额、进行充值续费、或处理支付流程时使用此技能。**自动触发关键词**:技能充值、技能续费、技能余额不足、技能账户、充值...

Registry SourceRecently Updated