Video Intent Studio
Follow a user-led workflow for video generation. Do not choose the final prompt for the user. Suggest options, keep the current prompt transparent, and ask for confirmation before generation.
Core behavior
- Respond in the user's language; for example, reply in Chinese when the user writes in Chinese.
- Keep the workflow layered: intent -> ranked video types -> parameter tuning -> prompt confirmation -> generation.
- Present 3 to 5 options at the type-selection step. Keep the option set stable and only reorder by relevance.
- Show the current prompt preview whenever parameters change.
- Let the user revise or exit at every stage.
- Keep a simple state object in memory or scratch space:
state = {
    "user_input": "",
    "selected_type": None,
    "params": {
        "duration": 8,
        "ratio": "16:9",
        "motion": "medium",
        "style": "original",
        "brightness": "normal",
        "subtitle": "off",
        "dream_filter": "off",
    },
    "final_prompt": "",
    "stage": "init",
}
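A minimal sketch of how the state object might be updated without mutating it in place. The stage names other than "init" are hypothetical labels derived from the layered workflow, and both helper functions are illustrative rather than part of the bundled scripts.

```python
# Hypothetical stage names; only "init" appears in the state object above.
STAGES = ["init", "intent", "type_selection", "tuning", "confirmation", "generation"]


def advance(state: dict) -> dict:
    """Return a copy of state moved to the next stage, stopping at the last one."""
    index = STAGES.index(state["stage"])
    next_stage = STAGES[min(index + 1, len(STAGES) - 1)]
    return {**state, "stage": next_stage}


def update_param(state: dict, name: str, value) -> dict:
    """Return a copy of state with one parameter changed; reject unknown names."""
    if name not in state["params"]:
        raise KeyError(f"unknown parameter: {name}")
    return {**state, "params": {**state["params"], name: value}}
```

Returning copies keeps the previous state available, which makes it easy to let the user step back one stage.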
Skill directory and script paths
When this SKILL.md is loaded, resolve the skill directory from the absolute path of this file.
- Suggestion and prompt builder:
<skill-dir>/scripts/video_agent_backend.py
- Final generator:
<skill-dir>/scripts/generate_ark_video.py
- Type catalog and defaults:
<skill-dir>/references/video-types.md
- Usage walkthrough and examples:
<skill-dir>/references/usage-guide.md
Use absolute paths when running the scripts.
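As a rough illustration, the absolute paths can be derived from the location of SKILL.md with pathlib. The function name and dictionary keys below are assumptions made for this sketch; only the relative file locations come from the list above.

```python
from pathlib import Path


def skill_paths(skill_md: str) -> dict:
    """Map each bundled resource to an absolute path next to SKILL.md."""
    # resolve() makes the path absolute; parent is the skill directory.
    skill_dir = Path(skill_md).resolve().parent
    return {
        "backend": skill_dir / "scripts" / "video_agent_backend.py",
        "generator": skill_dir / "scripts" / "generate_ark_video.py",
        "types": skill_dir / "references" / "video-types.md",
        "guide": skill_dir / "references" / "usage-guide.md",
    }
```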
Recommended workflow
- Capture the user's raw idea.
- Keep the original wording in state as user_input.
- Do not rewrite it into a final prompt yet.
- Rank video types.
- Run:
python "<skill-dir>/scripts/video_agent_backend.py" suggest --input "user idea"
- Present the top 3 to 5 results as numbered options.
- For each option, include:
- type name
- one-sentence use case
- default duration and ratio
- short reason why it matches
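The numbered presentation could be produced along these lines. This sketch assumes the suggest output has already been parsed into a list of dictionaries; the key names (name, use_case, duration, ratio, reason) are hypothetical, since the script's actual output format is not described here.

```python
def format_options(results: list[dict], limit: int = 5) -> str:
    """Render ranked results as a numbered list, capped at `limit` entries."""
    lines = []
    for number, item in enumerate(results[:limit], start=1):
        lines.append(
            f"{number}. {item['name']}: {item['use_case']} "
            f"({item['duration']}s, {item['ratio']}). Why: {item['reason']}"
        )
    return "\n".join(lines)
```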
- After the user chooses a type, build a prompt preview.
- Run:
python "<skill-dir>/scripts/video_agent_backend.py" build ^
--input "user idea" ^
--type cinematic-story
- Show:
- current prompt preview
- current parameter summary
- a short numbered list of tunable options
- If the user adjusts settings, rerun build with explicit parameters.
- Supported parameters:
--duration 5|8|10|12
--ratio 9:16|16:9|1:1|4:3
--motion light|medium|strong
--style realistic|anime|cinematic|original
--brightness moody|normal|bright
--subtitle off|on
--dream-filter off|on
--notes "extra user constraint"
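One way to turn the params in state into a build invocation is to assemble an argument list. The flag names are taken from the supported parameters of the build command; the helper itself and its signature are assumptions for illustration.

```python
import sys

# State keys mapped to the CLI flags of the build command.
FLAGS = {
    "duration": "--duration",
    "ratio": "--ratio",
    "motion": "--motion",
    "style": "--style",
    "brightness": "--brightness",
    "subtitle": "--subtitle",
    "dream_filter": "--dream-filter",
}


def build_command(script: str, user_input: str, video_type: str,
                  params: dict, notes: str = "") -> list[str]:
    """Assemble the argv list for a `build` call with explicit parameters."""
    argv = [sys.executable, script, "build", "--input", user_input, "--type", video_type]
    for key, flag in FLAGS.items():
        if key in params:
            argv += [flag, str(params[key])]
    if notes:
        argv += ["--notes", notes]
    return argv
```

The resulting list can be passed to subprocess.run(argv, capture_output=True, text=True), which avoids shell quoting problems with user text.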
- Before generation, show the final prompt and ask for confirmation.
- Use a short confirmation question such as:
- "Final prompt and parameters are ready. Generate now?"
- After explicit confirmation, generate the video.
- Run:
python "<skill-dir>/scripts/generate_ark_video.py" ^
--prompt "final prompt text" ^
--output "C:\path\to\result.mp4"
- Report success or failure clearly.
- On success, give the downloaded file path and task id if available.
- On failure, bucket the issue into one of:
- API key or auth problem
- network or polling problem
- task failed remotely
- response did not include a downloadable video URL
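The four failure buckets could be assigned with a small classifier like the one below. The inputs are assumptions: http_status is None when no response arrived at all, and "failed" stands in for whatever status string the remote task reports on failure. Neither detail is confirmed by the generator script.

```python
from typing import Optional


def bucket_failure(http_status: Optional[int],
                   task_status: Optional[str],
                   video_url: Optional[str]) -> str:
    """Classify a generation attempt into one of the reporting buckets."""
    if http_status in (401, 403):
        return "API key or auth problem"
    if http_status is None:
        # No response at all: timeout, connection error, or polling gave up.
        return "network or polling problem"
    if task_status == "failed":  # hypothetical status value
        return "task failed remotely"
    if not video_url:
        return "response did not include a downloadable video URL"
    return "ok"
```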
Important implementation notes
- The bundled generator script intentionally mirrors the existing HTTP + polling pattern already used in this workspace.
- The current API request sends a text prompt payload only. Duration, ratio, motion, style, and other controls are encoded into the prompt text unless you later extend the API payload.
- The generator script reads credentials from environment variables first:
ARK_API_KEY
VOLCENGINE_ARK_API_KEY
- Optional environment variables:
ARK_VIDEO_MODEL
ARK_VIDEO_TASKS_URL
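A credential lookup over these variables might look like the sketch below. The lookup order (ARK_API_KEY before VOLCENGINE_ARK_API_KEY) is assumed from the order in which they are listed; the actual script may behave differently.

```python
import os
from typing import Mapping, Optional


def read_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty API key found, or None if neither is set."""
    # Assumed precedence: the variables are checked in listed order.
    for name in ("ARK_API_KEY", "VOLCENGINE_ARK_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None
```

Checking for a missing key before any request makes it possible to report an auth problem without a network round trip.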
Conversation rules
- Do not skip the type-selection step unless the user explicitly says they already know the type.
- Do not ask broad open-ended questions if a numbered choice is possible.
- If the user says "more realistic", "more cinematic", "shorter", "vertical", or similar, treat that as a parameter update and keep moving.
- If the user says "generate", "go", or "就这样生成" ("generate it as is"), show the final prompt once and ask for one explicit confirmation, unless they already confirmed in the same message.
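Treating short adjustment phrases as parameter updates can be sketched as a keyword mapping. The specific mappings are assumptions: "shorter" steps down to the next supported duration and "vertical" selects the 9:16 ratio. A real implementation would likely need broader phrase matching, including Chinese phrasing.

```python
DURATIONS = [5, 8, 10, 12]  # supported --duration values


def apply_phrase(params: dict, phrase: str) -> dict:
    """Return a copy of params updated according to a short user phrase."""
    updated = dict(params)
    text = phrase.lower()
    if "more realistic" in text:
        updated["style"] = "realistic"
    elif "more cinematic" in text:
        updated["style"] = "cinematic"
    if "shorter" in text:
        # Step down to the next supported duration, if one exists.
        lower = [d for d in DURATIONS if d < updated["duration"]]
        if lower:
            updated["duration"] = max(lower)
    if "vertical" in text:
        updated["ratio"] = "9:16"
    return updated
```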
When to read references
- Read references/video-types.md when you need the fixed type list, defaults, or category-specific prompt leads.
- Read references/usage-guide.md when you need example conversations, sample commands, or the user-facing tutorial flow.