Aliyun Speech Transcriber
Use this skill to turn externally accessible media URLs into transcript results.
Current scope
Current implementation focuses on DashScope file transcription using the paraformer-v2 model, aligned with the existing Java service pattern.
Required environment variables
ASR_DASHSCOPE_API_KEY
Fallback supported:
DASHSCOPE_API_KEY
Optional:
- ALIYUN_SPEECH_MODEL - defaults to paraformer-v2
- ALIYUN_SPEECH_LANG_HINTS - defaults to zh,en
- ALIYUN_SPEECH_POLL_SECONDS - defaults to 5
- ALIYUN_SPEECH_TIMEOUT_SECONDS - defaults to 1800
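The variables above could be resolved roughly as follows. This is a minimal sketch, not the script's actual code: the helper names resolveConfig and toPositiveNumber are hypothetical, and splitting the language hints on commas is an assumption about the format.

```javascript
// Hypothetical helper: resolve settings from environment variables.
function resolveConfig(env = process.env) {
  // Prefer the dedicated key, then fall back to the generic DashScope key.
  const apiKey = env.ASR_DASHSCOPE_API_KEY || env.DASHSCOPE_API_KEY;
  if (!apiKey) {
    // Fail fast instead of sending an unauthenticated request.
    throw new Error('Missing ASR_DASHSCOPE_API_KEY (or fallback DASHSCOPE_API_KEY)');
  }
  return {
    apiKey,
    model: env.ALIYUN_SPEECH_MODEL || 'paraformer-v2',
    // Assumption: hints are a comma-separated list such as "zh,en".
    langHints: (env.ALIYUN_SPEECH_LANG_HINTS || 'zh,en')
      .split(',')
      .map((hint) => hint.trim())
      .filter(Boolean),
    pollSeconds: toPositiveNumber(env.ALIYUN_SPEECH_POLL_SECONDS, 5),
    timeoutSeconds: toPositiveNumber(env.ALIYUN_SPEECH_TIMEOUT_SECONDS, 1800),
  };
}

// Hypothetical helper: parse a positive number, using the fallback when the
// value is unset, empty, non-numeric, or not greater than zero.
function toPositiveNumber(raw, fallback) {
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? n : fallback;
}
```

Passing the environment as a parameter keeps the function easy to exercise with a plain object instead of the real process.env.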
Inputs
Pass one or more externally accessible URLs:
node scripts/transcribe.js --file-url "https://example.com/audio.mp3"
Multiple files:
node scripts/transcribe.js --file-url "https://a.com/1.mp3" --file-url "https://a.com/2.mp3"
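A repeatable --file-url flag like the one above can be collected with Node's built-in util.parseArgs (available in Node 18.3 and later). The sketch below is illustrative only: parseFileUrls is a hypothetical helper, and whether scripts/transcribe.js parses its arguments this way is not confirmed by this document.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical helper: collect every --file-url value into an array.
function parseFileUrls(argv = process.argv.slice(2)) {
  const { values } = parseArgs({
    args: argv,
    // multiple: true gathers repeated occurrences of the flag into an array.
    options: { 'file-url': { type: 'string', multiple: true } },
  });
  const urls = values['file-url'] ?? [];
  if (urls.length === 0) {
    throw new Error('Provide at least one --file-url');
  }
  return urls;
}
```

By default parseArgs runs in strict mode, so an unknown flag raises an error rather than being silently ignored.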
Output
The script returns JSON with:
- success
- provider
- engine
- taskId
- requestId
- results
- text
text is a best-effort plain-text extraction from the final JSON result.
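A best-effort extraction could look like the sketch below. The result shape is an assumption: it supposes each entry in results may carry a transcripts array whose items have a text string. The function name extractText is hypothetical, and the real script may walk the JSON differently.

```javascript
// Hypothetical helper. Assumed shape: [{ transcripts: [{ text: '...' }] }, ...]
function extractText(results) {
  const parts = [];
  for (const result of results ?? []) {
    for (const transcript of result?.transcripts ?? []) {
      // Skip entries without usable text rather than failing the whole run.
      if (typeof transcript?.text === 'string' && transcript.text.trim() !== '') {
        parts.push(transcript.text.trim());
      }
    }
  }
  // One line per transcript; an empty string when nothing was found.
  return parts.join('\n');
}
```

Missing or malformed entries are skipped, so the function returns whatever text it can find instead of throwing.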
Chaining from Qiniu
Typical workflow:
- Use qiniu-upload to upload a local file.
- Prefer a signed private URL if the domain is not anonymously readable.
- Pass the returned URL into this skill.
Safety rules
- Never hardcode Aliyun credentials.
- Fail fast if DASHSCOPE_API_KEY is missing.
- Only send URLs the user intends to transcribe.
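One way to support these rules in code is to reject anything that is not a well-formed http(s) URL before it is sent for transcription. This is an illustrative check, not something the skill is known to perform, and assertHttpUrl is a hypothetical name.

```javascript
// Hypothetical helper: accept only well-formed http(s) URLs.
function assertHttpUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    throw new Error(`Not a valid URL: ${raw}`);
  }
  // Local schemes such as file: cannot be fetched by a remote service.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported URL scheme: ${url.protocol}`);
  }
  return url.href;
}
```

A check like this only validates the URL's form. It cannot tell whether the user actually intended the file to be transcribed.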