senseaudio-asr

Build and troubleshoot SenseAudio speech recognition integrations, including HTTP transcription (`/v1/audio/transcriptions`), realtime WebSocket ASR (`/ws/v1/audio/transcriptions`), audio quality analysis (`/v1/audio/analysis`), and recognition record queries (`/v1/audio/records`). Use this skill whenever the user asks for speech-to-text, diarization, translation, streaming ASR, or help selecting an ASR model or its parameters.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "senseaudio-asr" with this command: npx skills add scikkk/senseaudio-asr

SenseAudio ASR

Use this skill for all SenseAudio speech recognition tasks.

Credential source: read the API key from SENSEAUDIO_API_KEY and send it only in the Authorization: Bearer ... header. Do not place API keys in query parameters, logs, transcripts, or saved examples.
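
As a minimal sketch of the credential rule above, the following Python helpers read the key from `SENSEAUDIO_API_KEY` and build the `Authorization: Bearer ...` header. The function names `build_auth_headers` and `redact_headers` are hypothetical and not part of any SenseAudio SDK; the redaction helper is just one way to keep the key out of logs.

```python
import os


def build_auth_headers(env=None):
    """Build the Authorization header from SENSEAUDIO_API_KEY.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    api_key = env.get("SENSEAUDIO_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("SENSEAUDIO_API_KEY is not set")
    # The key goes only into the header, never into the URL or query string.
    return {"Authorization": f"Bearer {api_key}"}


def redact_headers(headers):
    """Return a copy of the headers that is safe to log (key masked)."""
    return {
        name: ("Bearer ***" if name.lower() == "authorization" else value)
        for name, value in headers.items()
    }
```

When logging a request for debugging, pass the headers through `redact_headers` first so the key never reaches log files or saved examples.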

Read First

  • references/asr.md

Workflow

  1. Pick recognition mode:
     • HTTP file transcription for offline audio.
     • WebSocket for realtime streaming microphone/audio chunks.
     • Audio analysis for noise and quality checks before recognition.
     • Records query for recent recognition history lookup.
  2. Choose model by feature needs:
     • Lite for low-cost basic transcription.
     • ASR for streaming, translation, diarization, sentiment, and timestamps.
     • Pro when diarization plus explicit max_speakers control is needed.
     • DeepThink for streaming, translation, and intelligent editing; do not send language, diarization, sentiment, timestamps, ITN, or punctuation controls.
  3. Build minimal request:
     • Required auth, file/audio format, model.
     • Add optional controls only when needed.
     • Keep uploaded files at or below 10MB; split longer audio before sending.
  4. Validate compatibility:
     • Check model-parameter support before sending.
     • Enforce WS pcm / 16000Hz / mono requirements.
     • For HTTP stream=true, expect SSE text deltas only, not structured verbose fields.
  5. Parse robustly:
     • Handle JSON/text/verbose/SSE forms.
     • Handle WS terminal events and failures.
     • Treat returned audio URLs, api_key, session_id, and trace_id as sensitive operational data.
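
The compatibility checks in the workflow above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the SenseAudio API: the option keys (`language`, `itn`, `punctuation`, and so on), the model labels used as strings, and both function names are hypothetical, so map them to the real parameter names in references/asr.md. The sketch also reads "10MB" conservatively as 10,000,000 bytes and treats `max_speakers` as a Pro-only control, which the workflow implies but does not state outright.

```python
# Assumption: "10MB" is read conservatively as decimal megabytes.
MAX_UPLOAD_BYTES = 10 * 1000 * 1000

# Hypothetical option keys for the controls DeepThink must not receive.
DEEPTHINK_UNSUPPORTED = frozenset(
    {"language", "diarization", "sentiment", "timestamps", "itn", "punctuation"}
)


def validate_http_request(model, options, file_size):
    """Return a list of problems; an empty list means the request looks sendable."""
    problems = []
    if file_size > MAX_UPLOAD_BYTES:
        problems.append(
            f"file is {file_size} bytes; split it to stay at or below {MAX_UPLOAD_BYTES}"
        )
    if model == "DeepThink":
        # Report each unsupported control in a stable (sorted) order.
        for key in sorted(DEEPTHINK_UNSUPPORTED & set(options)):
            problems.append(f"DeepThink does not accept '{key}'")
    if "max_speakers" in options and model != "Pro":
        # Assumption: explicit max_speakers control is a Pro feature.
        problems.append("max_speakers requires the Pro model")
    return problems


def validate_ws_audio(encoding, sample_rate_hz, channels):
    """Check the WebSocket audio requirements: pcm, 16000 Hz, mono."""
    problems = []
    if encoding != "pcm":
        problems.append(f"encoding must be pcm, got {encoding}")
    if sample_rate_hz != 16000:
        problems.append(f"sample rate must be 16000 Hz, got {sample_rate_hz}")
    if channels != 1:
        problems.append(f"audio must be mono, got {channels} channels")
    return problems
```

Returning a list of problems rather than raising on the first one lets the caller report every incompatibility in a single pass before any audio is uploaded.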
