pronunciation-coach

Pronunciation coaching backed by voice analysis from Azure Speech Services. Analyzes audio files and reports phoneme-level accuracy, fluency, and prosody (intonation) scores.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "pronunciation-coach" with this command: npx skills add crazybuffon/pronunciation-coach

Pronunciation Coach

Analyze spoken English pronunciation using Azure Speech Services and provide actionable coaching feedback.

Privacy Note: This skill reads local voice messages from ~/.openclaw/media/inbound/ and transmits them to Microsoft Azure Speech Services for processing.

Prerequisites

  • Azure Speech API Key: Set AZURE_SPEECH_KEY env var
  • Azure Speech Region: Set AZURE_SPEECH_REGION env var (e.g., southeastasia)
  • ffmpeg: Required for audio format conversion (must be on PATH)
  • Node.js: Required for report generation
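
A preflight check can catch a missing key or a missing ffmpeg binary before any audio is processed. The sketch below is illustrative and not part of the skill; the function names missingEnvVars, isOnPath, and preflight are hypothetical, and only the two environment variable names come from the list above.

```javascript
const { spawnSync } = require("node:child_process");

// Environment variables named in the prerequisites above.
const REQUIRED_ENV = ["AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION"];

// Return the names of required variables that are unset or blank.
function missingEnvVars(env = process.env) {
  return REQUIRED_ENV.filter((name) => !env[name] || env[name].trim() === "");
}

// Return true if `command` can be launched from PATH.
// spawnSync sets `error` (for example ENOENT) when the binary cannot be started.
function isOnPath(command, args = ["-version"]) {
  const result = spawnSync(command, args, { stdio: "ignore" });
  return !result.error;
}

// Hypothetical preflight: collect readable problems instead of throwing.
function preflight(env = process.env) {
  const problems = missingEnvVars(env).map((name) => `${name} is not set`);
  if (!isOnPath("ffmpeg")) problems.push("ffmpeg was not found on PATH");
  return problems;
}
```

An empty array from preflight() means the checks covered here passed; it does not verify that the key is valid for the configured region.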

Workflow

1. Receive Audio

Voice messages from Telegram are stored in ~/.openclaw/media/inbound/. Find the latest .ogg file matching the message timestamp.

ls -lt ~/.openclaw/media/inbound/*.ogg | head -5
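
The same lookup can be done from Node.js when the file path is needed programmatically. This is a minimal sketch, assuming modification time is an acceptable stand-in for the message timestamp; latestAudioFile is a hypothetical helper, not something the skill ships.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Directory named in the workflow above.
const INBOUND_DIR = path.join(os.homedir(), ".openclaw", "media", "inbound");

// Return the path of the most recently modified file with the given
// extension, or null if the directory contains none.
function latestAudioFile(dir = INBOUND_DIR, ext = ".ogg") {
  const candidates = fs
    .readdirSync(dir)
    .filter((name) => name.toLowerCase().endsWith(ext))
    .map((name) => {
      const fullPath = path.join(dir, name);
      return { fullPath, mtimeMs: fs.statSync(fullPath).mtimeMs };
    })
    .sort((a, b) => b.mtimeMs - a.mtimeMs); // newest first

  return candidates.length > 0 ? candidates[0].fullPath : null;
}
```

If several voice messages arrive close together, the newest file may not be the one the user means, so comparing against the message timestamp remains the safer check.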

2. Run Assessment

scripts/pronunciation-assess.sh <audio_file> "<reference_text>"
  • audio_file: Path to the voice message (ogg, wav, mp3, or m4a)
  • reference_text: The text the speaker intended to say, taken from the transcript
  • The script converts the input to 16 kHz mono WAV automatically, whatever the source format
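
The conversion step can be pictured as a single ffmpeg call. The listing does not show the script's actual invocation, so the sketch below is an assumption: it uses standard ffmpeg options (-ar, -ac, -c:a) to produce 16 kHz mono 16-bit PCM, and the helper names and the .16k.wav output suffix are hypothetical.

```javascript
const { spawnSync } = require("node:child_process");
const path = require("node:path");

// Build ffmpeg arguments that resample any input to 16 kHz mono PCM WAV.
function buildFfmpegArgs(inputPath, outputPath) {
  return [
    "-y",                // overwrite the output file if it already exists
    "-i", inputPath,     // source audio (ogg/wav/mp3/m4a)
    "-ar", "16000",      // sample rate: 16 kHz
    "-ac", "1",          // channels: mono
    "-c:a", "pcm_s16le", // codec: signed 16-bit little-endian PCM
    outputPath,
  ];
}

// Derive an output path next to the input. The extra ".16k" keeps the
// output distinct from the input when the input is already a .wav file.
function wavPathFor(inputPath) {
  const { dir, name } = path.parse(inputPath);
  return path.join(dir, `${name}.16k.wav`);
}

// Run the conversion and return the path of the converted file.
function convertToWav(inputPath) {
  const outputPath = wavPathFor(inputPath);
  const result = spawnSync("ffmpeg", buildFfmpegArgs(inputPath, outputPath), {
    stdio: "inherit",
  });
  if (result.error) throw result.error; // for example, ffmpeg is not on PATH
  if (result.status !== 0) {
    throw new Error(`ffmpeg exited with status ${result.status}`);
  }
  return outputPath;
}
```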

3. Generate Report

Pipe the JSON output into the report generator:

scripts/pronunciation-assess.sh audio.ogg "reference text" | node scripts/pronunciation-report.js

The report includes:

  • Overall scores (Pronunciation, Accuracy, Fluency, Prosody, Completeness)
  • Word-by-word breakdown with per-phoneme scores
  • Problem sounds highlighted
  • Verdict with actionable next steps
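
To make the overall scores concrete, the sketch below pulls the five values out of an assessment result. It assumes the script passes Azure's pronunciation assessment JSON through unchanged, with field names such as PronScore and ProsodyScore on the first NBest entry; the listing does not confirm the exact shape, so the code accepts both a nested PronunciationAssessment object and flat fields. The function names are hypothetical and this is not the bundled pronunciation-report.js.

```javascript
// Report labels from the list above, mapped to assumed Azure field names.
const SCORE_FIELDS = {
  Pronunciation: "PronScore",
  Accuracy: "AccuracyScore",
  Fluency: "FluencyScore",
  Prosody: "ProsodyScore",
  Completeness: "CompletenessScore",
};

// Extract the overall scores; a score missing from the input becomes null.
function overallScores(result) {
  const best = result?.NBest?.[0];
  if (!best) throw new Error("No recognition result (NBest is missing or empty)");

  // Assumption: scores are either nested under PronunciationAssessment
  // or sit directly on the NBest entry.
  const source = best.PronunciationAssessment ?? best;

  const scores = {};
  for (const [label, field] of Object.entries(SCORE_FIELDS)) {
    scores[label] = typeof source[field] === "number" ? source[field] : null;
  }
  return scores;
}

// Render one "Label: value" line per score.
function formatScores(scores) {
  return Object.entries(scores)
    .map(([label, value]) => `${label}: ${value === null ? "n/a" : value.toFixed(1)}`)
    .join("\n");
}
```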

4. Provide Coaching

After generating the report:

  1. Send the text report to the user (scores + word breakdown)
  2. Identify top 3 problem sounds from the phoneme scores
  3. Explain each problem — what the correct sound is and how to produce it
    • See references/phoneme-guide.md for phoneme descriptions and fixes
  4. Send a voice message (via TTS) demonstrating the correct pronunciation of problem words
  5. Assign practice — give the user specific sentences to re-record focusing on weak sounds
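
Step 2 can be sketched as averaging each phoneme's accuracy across all words and taking the lowest three. The input shape here is an assumption modeled on Azure's word and phoneme entries (Word, Phonemes, Phoneme, AccuracyScore); topProblemSounds is a hypothetical helper, and averaging is one reasonable ranking choice rather than the skill's documented method.

```javascript
// Read a phoneme's accuracy from either a nested or a flat entry (assumed shapes).
function phonemeScore(entry) {
  return entry.PronunciationAssessment?.AccuracyScore ?? entry.AccuracyScore;
}

// Rank phonemes by average accuracy, lowest first, and keep the first `limit`.
function topProblemSounds(words, limit = 3) {
  const byPhoneme = new Map();

  for (const word of words) {
    for (const entry of word.Phonemes ?? []) {
      const score = phonemeScore(entry);
      if (typeof score !== "number") continue; // skip phonemes without a score

      const stats = byPhoneme.get(entry.Phoneme) ?? {
        phoneme: entry.Phoneme,
        total: 0,
        count: 0,
        words: new Set(),
      };
      stats.total += score;
      stats.count += 1;
      stats.words.add(word.Word); // remember which words contained the sound
      byPhoneme.set(entry.Phoneme, stats);
    }
  }

  return [...byPhoneme.values()]
    .map((s) => ({ phoneme: s.phoneme, average: s.total / s.count, words: [...s.words] }))
    .sort((a, b) => a.average - b.average)
    .slice(0, limit);
}
```

The words list attached to each result gives ready-made candidates for the TTS demonstration and the practice sentences in steps 4 and 5.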

Coaching Tips

  • Scores ≥ 90: Excellent, minor polish
  • Scores from 70 up to (but not including) 90: Good, targeted practice needed
  • Scores < 70: Needs focused drill on that specific sound
  • "Omission" errors mean the word wasn't detected; the speaker may have been too quiet or mumbled
  • A Prosody score < 85 suggests monotone delivery; coach on intonation rises and falls
  • Compare scores across multiple recordings to track improvement
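
The thresholds above translate directly into a small lookup, sketched here with hypothetical function names. One assumption is made explicit: the middle band is treated as 70 up to, but not including, 90, so a fractional score such as 89.5 still lands in "Good".

```javascript
// Map a 0-100 score to the coaching band from the tips above.
function scoreBand(score) {
  if (score >= 90) return "Excellent, minor polish";
  if (score >= 70) return "Good, targeted practice needed";
  return "Needs focused drill";
}

// The tips treat a prosody score below 85 as a sign of monotone delivery.
function needsIntonationCoaching(prosodyScore) {
  return prosodyScore < 85;
}

// Compare two recordings: positive numbers mean the newer score is higher.
// Only scores that are numeric in both recordings are compared.
function scoreDeltas(previous, current) {
  const deltas = {};
  for (const [label, value] of Object.entries(current)) {
    if (typeof value === "number" && typeof previous[label] === "number") {
      deltas[label] = value - previous[label];
    }
  }
  return deltas;
}
```

For example, scoreDeltas({ Accuracy: 72 }, { Accuracy: 81 }) reports a gain of 9 points in Accuracy between the two recordings.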

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
