edge-tts

Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rather than read (multitasking, accessibility, driving, cooking). (3) User wants a specific voice, speed, pitch, or format for TTS output.

Install skill "edge-tts" with this command: npx skills add 17854566382/tts-1

Edge-TTS Skill

Overview

Generate high-quality text-to-speech audio using Microsoft Edge's neural TTS service via the node-edge-tts npm package. Supports multiple languages, voices, adjustable speed/pitch, and subtitle generation.

Quick Start

When you detect TTS intent from triggers or user request:

  1. Call the tts tool (Clawdbot built-in) to convert text to speech
  2. The tool returns a MEDIA: path
  3. Clawdbot routes the audio to the current channel

// Example: Built-in tts tool usage
tts("Your text to convert to speech")
// Returns: MEDIA: /path/to/audio.mp3

Trigger Detection

Treat the "tts" keyword as a TTS request. Before conversion, the skill automatically filters TTS-related keywords out of the text so that the trigger words themselves are not spoken.
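
The filtering step described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function name detectAndStripTrigger and the exact regular expressions are assumptions, and only the keywords this document names (tts, TTS, text-to-speech) are covered.

```javascript
// Hypothetical helper: the skill's real filtering logic is not shown in this document.
// Matches the trigger as a whole word, plus an optional trailing ":" or "," and spaces.
const TRIGGER_PATTERN = /\b(?:tts|text-to-speech)\b[:,]?\s*/gi;

function detectAndStripTrigger(message) {
  // A separate non-global regex avoids lastIndex state between calls.
  const hasTrigger = /\b(?:tts|text-to-speech)\b/i.test(message);
  const text = message
    .replace(TRIGGER_PATTERN, "")
    .replace(/\s+/g, " ")
    .trim();
  return { hasTrigger, text };
}

console.log(detectAndStripTrigger("tts: Good morning, everyone"));
```

The word-boundary anchors keep words that merely contain the letters, such as "watts", from being treated as a trigger.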

Advanced Customization

Using the Node.js Scripts

For more control, use the bundled scripts directly:

TTS Converter

cd scripts
npm install
node tts-converter.js "Your text" --voice en-US-AriaNeural --rate +10% --output output.mp3

Options:

  • --voice, -v: Voice name (default: en-US-AriaNeural)
  • --lang, -l: Language code (e.g., en-US, es-ES)
  • --format, -o: Output format (default: audio-24khz-48kbitrate-mono-mp3)
  • --pitch: Pitch adjustment (e.g., +10%, -20%, default)
  • --rate, -r: Rate adjustment (e.g., +10%, -20%, default)
  • --volume: Volume adjustment (e.g., +0%, -10%, default)
  • --save-subtitles, -s: Save subtitles as JSON file
  • --output, -f: Output file path (default: tts_output.mp3)
  • --proxy, -p: Proxy URL (e.g., http://localhost:7890)
  • --timeout: Request timeout in milliseconds (default: 10000)
  • --list-voices, -L: List available voices
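
When the converter is launched from another Node.js program, the options above can be assembled into an argument array rather than a shell string, which avoids quoting problems with values such as +10%. The sketch below is illustrative only: buildConverterArgs is a hypothetical helper, and it assumes --save-subtitles is a boolean switch that takes no value.

```javascript
// Option names mapped to the long flags listed in the options above.
const FLAG_BY_OPTION = {
  voice: "--voice",
  lang: "--lang",
  format: "--format",
  pitch: "--pitch",
  rate: "--rate",
  volume: "--volume",
  output: "--output",
  proxy: "--proxy",
  timeout: "--timeout",
};

// Hypothetical helper: returns the argv to pass to `node`.
function buildConverterArgs(text, options = {}) {
  const args = ["tts-converter.js", text];
  for (const [name, flag] of Object.entries(FLAG_BY_OPTION)) {
    if (options[name] !== undefined) args.push(flag, String(options[name]));
  }
  // Assumption: --save-subtitles is a switch without a value.
  if (options.saveSubtitles) args.push("--save-subtitles");
  return args;
}

console.log(buildConverterArgs("Your text", { voice: "en-US-AriaNeural", rate: "+10%" }));
```

The resulting array could then be handed to child_process.execFileSync("node", args, { cwd: "scripts" }), assuming the scripts directory layout described in this document.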

Configuration Manager

cd scripts
npm install
node config-manager.js --set-voice en-US-AriaNeural

node config-manager.js --set-rate +10%

node config-manager.js --get

node config-manager.js --reset

Voice Selection

Common voices (use --list-voices for full list):

English:

  • en-US-MichelleNeural (female, natural, default)
  • en-US-AriaNeural (female, natural)
  • en-US-GuyNeural (male, natural)
  • en-GB-SoniaNeural (female, British)
  • en-GB-RyanNeural (male, British)

Other Languages:

  • es-ES-ElviraNeural (Spanish, Spain)
  • fr-FR-DeniseNeural (French)
  • de-DE-KatjaNeural (German)
  • ja-JP-NanamiNeural (Japanese)
  • zh-CN-XiaoxiaoNeural (Chinese)
  • ar-SA-ZariyahNeural (Arabic)

Rate Guidelines

Rate values use percentage format:

  • "default": Normal speed
  • "-20%" to "-10%": Slow, clear (tutorials, stories, accessibility)
  • "+10%" to "+20%": Slightly fast (summaries)
  • "+30%" to "+50%": Fast (news, efficiency)
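
As a rough illustration of these guidelines, the sketch below validates a rate string and picks one value from each range by content type. The content-type names and the specific values chosen within each range are assumptions made for the example, not settings defined by the skill.

```javascript
// One illustrative value per guideline range (assumed picks, not skill defaults).
const RATE_BY_CONTENT = new Map([
  ["tutorial", "-10%"], // slow, clear
  ["summary", "+10%"],  // slightly fast
  ["news", "+30%"],     // fast
]);

// Accepts "default" or a signed whole-number percentage such as "+10%" or "-20%".
function isValidRate(value) {
  return value === "default" || /^[+-]\d+%$/.test(value);
}

// Falls back to normal speed for unknown content types.
function rateFor(contentType) {
  return RATE_BY_CONTENT.get(contentType) ?? "default";
}

console.log(rateFor("news"), isValidRate("+10%"), isValidRate("10%"));
```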

Output Formats

Choose audio quality based on use case:

  • audio-24khz-48kbitrate-mono-mp3: Standard quality (voice notes, messages)
  • audio-24khz-96kbitrate-mono-mp3: High quality (presentations, content)
  • audio-48khz-96kbitrate-stereo-mp3: Highest quality (professional audio, music)
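
The format names encode sample rate, bitrate, and channel layout, so the trade-off can be estimated before generating audio. The sketch below parses a format string and estimates file size; both function names are hypothetical, and the estimate assumes a constant bitrate with no container overhead.

```javascript
// Hypothetical helper: splits a format name like "audio-24khz-48kbitrate-mono-mp3".
function parseOutputFormat(format) {
  const match = /^audio-(\d+)khz-(\d+)kbitrate-(mono|stereo)-(\w+)$/.exec(format);
  if (!match) return null;
  const [, sampleRateKhz, bitrateKbps, channels, codec] = match;
  return {
    sampleRateKhz: Number(sampleRateKhz),
    bitrateKbps: Number(bitrateKbps),
    channels,
    codec,
  };
}

// Rough size in kilobytes: kilobits per second * seconds / 8 bits per byte.
function estimateSizeKB(format, seconds) {
  const parsed = parseOutputFormat(format);
  return parsed ? (parsed.bitrateKbps * seconds) / 8 : null;
}

console.log(estimateSizeKB("audio-24khz-48kbitrate-mono-mp3", 60));
```

Under these assumptions, one minute of audio is about 360 KB at 48 kbit/s and about 720 KB at 96 kbit/s.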

Resources

scripts/tts-converter.js

Main TTS conversion script using node-edge-tts. Generates audio files with customizable voice, rate, volume, pitch, and format. Supports subtitle generation and voice listing.

scripts/config-manager.js

Manages persistent user preferences for TTS settings (voice, language, format, pitch, rate, volume). Stores config in ~/.tts-config.json.
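
Persisting preferences in a JSON file typically means merging stored values over defaults on read and rewriting the file on each change. The sketch below shows that pattern only; it is not the code of config-manager.js, and the function names, the DEFAULTS object, and the stored key names are assumptions.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Path taken from this document; everything else here is illustrative.
const CONFIG_PATH = path.join(os.homedir(), ".tts-config.json");
const DEFAULTS = {
  format: "audio-24khz-48kbitrate-mono-mp3",
  pitch: "default",
  rate: "default",
  volume: "default",
};

// Stored values override defaults; a missing file yields the defaults.
function loadConfig(file = CONFIG_PATH) {
  try {
    return { ...DEFAULTS, ...JSON.parse(fs.readFileSync(file, "utf8")) };
  } catch (err) {
    if (err.code === "ENOENT") return { ...DEFAULTS };
    throw err;
  }
}

// Updates one key and writes the merged config back to disk.
function setOption(key, value, file = CONFIG_PATH) {
  const next = { ...loadConfig(file), [key]: value };
  fs.writeFileSync(file, JSON.stringify(next, null, 2));
  return next;
}

// Removes the file so later reads fall back to the defaults.
function resetConfig(file = CONFIG_PATH) {
  fs.rmSync(file, { force: true });
}
```

Each function takes an optional file path so the behaviour can be tried against a scratch file instead of the real ~/.tts-config.json.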

scripts/package.json

NPM package configuration with node-edge-tts dependency.

references/node_edge_tts_guide.md

Complete documentation for node-edge-tts npm package including:

  • Full voice list by language
  • Prosody options (rate, pitch, volume)
  • Usage examples (CLI and Module)
  • Subtitle generation
  • Output formats
  • Best practices and limitations

Refer to this guide when you need specific voice details or advanced features.

Voice Testing

Test different voices and preview audio quality at: https://tts.travisvn.com/

Installation

To use the bundled scripts:

cd /home/user/clawd/skills/public/edge-tts/scripts
npm install

This installs:

  • node-edge-tts - TTS library
  • commander - CLI argument parsing

Workflow

  1. Detect intent: Check for "tts" trigger or keyword in user message
  2. Choose method: Use built-in tts tool for simple requests, or scripts/tts-converter.js for customization
  3. Generate audio: Convert the target text (message, search results, summary)
  4. Return to user: The tts tool returns a MEDIA: path; Clawdbot handles delivery
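
Step 2 of this workflow comes down to one decision: whether the request asks for any customization. A minimal sketch follows, assuming that any explicitly requested option means the script is needed; chooseMethod and its option names are hypothetical.

```javascript
// Options that the built-in tool call shown in this document does not take.
const CUSTOM_KEYS = [
  "voice", "lang", "format", "pitch", "rate",
  "volume", "saveSubtitles", "output", "proxy",
];

// Hypothetical helper: picks the generation method for a request.
function chooseMethod(options = {}) {
  const needsScript = CUSTOM_KEYS.some((key) => options[key] !== undefined);
  return needsScript ? "scripts/tts-converter.js" : "built-in tts tool";
}

console.log(chooseMethod({}));                // simple request
console.log(chooseMethod({ rate: "+10%" }));  // customized request
```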

Testing

Basic Test

Run the test script to verify TTS functionality:

cd /home/user/clawd/skills/public/edge-tts/scripts
npm test

This generates a test audio file and verifies the TTS service is working.

Integration Test

Use the built-in tts tool for quick testing:

// Example: Test TTS with default settings
tts("This is a test of the TTS functionality.")

Configuration Test

Verify configuration persistence:

cd /home/user/clawd/skills/public/edge-tts/scripts
node config-manager.js --get
node config-manager.js --set-voice en-US-GuyNeural
node config-manager.js --get

Troubleshooting

  • Test connectivity: Run npm test to check if TTS service is accessible
  • Check voice availability: Use node tts-converter.js --list-voices to see available voices
  • Verify proxy settings: If using proxy, test with node tts-converter.js "test" --proxy http://localhost:7890
  • Check audio output: The test should generate test-output.mp3 in the scripts directory

Notes

  • node-edge-tts uses Microsoft Edge's online TTS service (updated, working authentication)
  • No API key needed (free service)
  • Output is MP3 format by default
  • Requires internet connection
  • Supports subtitle generation (JSON format with word-level timing)
  • Temporary file handling: By default, audio files are saved to the system's temporary directory (/tmp/edge-tts-temp/ on Unix, C:\Users\<user>\AppData\Local\Temp\edge-tts-temp\ on Windows) with unique filenames (e.g., tts_1234567890_abc123.mp3). Files are not deleted automatically; the calling application (Clawdbot) should clean them up after use. To keep a file permanently, specify a custom output path with the --output option.
  • TTS keyword filtering: The skill automatically filters out TTS-related keywords (tts, TTS, text-to-speech) from text before conversion to avoid converting the trigger words themselves to audio
  • For repeated preferences, use config-manager.js to set defaults
  • Default voice: en-US-MichelleNeural (female, natural)
  • Neural voices (ending in Neural) provide higher quality than Standard voices
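
The temporary file handling note above can be illustrated with a short sketch. It reproduces the naming pattern shown in the example filename (timestamp plus a short random suffix) and adds a cleanup helper for the caller; the function names and the way the suffix is generated are assumptions, not the skill's actual code.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Builds a unique path such as <tmpdir>/edge-tts-temp/tts_1234567890_abc123.mp3.
function tempAudioPath() {
  const dir = path.join(os.tmpdir(), "edge-tts-temp");
  fs.mkdirSync(dir, { recursive: true });
  const suffix = crypto.randomBytes(3).toString("hex"); // 6 hex characters
  return path.join(dir, `tts_${Date.now()}_${suffix}.mp3`);
}

// Called by the consuming application once the audio has been delivered.
function cleanupAudio(file) {
  fs.rmSync(file, { force: true }); // no error if the file is already gone
}

const file = tempAudioPath();
console.log(file);
```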
