echoforge-moss-voice

Voice-first OpenClaw skill powered by MOSS APIs. Use when a user wants spoken replies in a preferred timbre, either from an existing voice_id or from a reference audio clip.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "echoforge-moss-voice" with this command: npx skills add xiami2019/moss-tts

EchoForge Moss Voice

Use this skill to run voice interactions in the user's preferred timbre.

Required runtime config

  • MOSI_API_KEY (required)
  • MOSI_BASE_URL (optional, default https://studio.mosi.cn)

Always send:

  • Authorization: Bearer <MOSI_API_KEY>
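
As a minimal sketch of the configuration above, the following reads the two variables and builds the header sent with every request. The function name load_config and the choice to raise RuntimeError are illustrative assumptions, not part of the skill.

```python
import os

# Default taken from the config list above.
DEFAULT_BASE_URL = "https://studio.mosi.cn"


def load_config(env=None):
    """Return (base_url, headers) built from MOSI_API_KEY / MOSI_BASE_URL.

    `env` is any mapping; it defaults to os.environ. Hypothetical helper.
    """
    env = os.environ if env is None else env
    api_key = (env.get("MOSI_API_KEY") or "").strip()
    if not api_key:
        # The key itself is never echoed in the error message.
        raise RuntimeError("MOSI_API_KEY is not set")
    base_url = (env.get("MOSI_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")
    headers = {"Authorization": f"Bearer {api_key}"}
    return base_url, headers
```

Passing a plain dict instead of os.environ keeps the helper easy to test without touching the real environment.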

Inputs

Collect:

  • text (required, what to speak)
  • Voice source (one of):
    • voice_id (preferred when available), or
    • reference_audio (public URL), or
    • local audio path (upload first, then clone voice)

Optional:

  • expected_duration_sec
  • sampling_params:
    • max_new_tokens (default 512)
    • temperature (default 1.7)
    • top_p (default 0.8)
    • top_k (default 25)
  • meta_info (default false)
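
The inputs above could be assembled into a request body roughly as follows. This is a sketch: the helper name build_tts_payload is hypothetical, and placing sampling_params as a nested object in the payload is an assumption based on how the list is indented, not a confirmed schema.

```python
# Defaults copied from the optional-input list above.
DEFAULT_SAMPLING = {
    "max_new_tokens": 512,
    "temperature": 1.7,
    "top_p": 0.8,
    "top_k": 25,
}


def build_tts_payload(text, voice_id=None, reference_audio=None,
                      expected_duration_sec=None, sampling_params=None,
                      meta_info=False):
    """Build a TTS request body (hypothetical helper, assumed layout)."""
    if not text or not text.strip():
        raise ValueError("text is required")
    if not voice_id and not reference_audio:
        raise ValueError("provide voice_id or reference_audio")

    payload = {"model": "moss-tts", "text": text}
    # voice_id is preferred when both sources are available.
    if voice_id:
        payload["voice_id"] = voice_id
    else:
        payload["reference_audio"] = reference_audio

    if expected_duration_sec is not None:
        payload["expected_duration_sec"] = expected_duration_sec
    # Caller overrides are merged on top of the documented defaults.
    payload["sampling_params"] = {**DEFAULT_SAMPLING, **(sampling_params or {})}
    payload["meta_info"] = bool(meta_info)
    return payload
```

For example, build_tts_payload("Hello", voice_id="v1", sampling_params={"top_k": 10}) keeps the other three sampling defaults and overrides only top_k.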

Workflow

  1. Resolve voice source.
    • If voice_id is available, use it directly.
    • If only local audio path is available:
      • Upload file: POST /api/v1/files/upload with multipart field file.
      • Clone voice: POST /api/v1/voice/clone with file_id (or url).
      • If the returned voice status is not ACTIVE, poll GET /api/v1/voices/{voice_id} until it is ACTIVE or a timeout is reached.
    • If reference_audio URL is available, use it directly in TTS.
  2. Run TTS: POST /v1/audio/tts.
    • Required payload:
      • model: "moss-tts"
      • text
      • one of voice_id or reference_audio
  3. Parse response:
    • Decode audio_data (base64) to WAV.
    • Read duration_s and usage when present.
  4. Return a concise result:
    • voice_id used
    • output file path
    • duration
    • brief status message
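
Two pieces of the workflow above, polling for an ACTIVE voice and decoding the TTS response, can be sketched without committing to an HTTP client. The helper names, the result-dict keys, and the idea that the caller supplies a fetch_status callable returning the status string are all assumptions for illustration; only audio_data, duration_s, usage, and the ACTIVE status come from the workflow itself.

```python
import base64
import time
from pathlib import Path


def wait_until_active(fetch_status, timeout_s=60.0, interval_s=2.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns ACTIVE or the timeout passes.

    fetch_status is a caller-supplied callable (for example one wrapping
    GET /api/v1/voices/{voice_id}) that returns the status string.
    """
    deadline = clock() + timeout_s
    while True:
        if str(fetch_status()).upper() == "ACTIVE":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def save_tts_response(response, out_path, voice_id=None):
    """Decode base64 audio_data to a WAV file and return a concise result."""
    audio_b64 = response.get("audio_data")
    if not audio_b64:
        raise ValueError("response has no audio_data")
    path = Path(out_path)
    path.write_bytes(base64.b64decode(audio_b64))
    return {
        "voice_id": voice_id,
        "output_path": str(path),
        "duration_s": response.get("duration_s"),  # may be absent
        "usage": response.get("usage"),            # may be absent
        "status": "ok",
    }
```

Injecting sleep and clock keeps the polling loop testable without real waiting.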

Error handling

  • If 4010 or 4011: the API key is missing or invalid; ask the user to fix MOSI_API_KEY.
  • If 4020: insufficient credits; ask the user to recharge.
  • If 4029: rate limited; retry with exponential backoff.
  • If 5002: the audio URL is invalid or decoding failed; ask the user for another clip.
  • If 5004: timeout; shorten the text and retry.
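
One way to encode the table above is a lookup from error code to a recovery action, plus a capped exponential schedule for the 4029 case. The action labels, function names, and the base and cap values are illustrative assumptions; where the code appears in the API response is not specified here, so the sketch takes it as a plain integer.

```python
# Map each documented error code to a recovery action (labels are hypothetical).
ERROR_ACTIONS = {
    4010: "fix_api_key",
    4011: "fix_api_key",
    4020: "recharge",
    4029: "retry_with_backoff",
    5002: "ask_for_new_audio",
    5004: "shorten_text_and_retry",
}


def classify_error(code):
    """Return the recovery action for an error code, or 'unknown'."""
    return ERROR_ACTIONS.get(code, "unknown")


def backoff_delays(max_retries=4, base_s=1.0, cap_s=30.0):
    """Delays in seconds for successive retries: base, 2*base, 4*base, capped."""
    return [min(cap_s, base_s * 2 ** attempt) for attempt in range(max_retries)]
```

With the defaults, backoff_delays() yields 1, 2, 4, and 8 seconds. Adding random jitter to each delay is a common refinement when several clients may retry at once.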

Operational constraints

  • Keep the request rate at or below 5 requests per minute (RPM).
  • Keep the text of each request short enough to avoid a timeout.
  • Never print or log raw API keys.
  • Prefer reusing a stable voice_id across multi-turn voice chat to reduce latency.
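
The 5 RPM limit above could be enforced client-side with a small sliding-window limiter. This is a sketch under the assumption that the limit is counted over any rolling 60-second window; the class name RateLimiter and its parameters are hypothetical.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any period_s-second window."""

    def __init__(self, max_calls=5, period_s=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period_s = period_s
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self._period_s:
            self._calls.popleft()
        if len(self._calls) < self._max_calls:
            return 0.0
        return self._period_s - (now - self._calls[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while (delay := self.wait_time()) > 0:
            self._sleep(delay)
        self._calls.append(self._clock())
```

Calling limiter.acquire() before each TTS request keeps a single process under the limit; it does not coordinate across multiple processes sharing one key.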

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
