avatar

Interactive AI avatar with Simli video rendering and ElevenLabs TTS

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "avatar" with this command: npx skills add johannes-berggren/avatar

Avatar Skill

Interactive AI avatar interface for OpenClaw with real-time lip-synced video and text-to-speech.

Features

  • Voice Responses: Speaks conversational summaries using ElevenLabs TTS
  • Visual Avatar: Realistic lip-synced video via Simli
  • Detail Panel: Shows formatted markdown alongside spoken responses
  • Multi-language: Supports multiple languages for speech and TTS
  • Slack/Email: Forward responses to Slack DMs or email (when configured)
  • Stream Deck: Optional hardware control with Elgato Stream Deck

Setup

  1. Get API keys for Simli and ElevenLabs.

  2. Set environment variables:

    export SIMLI_API_KEY=your-key
    export ELEVENLABS_API_KEY=your-key
    
  3. Start the avatar:

    openclaw-avatar
    
  4. Open http://localhost:5173
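
Before starting the avatar, it can help to confirm that both keys are actually visible to the process. The following is a minimal sketch, not part of the skill itself: the variable names come from step 2, while the helper `missing_keys` is a hypothetical name introduced here for illustration.

```python
import os
from typing import Mapping

# Variable names taken from the setup steps above.
REQUIRED_KEYS = ("SIMLI_API_KEY", "ELEVENLABS_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank (hypothetical helper)."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


missing = missing_keys()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All API keys are set; run `openclaw-avatar` next.")
```

The check only verifies that the variables are non-empty; it does not contact Simli or ElevenLabs, so an invalid key would still pass.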

Response Format

When responding to avatar queries, use this format:

<spoken>
A short conversational summary (1-3 sentences). NO markdown, NO formatting. Plain speech only.
</spoken>
<detail>
Full detailed response with markdown formatting (bullet points, headers, bold, etc).
</detail>

Guidelines

  1. spoken: Brief, natural, conversational. This is read aloud.
  2. detail: Comprehensive information with proper markdown.
  3. Always include both sections.
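
The guidelines above can be sketched as a small builder that assembles a response and rejects obvious violations. This is an illustration under stated assumptions: `build_avatar_response` is a hypothetical name, and the markdown check is a rough heuristic chosen here, not a rule defined by the skill.

```python
import re

# Heuristic markers for markdown (bold, code, headings, bullets).
# This pattern is an assumption for illustration, not an exhaustive check.
_MARKDOWN_HINTS = re.compile(r"(\*\*|__|`|^\s*#{1,6}\s|^\s*[-*+]\s)", re.MULTILINE)


def build_avatar_response(spoken: str, detail: str) -> str:
    """Wrap the two sections in <spoken> and <detail> tags (hypothetical helper)."""
    spoken = spoken.strip()
    detail = detail.strip()
    # Guideline 3: both sections must be present.
    if not spoken or not detail:
        raise ValueError("both spoken and detail sections are required")
    # Guideline 1: the spoken part is read aloud, so it must be plain speech.
    if _MARKDOWN_HINTS.search(spoken):
        raise ValueError("spoken section must be plain speech without markdown")
    return f"<spoken>\n{spoken}\n</spoken>\n<detail>\n{detail}\n</detail>"


print(build_avatar_response("You have one meeting today.", "## Today\n- **9:00 AM**: Standup"))
```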

Example

User: "What meetings do I have today?"

<spoken>
You have three meetings today. Your first one is a team standup at 9 AM, then a product review at 2 PM, and finally a 1-on-1 with Sarah at 4 PM.
</spoken>
<detail>
## Today's Meetings

### 9:00 AM - Team Standup
- **Duration**: 15 minutes
- **Attendees**: Engineering team

### 2:00 PM - Product Review
- **Duration**: 1 hour
- **Attendees**: Product, Design, Engineering leads

### 4:00 PM - 1:1 with Sarah
- **Duration**: 30 minutes
- **Notes**: Follow up on project timeline
</detail>
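
A client consuming a reply like the one above needs to split it so that the spoken part goes to TTS and the detail part goes to the panel. The sketch below shows one way to do that with a regular expression; `AvatarResponse` and `parse_avatar_response` are hypothetical names, and the actual avatar application may parse replies differently.

```python
import re
from typing import NamedTuple


class AvatarResponse(NamedTuple):
    spoken: str  # plain text to hand to TTS
    detail: str  # markdown for the detail panel


# Non-greedy matches with DOTALL so each section may span several lines.
_PATTERN = re.compile(r"<spoken>(.*?)</spoken>\s*<detail>(.*?)</detail>", re.DOTALL)


def parse_avatar_response(text: str) -> AvatarResponse:
    """Split a reply into its spoken and detail sections (hypothetical helper)."""
    match = _PATTERN.search(text)
    if match is None:
        raise ValueError("expected <spoken>...</spoken> followed by <detail>...</detail>")
    return AvatarResponse(match.group(1).strip(), match.group(2).strip())


reply = """<spoken>
You have three meetings today.
</spoken>
<detail>
## Today's Meetings

### 9:00 AM - Team Standup
- **Duration**: 15 minutes
</detail>"""

parsed = parse_avatar_response(reply)
print(parsed.spoken)
print(parsed.detail)
```

Raising on a missing section, rather than returning an empty string, makes a reply that breaks the "always include both sections" rule fail loudly.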

Session Key

Avatar responses use the session key agent:main:avatar

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
