gemini-voice-assistant

Voice-to-voice AI assistant using Gemini Live API. Speak to the AI and get spoken responses. Use when you want to have natural voice conversations with an AI assistant powered by Google's Gemini models.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "gemini-voice-assistant" with this command: npx skills add alimostafaradwan/gemini-voice-assistant

Gemini Voice Assistant

A voice-to-voice AI assistant powered by Google's Gemini Live API. Speak to the AI and it responds with natural-sounding voice.

Usage

Text Mode

cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py "Your question or message"

Voice Mode

cd ~/.openclaw/agents/kashif/skills/gemini-assistant && python3 handler.py --audio /path/to/audio.ogg "optional context"

Response Format

The handler returns a JSON response:

{
  "message": "[[audio_as_voice]]\nMEDIA:/tmp/gemini_voice_xxx.ogg",
  "text": "Text response from Gemini"
}

Configuration

Set your Gemini API key:

export GEMINI_API_KEY="your-api-key-here"

Or create a .env file in the skill directory:

GEMINI_API_KEY=your-api-key-here

Model Options

The default model is gemini-2.5-flash-native-audio-preview-12-2025 for audio support.

To use a different model, edit handler.py:

MODEL = "gemini-2.0-flash-exp"  # For text-only

Requirements

  • google-genai>=1.0.0
  • numpy>=1.24.0
  • soundfile>=0.12.0
  • librosa>=0.10.0 (for audio input)
  • FFmpeg (for audio conversion)

Features

  • 🎙️ Voice input/output support
  • 💬 Text conversations
  • 🔧 Configurable system instructions
  • ⚡ Fast responses with Gemini Flash

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Cclaw

Open-source comedy AI + video editing + poster generation. Create standup/sketch/manzai/scripts, edit videos via FFmpeg, and generate comedy posters via canv...

Registry SourceRecently Updated
General

Bird Recognition Tool | 鸟类识别工具

Identifies bird species in images/videos of target areas. Supports recognition of no less than 500 common bird species, supports customized model training, s...

Registry SourceRecently Updated
General

Image Amazon Product Image Suite

A professional product image generation skill purpose-built for the Amazon e-commerce platform. Outputs comply with Amazon's image guidelines while optimizin...

Registry SourceRecently Updated
General

SearchOnlineAssets

Online asset search tool: queries public stock libraries (Pixabay) for high-quality photos, illustrations, vectors and videos, returning result metadata and...

Registry SourceRecently Updated