gemini-live-phone

Bridge Twilio phone calls to the Google Gemini Live API for real-time AI voice conversations. No STT/TTS middleware is required. Includes voice activity detection (VAD) and echo suppression.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "gemini-live-phone" with this command: npx skills add quantdeveloperusa/gemini-live-phone

Gemini Live Phone Bridge

Real-time voice AI over phone calls using Google Gemini's native audio capabilities.

Architecture

Phone ↔ Twilio ↔ WebSocket (μ-law 8kHz) ↔ Bridge (PCM transcoding) ↔ Gemini Live API (24kHz PCM)
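The transcoding step in the pipeline above can be sketched as follows. This is a minimal illustration, not the code in scripts/bridge.py: it assumes the standard Twilio Media Streams `media` event (JSON with a base64 μ-law payload), uses the standard G.711 μ-law expansion, and substitutes naive linear interpolation for whatever resampler the bridge really uses. The function names and the `factor` parameter are hypothetical; a factor of 3 maps 8 kHz to the 24 kHz named above, but the rate the bridge actually sends to Gemini is not stated here.

```python
import base64
import json
import struct

BIAS = 0x84  # G.711 mu-law bias


def ulaw_to_pcm16(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF  # mu-law bytes are stored bit-inverted
    magnitude = ((((u & 0x0F) << 3) + BIAS) << ((u >> 4) & 0x07)) - BIAS
    return -magnitude if u & 0x80 else magnitude


def upsample_linear(samples, factor):
    """Upsample by an integer factor using linear interpolation (illustrative only)."""
    out = []
    for i, a in enumerate(samples):
        b = samples[i + 1] if i + 1 < len(samples) else a
        for k in range(factor):
            out.append(a + (b - a) * k // factor)
    return out


def media_event_to_pcm(message: str, factor: int = 3) -> bytes:
    """Turn one Twilio 'media' event into little-endian 16-bit PCM bytes."""
    event = json.loads(message)
    if event.get("event") != "media":
        return b""  # start/stop/mark events carry no audio
    ulaw = base64.b64decode(event["media"]["payload"])
    pcm = upsample_linear([ulaw_to_pcm16(b) for b in ulaw], factor)
    return struct.pack(f"<{len(pcm)}h", *pcm)
```

The reverse direction (Gemini output back to μ-law 8 kHz) would need the matching downsample and μ-law compression, which this sketch omits.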

Quick Start

# Set required env vars
export GOOGLE_API_KEY="your-key"
export TWILIO_AUTH_TOKEN="your-token"

# Run the bridge
python scripts/bridge.py --port 3335

Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /gemini-live/status | GET | Health check + active calls |
| /gemini-live/incoming | POST | TwiML for inbound calls (Twilio webhook) |
| /gemini-live/stream | WS | Twilio Media Stream WebSocket |
| /gemini-live/call | POST | Initiate outbound call |
| /gemini-live/twiml | POST | TwiML for outbound calls |
| /gemini-live/call-status | POST | Twilio call status webhook |
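For orientation, the TwiML returned by the `/gemini-live/incoming` and `/gemini-live/twiml` endpoints presumably connects the call to the `/gemini-live/stream` WebSocket. The sketch below builds such a document with Twilio's `<Connect><Stream>` verbs. The exact TwiML the bridge emits is not shown in this listing, so the function name and the absence of any extra parameters are assumptions.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(host: str, path: str = "/gemini-live/stream") -> str:
    """Return TwiML that connects the call to a bidirectional Media Stream."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    # Twilio requires a secure WebSocket (wss://) URL for Media Streams.
    ET.SubElement(connect, "Stream", url=f"wss://{host}{path}")
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_stream_twiml("your-domain"))
```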

Outbound Call API

curl -X POST https://your-domain/gemini-live/call \
  -H 'Content-Type: application/json' \
  -d '{"to": "+1234567890", "greeting": "Hello! This is Marcia."}'
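The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above; the helper name is hypothetical, and the response body is not documented in this listing, so none is assumed.

```python
import json
import urllib.request


def build_call_request(base_url: str, to: str, greeting: str) -> urllib.request.Request:
    """Build a POST request for the /gemini-live/call endpoint."""
    body = json.dumps({"to": to, "greeting": greeting}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/gemini-live/call",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To send it: urllib.request.urlopen(build_call_request(...), timeout=10)
```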

Configuration

All settings can be supplied as CLI arguments or environment variables:

Core

  • --model — Gemini model (default: gemini-2.5-flash-native-audio-latest)
  • --voice — Gemini voice: Puck, Charon, Kore, Fenrir, Aoede, Leda, Orus, Zephyr (default: Kore)
  • --from-number — Twilio outbound number (default: env TWILIO_FROM)
  • --system-prompt — AI persona system prompt
  • --max-duration — Max call seconds (default: 300)
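To make the core flags concrete, here is a hedged `argparse` sketch that reproduces the defaults listed above. It is not taken from scripts/bridge.py. Only `TWILIO_FROM` is named in this document as an environment fallback, so no other environment variable names are assumed, and the `--system-prompt` default is left as `None` because the listing does not state one.

```python
import argparse
import os

VOICES = ["Puck", "Charon", "Kore", "Fenrir", "Aoede", "Leda", "Orus", "Zephyr"]


def build_parser() -> argparse.ArgumentParser:
    """Parser covering the core options and their documented defaults."""
    p = argparse.ArgumentParser(description="Gemini Live phone bridge (sketch)")
    p.add_argument("--model", default="gemini-2.5-flash-native-audio-latest")
    p.add_argument("--voice", choices=VOICES, default="Kore")
    p.add_argument("--from-number", default=os.environ.get("TWILIO_FROM"))
    p.add_argument("--system-prompt", default=None)  # default not documented
    p.add_argument("--max-duration", type=int, default=300, help="max call seconds")
    return p


if __name__ == "__main__":
    print(build_parser().parse_args())
```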

VAD (Voice Activity Detection)

  • --vad-enabled / --no-vad — Toggle server-side VAD (default: on)
  • --vad-silence-ms — Silence duration to trigger activityEnd (default: 500)
  • --vad-energy-threshold — RMS energy threshold (default: 0.01)
  • --vad-speech-min-ms — Min speech duration before activityStart (default: 100)
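One plausible reading of these three parameters is an energy-gated state machine: a frame counts as speech when its RMS energy reaches the threshold, `activityStart` fires after enough consecutive speech, and `activityEnd` fires after enough consecutive silence. The sketch below implements that reading. The class name, the 20 ms frame size, and the normalisation of RMS to the 0..1 range are assumptions; the bridge's actual logic may differ.

```python
import math
from dataclasses import dataclass


def rms(samples) -> float:
    """RMS energy of signed 16-bit samples, normalised to 0..1."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0


@dataclass
class EnergyVad:
    energy_threshold: float = 0.01  # --vad-energy-threshold
    silence_ms: int = 500           # --vad-silence-ms
    speech_min_ms: int = 100        # --vad-speech-min-ms
    frame_ms: int = 20              # assumed frame duration
    speaking: bool = False
    _speech_run: int = 0
    _silence_run: int = 0

    def process(self, samples):
        """Feed one frame; return 'activityStart', 'activityEnd', or None."""
        if rms(samples) >= self.energy_threshold:
            self._speech_run += self.frame_ms
            self._silence_run = 0
            if not self.speaking and self._speech_run >= self.speech_min_ms:
                self.speaking = True
                return "activityStart"
        else:
            self._silence_run += self.frame_ms
            self._speech_run = 0
            if self.speaking and self._silence_run >= self.silence_ms:
                self.speaking = False
                return "activityEnd"
        return None
```

With the defaults and 20 ms frames, `activityStart` fires on the fifth consecutive loud frame and `activityEnd` on the twenty-fifth consecutive quiet frame.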

Echo Suppression

  • --echo-multiplier — VAD threshold multiplier during agent speech (default: 3.0)
  • --echo-decay-ms — Decay time after agent stops speaking (default: 300)
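The two echo parameters can be read as raising the VAD energy threshold while the agent is speaking, then relaxing it back to the base value once the agent stops. A minimal sketch of that idea follows. The linear shape of the decay and the function name are assumptions, since the listing only gives the multiplier and the decay time.

```python
def effective_threshold(
    base: float,
    multiplier: float,
    decay_ms: float,
    agent_speaking: bool,
    ms_since_agent_stopped: float,
) -> float:
    """VAD threshold adjusted for echo of the agent's own audio."""
    if agent_speaking:
        return base * multiplier
    if ms_since_agent_stopped >= decay_ms:
        return base
    # Assumed linear ramp from base*multiplier down to base over decay_ms.
    remaining = 1.0 - ms_since_agent_stopped / decay_ms
    return base * (1.0 + (multiplier - 1.0) * remaining)
```

With the defaults (base 0.01, multiplier 3.0, decay 300 ms), the threshold is 0.03 during agent speech, about 0.02 at 150 ms after it stops, and back to 0.01 from 300 ms onward.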

Twilio Setup

  1. Buy a phone number on Twilio
  2. Set Voice webhook: https://your-domain/gemini-live/incoming (HTTP POST)
  3. Set Call status URL: https://your-domain/gemini-live/call-status (HTTP POST)
  4. Ensure geo-permissions are enabled for target countries

Network Requirements

The bridge must be accessible from the internet (Twilio connects to it). Recommended: Caddy reverse proxy with WebSocket support.

# Caddy config example
handle /gemini-live/* {
    reverse_proxy localhost:3335 {
        flush_interval -1
        transport http {
            read_timeout 0
            write_timeout 0
        }
    }
}

Performance

Latency benchmarks (Gemini 2.5 Flash Native Audio):

| Config | Median | Min | Max |
|---|---|---|---|
| No VAD, 200ms buffer | 3,660ms | 2,360ms | 5,180ms |
| Server VAD, 50ms buffer | 2,500ms | 2,080ms | 6,980ms |

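With server-side VAD and a 50 ms buffer, median latency drops by ~32% (3,660 ms to 2,500 ms) compared with no VAD and a 200 ms buffer. The two configurations also differ in buffer size, and the maximum observed latency rose from 5,180 ms to 6,980 ms.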
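The percentage can be checked directly from the table. The helper below is only illustrative arithmetic on the published figures.

```python
def percent_reduction(before_ms: float, after_ms: float) -> float:
    """Percentage by which after_ms is lower than before_ms (negative if higher)."""
    return (before_ms - after_ms) / before_ms * 100.0


if __name__ == "__main__":
    print(round(percent_reduction(3660, 2500), 1))  # median latency
    print(round(percent_reduction(5180, 6980), 1))  # max latency (negative: it got worse)
```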

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
