agent-right-brain

Give agents creative abilities using `rawgenai` — speak, listen, generate images/videos/music/sound effects, create multi-speaker dialogue, and manage voices. Use this skill when the user asks to "speak", "talk", "read aloud", "transcribe", "generate an image", "create a picture", "draw", "edit an image", "generate a video", "create a video", "animate", "generate music", "create a song", "generate sound effects", "create dialogue", "design a voice", "clone a voice", or any request involving voice, audio, image, or video creation.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "agent-right-brain" with this command: npx skills add whq25/rawgenai/whq25-rawgenai-agent-right-brain

Agent Right Brain

Use rawgenai <provider> <action> to give agents creative abilities. Always read the chosen provider's reference file before running commands.

Prerequisites

brew install WHQ25/tap/rawgenai

Before using a provider, read its setup guide at references/setup/ to configure credentials.

Input Sources (All Capabilities)

  1. Positional argument: rawgenai <provider> <action> "text" [flags]
  2. File: rawgenai <provider> <action> --file input.txt [flags]
  3. Stdin: echo "text" | rawgenai <provider> <action> [flags]
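
The three input sources can be tried side by side. The sketch below is a dry run by default: the `run` helper is hypothetical (not part of rawgenai) and only prints each command. The `openai tts` action and the `--speak` flag are taken from the Speak section of this document and serve purely as an illustration. Set `DRY_RUN=0` to execute for real, assuming rawgenai is installed and credentials are configured.

```shell
# Hypothetical helper (not part of rawgenai): with DRY_RUN=1 (the default)
# it prints the command instead of running it. Quoting is not preserved
# in the printed form.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

text="Hello from the agent."
input_file=$(mktemp)
printf '%s\n' "$text" > "$input_file"

# 1. Positional argument
run rawgenai openai tts "$text" --speak

# 2. File
run rawgenai openai tts --file "$input_file" --speak

# 3. Stdin
printf '%s\n' "$text" | run rawgenai openai tts --speak

rm -f "$input_file"
```

Stdin is usually the most convenient source when the text comes from another command, while `--file` suits longer inputs that already exist on disk.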

General Guidelines

  • On first use of a capability, ask the user to pick a provider, and remember that choice for the session.
  • All output is JSON. Always show file paths to the user.
  • For async commands (video, some image/audio): create -> status -> download.
  • If a command fails, try a different provider or inform the user.
  • Write image/video prompts descriptively: subject + action + environment + style + lighting.
  • For TTS: write natural conversational text, not markdown. Use --speak for playback or -o to save to a file.
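
The prompt guideline above (subject + action + environment + style + lighting) can be sketched as a small shell helper. `build_prompt` is a hypothetical name introduced for illustration, and the example values are made up; only the five-part structure comes from the guideline.

```shell
# Hypothetical helper: join the five prompt components from the guideline
# into a single descriptive sentence fragment.
# $1 subject, $2 action, $3 environment, $4 style, $5 lighting
build_prompt() {
  printf '%s %s, %s, %s, %s\n' "$1" "$2" "$3" "$4" "$5"
}

prompt=$(build_prompt \
  "a red fox" \
  "leaping over a frozen stream" \
  "snowy birch forest at dawn" \
  "35mm film photograph" \
  "soft golden backlight")

printf '%s\n' "$prompt"
```

The resulting string can then be passed as the positional argument, for example `rawgenai <provider> image "$prompt" -o output.png`.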

Speak (TTS)

rawgenai <provider> tts "<text>" --speak

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| OpenAI | `rawgenai openai tts` | General purpose, English | ref |
| Google Gemini | `rawgenai google tts` | Expressive storytelling, multi-speaker | ref |
| ElevenLabs | `rawgenai elevenlabs tts` | Most natural voices, 70+ languages | ref |
| Seed | `rawgenai seed tts` | Chinese, emotion-rich | ref |
| DashScope | `rawgenai dashscope tts` | Chinese, 10 languages, 49 voices | ref |
| MiniMax | `rawgenai minimax tts` | Chinese, streaming | ref |
| Kling | `rawgenai kling tts` | Bilingual zh/en | ref |
| Runway | `rawgenai runway audio tts` | Async | |
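
Because TTS input should be natural text rather than markdown, it can help to strip markup before speaking. The following is a minimal sketch, assuming a POSIX shell and a `sed` that supports `-E`. `plain_text` is a hypothetical helper, and it only handles headings, list bullets, emphasis, and inline code.

```shell
# Hypothetical helper: strip common markdown markers so the text reads
# naturally when spoken. Reads stdin, writes stdout.
plain_text() {
  sed -E \
    -e 's/^#+[[:space:]]+//' \
    -e 's/^[[:space:]]*[-*+][[:space:]]+//' \
    -e 's/\*\*([^*]+)\*\*/\1/g' \
    -e 's/\*([^*]+)\*/\1/g' \
    -e 's/`([^`]+)`/\1/g'
}

# Demo: a heading and a bullet with bold text and inline code.
printf '%s\n' '## Status' '- The **build** passed in `main`.' | plain_text
# Prints:
#   Status
#   The build passed in main.
```

Since stdin is one of the supported input sources, the cleaned text could then be piped on, for example `plain_text < notes.md | rawgenai <provider> tts --speak`, where `notes.md` is a placeholder file name.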

Listen (STT)

rawgenai <provider> stt <audio-file>

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| OpenAI | `rawgenai openai stt` | Subtitles (srt/vtt) | ref |
| Google Gemini | `rawgenai google stt` | Speaker diarization | ref |
| ElevenLabs | `rawgenai elevenlabs stt` | Large files (3GB), video input | ref |
| DashScope | `rawgenai dashscope stt` | Chinese, emotion, long audio (12h async) | ref |

Image

rawgenai <provider> image "<prompt>" -o output.png

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| OpenAI | `rawgenai openai image` | Transparent bg, editing, multi-turn | ref |
| Google Gemini | `rawgenai google image` | 4K, text in image | ref |
| Grok | `rawgenai grok image` | Batch (up to 10) | ref |
| Seed | `rawgenai seed image` | 4K, multi-image fusion | ref |
| DashScope | `rawgenai dashscope image` | Text rendering, Chinese | ref |
| MiniMax | `rawgenai minimax image` | Subject reference | ref |
| Kling | `rawgenai kling image` | Face reference (async) | ref |
| Luma | `rawgenai luma image` | Creative, reframe (async) | |
| Hunyuan | `rawgenai hunyuan image` | Chinese (async) | |
| Runway | `rawgenai runway image` | Cinematic (async) | |

Video

rawgenai <provider> video create "<prompt>" [flags] -> status <id> -> download <id> -o out.mp4

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| OpenAI (Sora) | `rawgenai openai video` | Remix | ref |
| Google (Veo) | `rawgenai google video` | 4K, extension | ref |
| Grok | `rawgenai grok video` | Quick, editing | ref |
| Seed | `rawgenai seed video` | Audio, wide ratios | ref |
| DashScope | `rawgenai dashscope video` | Character ref, multi-shot | ref |
| MiniMax (Hailuo) | `rawgenai minimax video` | Subject ref, director modes | ref |
| Kling | `rawgenai kling video` | Most advanced, element system | ref |
| Luma | `rawgenai luma video` | Extension, upscale | |
| Hunyuan | `rawgenai hunyuan video` | Chinese | |
| Runway | `rawgenai runway video` | Cinematic, character ref | |
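
The async flow (create, then status, then download) implies a polling step between create and download. The sketch below shows one way to write that loop in POSIX shell. Everything about the JSON shape is an assumption: the field names `id` and `status` and the terminal values `completed` and `failed` are hypothetical, so check the chosen provider's reference file for the real ones. `json_field` and `wait_until_done` are illustrative helpers, not rawgenai commands.

```shell
# Hypothetical helper: extract a top-level string field from one-line JSON
# without jq. Only suitable for simple, unescaped values.
json_field() {
  sed -n -E "s/.*\"$1\"[[:space:]]*:[[:space:]]*\"([^\"]*)\".*/\1/p"
}

# Poll a status command until it reports a terminal state.
# Usage: wait_until_done <max_attempts> <delay_seconds> <command...>
# ASSUMPTION: the command prints JSON with a "status" field whose terminal
# values are "completed" and "failed".
# Returns 0 on completed, 1 on failed, 2 if still pending after max attempts.
wait_until_done() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    state=$("$@" | json_field status)
    case "$state" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 2
}

# Intended use (commented out; assumes `status` and `download` are
# subcommands of `video`, and that create returns an "id" field):
#   id=$(rawgenai <provider> video create "$prompt" | json_field id)
#   wait_until_done 60 10 rawgenai <provider> video status "$id" &&
#     rawgenai <provider> video download "$id" -o out.mp4
```

Bounding the number of attempts keeps an agent from waiting indefinitely on a stuck job; on a non-zero return it can fall back to another provider or inform the user, as the general guidelines suggest.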

Music

| Provider | Command | Best For | Reference |
| --- | --- | --- | --- |
| ElevenLabs | `rawgenai elevenlabs music` | Prompt-based, composition plans | ref |
| MiniMax | `rawgenai minimax music create` | Lyrics-to-music, Chinese | ref |

Sound Effects (SFX)

| Provider | Command | Reference |
| --- | --- | --- |
| ElevenLabs | `rawgenai elevenlabs sfx "<prompt>" -o out.mp3` | ref |
| Runway | `rawgenai runway audio sfx "<prompt>"` | |

Dialogue

Generate multi-speaker dialogue from a JSON script (max 10 voices).

| Provider | Command | Reference |
| --- | --- | --- |
| ElevenLabs | `rawgenai elevenlabs dialogue -i script.json -o out.mp3` | ref |

Voice Management

Design, clone, and manage custom voices.

| Provider | Command | Capabilities | Reference |
| --- | --- | --- | --- |
| ElevenLabs | `rawgenai elevenlabs voice` | list, design, create, preview | ref |
| Kling | `rawgenai kling voice` | create, status, list, delete | ref |
| MiniMax | `rawgenai minimax voice` | list, upload, clone, design, delete | ref |
| Seed | `rawgenai seed voice-clone` | upload, status, order, renew | ref |

Audio Processing

Async: rawgenai runway audio <action> -> status <id> -> download <id> -o out

| Provider | Command | Capability |
| --- | --- | --- |
| Runway | `rawgenai runway audio sts` | Speech-to-speech (voice conversion) |
| Runway | `rawgenai runway audio dubbing` | Dub audio to another language |
| Runway | `rawgenai runway audio isolation` | Isolate voice from background |

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
