novita-multimodal

Run multimodal tasks with Novita AI: text-to-image, image-to-image (image editing), text-to-video, image-to-video, text-to-speech (TTS), and speech-to-text (STT). Use it to generate images, generate videos, synthesize speech, and transcribe audio.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "novita-multimodal" with this command: npx skills add ximasadila/novita-multimodal

Novita AI Multimodal Execution

Configuration (choose one; the lookup order is given under API Key Reading Logic)

Method 1: Config File (Recommended)

Create file ~/.novita/config.json:

{
  "api_key": "YOUR_API_KEY"
}

One-command setup:

mkdir -p ~/.novita && echo '{"api_key": "YOUR_API_KEY"}' > ~/.novita/config.json

Method 2: Environment Variable

export NOVITA_API_KEY="YOUR_API_KEY"

Method 3: Direct Parameter

Include the key in the request itself, for example: Please use API Key sk_xxx to generate an image...


API Key Reading Logic

1. Check whether the user message contains an API key (starts with sk_)
2. Check the config file ~/.novita/config.json
3. Check the environment variable NOVITA_API_KEY
4. None found → Return the configuration guide

Configuration guide (only shown when not configured):

You have not configured your Novita AI API Key.

Quick setup (copy and run):
mkdir -p ~/.novita && echo '{"api_key": "YOUR_API_KEY"}' > ~/.novita/config.json

Get Key: https://novita.ai/settings/key-management
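
The lookup order above can be sketched as a small shell function. This is a minimal sketch, not part of the skill: the function name resolve_novita_key is hypothetical, the set of characters allowed after sk_ is an assumption (the source only says the key starts with sk_), and the config file is read with a naive sed pattern rather than a real JSON parser.

```shell
# Hypothetical helper (not part of the skill): print the first API key found,
# in the documented order: user message, config file, environment variable.
resolve_novita_key() (
  message="$1"
  config_file="$HOME/.novita/config.json"

  # 1. First whitespace-separated token in the message that starts with sk_.
  #    Assumption: keys contain only letters, digits, "_" and "-".
  key=$(printf '%s\n' "$message" | tr -s '[:space:]' '\n' |
    sed -n 's/^\(sk_[A-Za-z0-9_-][A-Za-z0-9_-]*\).*/\1/p' | head -n 1)

  # 2. The "api_key" field of the config file (naive one-line extraction).
  if [ -z "$key" ] && [ -f "$config_file" ]; then
    key=$(sed -n 's/.*"api_key"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
      "$config_file" | head -n 1)
  fi

  # 3. The NOVITA_API_KEY environment variable.
  if [ -z "$key" ]; then
    key="${NOVITA_API_KEY:-}"
  fi

  # 4. Nothing found: fail so the caller can show the configuration guide.
  if [ -z "$key" ]; then
    exit 1
  fi
  printf '%s\n' "$key"
)
```

A caller might use it as `API_KEY=$(resolve_novita_key "$user_message") || echo "show the configuration guide"`, where user_message is a hypothetical variable holding the request text.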

Execution Flow (Important!)

User request → Identify task → Get Key → ⚠️ Send prompt first → Execute task → Return result

⚠️ Must Send Progress Prompt First

Before calling the API, you must reply to the user with a message:

🎨 Got it! Generating your image...

Task type: Text-to-Image
Model: Seedream 5.0 Lite
Estimated time: 5-15 seconds
Estimated cost: ~$0.035

Please wait; I'll send it as soon as it's ready ⏳

This message must be sent BEFORE executing the API call! This way users know the task is being processed and won't think the system is stuck.

Progress Templates for Different Tasks

Text-to-Image:

🎨 Got it! Generating your image...
Model: Seedream 5.0 Lite
Estimated time: 5-15 seconds

Text-to-Video:

🎬 Got it! Generating your video...
Model: Vidu Q3 Pro
Estimated time: 1-3 minutes (video generation is slower; please be patient)

TTS:

🔊 Got it! Generating your audio...
Model: MiniMax Speech 2.8 Turbo
Estimated time: 5-15 seconds

Completion Response

✅ Generation complete!

[Image/Video/Audio URL]

Actual cost: $0.035

Video Task Polling Updates

Video generation requires polling. While polling, send the user a status update every 15 seconds:

🎬 Video generating...
Current status: Processing
Elapsed: 30 seconds
Estimated remaining: 1-2 minutes

API Configuration

| Setting | Value |
|---|---|
| Base URL | https://api.novita.ai |
| Auth | Authorization: Bearer <API_KEY> |
| Get Key | https://novita.ai/settings/key-management |

Task Types and Endpoints

| Task | Endpoint | Model |
|---|---|---|
| Text-to-Image | /v3/seedream-5.0-lite | Seedream 5.0 Lite |
| Image Editing | /v3/seedream-5.0-lite | Seedream 5.0 Lite |
| Text-to-Video | /v3/async/vidu-q3-pro-t2v | Vidu Q3 Pro |
| Image-to-Video | /v3/async/vidu-q3-pro-i2v | Vidu Q3 Pro |
| TTS | /v3/async/minimax-speech-2.8-turbo | MiniMax Speech 2.8 Turbo |
| STT | /v3/glm-asr | GLM ASR |
| Task Query | /v3/async/task-result?task_id=xxx | - |
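
The endpoint table above can be wrapped in a lookup so that each request is built from a task name. The sketch below is illustrative only: the function names novita_endpoint and novita_post and the task names passed to them (text-to-image, tts, and so on) are hypothetical labels, and API_KEY is assumed to already hold a resolved key.

```shell
NOVITA_BASE_URL="https://api.novita.ai"

# Hypothetical lookup: map a task name to its endpoint path from the table.
novita_endpoint() {
  case "$1" in
    text-to-image|image-editing) echo "/v3/seedream-5.0-lite" ;;
    text-to-video)  echo "/v3/async/vidu-q3-pro-t2v" ;;
    image-to-video) echo "/v3/async/vidu-q3-pro-i2v" ;;
    tts)            echo "/v3/async/minimax-speech-2.8-turbo" ;;
    stt)            echo "/v3/glm-asr" ;;
    *) echo "unknown task: $1" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper: POST a JSON body to the endpoint for a task.
# Usage: novita_post <task> <json-body>   (expects API_KEY to be set)
novita_post() {
  endpoint=$(novita_endpoint "$1") || return 1
  curl -sS -X POST "${NOVITA_BASE_URL}${endpoint}" \
    -H "Authorization: Bearer ${API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$2"
}
```

For example, `novita_post text-to-video '{"prompt": "description", "duration": 4}'` would send the same request as the Text-to-Video template.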

Execution Templates

Text-to-Image

curl -X POST "https://api.novita.ai/v3/seedream-5.0-lite" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "description"}'
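
When the prompt comes from user text, quotes or backslashes in it would break the JSON body in the template above. A minimal sketch of one way to build the body safely, assuming single-line prompts: the helper names json_string and text_to_image_body are hypothetical, newlines and tabs are flattened to spaces, and other control characters are not handled.

```shell
# Hypothetical helper: wrap text in double quotes as a JSON string.
# Escapes backslashes and double quotes; newlines and tabs become spaces.
json_string() {
  escaped=$(printf '%s' "$1" | tr '\n\t' '  ' |
    sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '"%s"' "$escaped"
}

# Hypothetical helper: build the text-to-image request body for a prompt.
text_to_image_body() {
  printf '{"prompt": %s}\n' "$(json_string "$1")"
}
```

The result can then be passed to curl with `-d "$(text_to_image_body "$prompt")"`, where prompt is a hypothetical variable holding the user's description.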

Image Editing

curl -X POST "https://api.novita.ai/v3/seedream-5.0-lite" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "edit instruction", "reference_images": ["image_url"]}'

Text-to-Video

curl -X POST "https://api.novita.ai/v3/async/vidu-q3-pro-t2v" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "description", "duration": 4}'

Image-to-Video

curl -X POST "https://api.novita.ai/v3/async/vidu-q3-pro-i2v" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "motion description", "images": ["image_url"]}'

TTS

curl -X POST "https://api.novita.ai/v3/async/minimax-speech-2.8-turbo" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "text to convert",
    "voice_setting": {"voice_id": "male-qn-qingse", "speed": 1.0},
    "audio_setting": {"format": "mp3"}
  }'

Available voices:

  • Male: male-qn-qingse, male-qn-jingying
  • Female: female-shaonv, female-yujie

STT

curl -X POST "https://api.novita.ai/v3/glm-asr" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"file": "audio_url_or_base64"}'

Task Result Query

curl "https://api.novita.ai/v3/async/task-result?task_id=$TASK_ID" \
  -H "Authorization: Bearer $API_KEY"

Status: TASK_STATUS_QUEUED → TASK_STATUS_PROCESSING → TASK_STATUS_SUCCEED
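
Polling with the query above can be sketched as a loop that waits between checks (15 seconds by default, matching the update interval for video tasks). This is a sketch under stated assumptions: the function names are hypothetical, the response schema is not given in this document, so the code only looks for a status name appearing verbatim in the response body, and any status other than the three listed is treated as a failure.

```shell
# Hypothetical fetcher: GET the task result JSON (expects API_KEY to be set).
novita_task_result() {
  curl -sS "https://api.novita.ai/v3/async/task-result?task_id=$1" \
    -H "Authorization: Bearer ${API_KEY}"
}

# Hypothetical poller: print the final response body once the task succeeds.
# Usage: poll_novita_task <task_id> [interval_s] [max_attempts] [fetch_command]
poll_novita_task() {
  task_id="$1"
  interval="${2:-15}"
  max_attempts="${3:-20}"
  fetch="${4:-novita_task_result}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    body=$("$fetch" "$task_id") || return 1
    # Assumption: the status name appears verbatim in the response body.
    status=$(printf '%s\n' "$body" |
      sed -n 's/.*\(TASK_STATUS_[A-Z_]*\).*/\1/p' | head -n 1)
    case "$status" in
      TASK_STATUS_SUCCEED)
        printf '%s\n' "$body"
        return 0 ;;
      TASK_STATUS_QUEUED|TASK_STATUS_PROCESSING)
        echo "Current status: ${status#TASK_STATUS_} (check $attempt)" >&2 ;;
      *)
        echo "Unexpected status: ${status:-none}" >&2
        return 1 ;;
    esac
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "Gave up after $max_attempts checks" >&2
  return 1
}
```

The fetch command is a parameter so the loop can be exercised with a stub instead of a live request; `poll_novita_task "$TASK_ID"` uses the real query with the defaults.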


Error Handling

| Code | Meaning | Action |
|---|---|---|
| 401 | Invalid Key | Check configuration |
| 402 | Insufficient balance | Top up at https://novita.ai/billing |
| 429 | Rate limited | Wait and retry |
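
The error table can be turned into a small dispatcher plus a retry loop for the 429 case. This is a sketch only: novita_error_action and retry_on_429 are hypothetical names, and the doubling wait between retries is an assumed policy, since the source says only "wait and retry".

```shell
# Hypothetical helper: translate an HTTP status code into the documented action.
novita_error_action() {
  case "$1" in
    2??) echo "ok" ;;
    401) echo "Invalid key: check configuration" ;;
    402) echo "Insufficient balance: top up at https://novita.ai/billing" ;;
    429) echo "Rate limited: wait and retry" ;;
    *)   echo "Unhandled HTTP status: $1" ;;
  esac
}

# Hypothetical retry wrapper: rerun a command that prints an HTTP status code,
# doubling the wait after each 429. Prints the last status code seen.
# Usage: retry_on_429 <max_tries> <initial_delay_s> <command> [args...]
retry_on_429() {
  max_tries="$1"
  delay="$2"
  shift 2
  try=1
  while :; do
    code=$("$@") || return 1
    if [ "$code" != "429" ]; then
      echo "$code"
      return 0
    fi
    if [ "$try" -ge "$max_tries" ]; then
      echo "$code"
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    try=$((try + 1))
  done
}
```

curl can supply the status code through its write-out option, for example `code=$(curl -sS -o response.json -w '%{http_code}' ...)`, after which `novita_error_action "$code"` gives the action to report.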

Pricing

https://novita.ai/pricing

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
