turbo

Turbo

Direct code generation via hosted LLM. Claude writes the contract, Cerebras implements the code, files are written directly to disk.

Announce: "I'm using speed-run:turbo for hosted code generation."

When to Use

Use turbo for:

Algorithmic code (rate limiters, parsers, state machines)
Multiple files (3+)
Boilerplate-heavy implementations
Token-constrained sessions

Use Claude direct instead for:

CRUD/storage operations (Claude is cheaper due to no fix overhead)
Single implementation with complex coordination
Speed-critical tasks where fix cycles are costly

Tradeoffs

Aspect Claude Direct Turbo (Hosted LLM)

Speed ~10s ~0.5s

Token Cost Higher ~90% savings

First-pass Quality ~100% 80-95%

Fixes Needed 0 0-2 typical

Workflow

Step 1: Write Contract Prompt

Structure your prompt with exact specifications:

Build [X] with [tech stack].

DATA CONTRACT (use exactly these models):

[Pydantic models / interfaces with exact field names and types]

Example: class Task(BaseModel): id: str title: str completed: bool = False created_at: datetime

class TaskCreate(BaseModel): title: str

API CONTRACT (use exactly these routes):

POST /tasks -> Task # Create task GET /tasks -> list[Task] # List all tasks GET /tasks/{id} -> Task # Get single task DELETE /tasks/{id} -> dict # Delete task POST /reset -> dict # Reset state (for testing)

ALGORITHM:

[Step-by-step logic for the implementation]
[Include state management details]
[Include edge case handling]

RULES:

Use FastAPI with uvicorn
Store data in [storage mechanism]
Return 404 for missing resources
POST /reset must clear all state and return {"status": "ok"}

Step 2: Generate Code

mcp__speed-run__generate_and_write_files prompt: [contract prompt] output_dir: [target directory]

Returns only metadata (files written, line counts). Claude never sees the generated code.

Step 3: Run Tests

Run the test suite against generated code.

Step 4: Fix (if needed)

For failures, use Claude Edit tool for surgical fixes (typically 1-4 lines each).

Common fixes:

Error Type Frequency Fix Complexity

Missing utility functions Occasional 4 lines

Logic edge cases Occasional 1-2 lines

Import ordering Rare 1 line

Step 5: Re-test

Repeat Steps 3-4 until all tests pass. Even with fixes, total token cost is much lower than Claude generating everything.

What Hosted LLM Gets Right (~90%)

Data models match contract exactly
Routes/endpoints correct
Core algorithm logic
Basic error handling

Configuration

Variable Default Description

CEREBRAS_API_KEY

(required) Your API key

CEREBRAS_MODEL

gpt-oss-120b

Model to use

Available models:

Model Price (in/out) Speed Notes

gpt-oss-120b

$0.35/$0.75 3000 t/s Default - best value, clean output

llama-3.3-70b

$0.85/$1.20 2100 t/s Reliable fallback

qwen-3-32b

$0.40/$0.80 2600 t/s Has verbose <think> tags

llama3.1-8b

$0.10/$0.10 2200 t/s Cheapest, may need more fixes

Safety Notice

Copy this and send it to your AI assistant to learn

DATA CONTRACT (use exactly these models):

API CONTRACT (use exactly these routes):

ALGORITHM:

RULES:

Source Transparency

Related Skills

firebase-development:add-feature

css-development:validate

firebase-development:project-setup

css-development:refactor