Claude API
Build applications with the Anthropic Claude API and SDKs.
When to Activate
-
Building applications that call the Claude API
-
Code imports anthropic (Python) or @anthropic-ai/sdk (TypeScript)
-
User asks about Claude API patterns, tool use, streaming, or vision
-
Implementing agent workflows with Claude Agent SDK
-
Optimizing API costs, token usage, or latency
Model Selection
Model ID Best For
Opus 4.1 claude-opus-4-1
Complex reasoning, architecture, research
Sonnet 4 claude-sonnet-4-0
Balanced coding, most development tasks
Haiku 3.5 claude-3-5-haiku-latest
Fast responses, high-volume, cost-sensitive
Default to Sonnet 4 unless the task requires deep reasoning (Opus) or speed/cost optimization (Haiku). For production, prefer pinned snapshot IDs over aliases.
Python SDK
Installation
pip install anthropic
Basic Message
import anthropic
client = anthropic.Anthropic() # reads ANTHROPIC_API_KEY from env
message = client.messages.create( model="claude-sonnet-4-0", max_tokens=1024, messages=[ {"role": "user", "content": "Explain async/await in Python"} ] ) print(message.content[0].text)
Streaming
with client.messages.stream( model="claude-sonnet-4-0", max_tokens=1024, messages=[{"role": "user", "content": "Write a haiku about coding"}] ) as stream: for text in stream.text_stream: print(text, end="", flush=True)
System Prompt
message = client.messages.create( model="claude-sonnet-4-0", max_tokens=1024, system="You are a senior Python developer. Be concise.", messages=[{"role": "user", "content": "Review this function"}] )
TypeScript SDK
Installation
npm install @anthropic-ai/sdk
Basic Message
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic(); // reads ANTHROPIC_API_KEY from env
const message = await client.messages.create({ model: "claude-sonnet-4-0", max_tokens: 1024, messages: [ { role: "user", content: "Explain async/await in TypeScript" } ], }); console.log(message.content[0].text);
Streaming
const stream = client.messages.stream({ model: "claude-sonnet-4-0", max_tokens: 1024, messages: [{ role: "user", content: "Write a haiku" }], });
for await (const event of stream) { if (event.type === "content_block_delta" && event.delta.type === "text_delta") { process.stdout.write(event.delta.text); } }
Tool Use
Define tools and let Claude call them:
tools = [ { "name": "get_weather", "description": "Get current weather for a location", "input_schema": { "type": "object", "properties": { "location": {"type": "string", "description": "City name"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]} }, "required": ["location"] } } ]
message = client.messages.create( model="claude-sonnet-4-0", max_tokens=1024, tools=tools, messages=[{"role": "user", "content": "What's the weather in SF?"}] )
Handle tool use response
for block in message.content: if block.type == "tool_use": # Execute the tool with block.input result = get_weather(**block.input) # Send result back follow_up = client.messages.create( model="claude-sonnet-4-0", max_tokens=1024, tools=tools, messages=[ {"role": "user", "content": "What's the weather in SF?"}, {"role": "assistant", "content": message.content}, {"role": "user", "content": [ {"type": "tool_result", "tool_use_id": block.id, "content": str(result)} ]} ] )
Vision
Send images for analysis:
import base64
with open("diagram.png", "rb") as f: image_data = base64.standard_b64encode(f.read()).decode("utf-8")
message = client.messages.create( model="claude-sonnet-4-0", max_tokens=1024, messages=[{ "role": "user", "content": [ {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": image_data}}, {"type": "text", "text": "Describe this diagram"} ] }] )
Extended Thinking
For complex reasoning tasks:
message = client.messages.create( model="claude-sonnet-4-0", max_tokens=16000, thinking={ "type": "enabled", "budget_tokens": 10000 }, messages=[{"role": "user", "content": "Solve this math problem step by step..."}] )
for block in message.content: if block.type == "thinking": print(f"Thinking: {block.thinking}") elif block.type == "text": print(f"Answer: {block.text}")
Prompt Caching
Cache large system prompts or context to reduce costs:
message = client.messages.create( model="claude-sonnet-4-0", max_tokens=1024, system=[ {"type": "text", "text": large_system_prompt, "cache_control": {"type": "ephemeral"}} ], messages=[{"role": "user", "content": "Question about the cached context"}] )
Check cache usage
print(f"Cache read: {message.usage.cache_read_input_tokens}") print(f"Cache creation: {message.usage.cache_creation_input_tokens}")
Batches API
Process large volumes asynchronously at 50% cost reduction:
import time
batch = client.messages.batches.create( requests=[ { "custom_id": f"request-{i}", "params": { "model": "claude-sonnet-4-0", "max_tokens": 1024, "messages": [{"role": "user", "content": prompt}] } } for i, prompt in enumerate(prompts) ] )
Poll for completion
while True: status = client.messages.batches.retrieve(batch.id) if status.processing_status == "ended": break time.sleep(30)
Get results
for result in client.messages.batches.results(batch.id): print(result.result.message.content[0].text)
Claude Agent SDK
Build multi-step agents:
Note: Agent SDK API surface may change — check official docs
import anthropic
Define tools as functions
tools = [{ "name": "search_codebase", "description": "Search the codebase for relevant code", "input_schema": { "type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"] } }]
Run an agentic loop with tool use
client = anthropic.Anthropic() messages = [{"role": "user", "content": "Review the auth module for security issues"}]
while True: response = client.messages.create( model="claude-sonnet-4-0", max_tokens=4096, tools=tools, messages=messages, ) if response.stop_reason == "end_turn": break # Handle tool calls and continue the loop messages.append({"role": "assistant", "content": response.content}) # ... execute tools and append tool_result messages
Cost Optimization
Strategy Savings When to Use
Prompt caching Up to 90% on cached tokens Repeated system prompts or context
Batches API 50% Non-time-sensitive bulk processing
Haiku instead of Sonnet ~75% Simple tasks, classification, extraction
Shorter max_tokens Variable When you know output will be short
Streaming None (same cost) Better UX, same price
Error Handling
import time
from anthropic import APIError, RateLimitError, APIConnectionError
try: message = client.messages.create(...) except RateLimitError: # Back off and retry time.sleep(60) except APIConnectionError: # Network issue, retry with backoff pass except APIError as e: print(f"API error {e.status_code}: {e.message}")
Environment Setup
Required
export ANTHROPIC_API_KEY="your-api-key-here"
Optional: set default model
export ANTHROPIC_MODEL="claude-sonnet-4-0"
Never hardcode API keys. Always use environment variables.