Streaming LLM Responses

Build responsive, real-time chat interfaces with streaming feedback.

Quick Start

import { useChatKit } from "@openai/chatkit-react";

const chatkit = useChatKit({
  api: { url: API_URL, domainKey: DOMAIN_KEY },

  onResponseStart: () => setIsResponding(true),
  onResponseEnd: () => setIsResponding(false),

  onEffect: ({ name, data }) => {
    if (name === "update_status") updateUI(data);
  },
});

Response Lifecycle

User sends message
    ↓
onResponseStart() fires
    ↓
[Streaming: tokens arrive, ProgressUpdateEvents shown]
    ↓
onResponseEnd() fires
    ↓
UI unlocks, ready for next interaction

Core Patterns

Response Lifecycle Handlers

Lock the UI while the AI is responding so that user actions cannot race with the stream:

import { useState } from "react";
import { ChatKit, useChatKit } from "@openai/chatkit-react";

// useAppStore, LoadingOverlay, API_URL, and DOMAIN_KEY are app-specific
// and assumed to be defined elsewhere.
function ChatWithLifecycle() {
  const [isResponding, setIsResponding] = useState(false);
  const lockInteraction = useAppStore((s) => s.lockInteraction);
  const unlockInteraction = useAppStore((s) => s.unlockInteraction);

  const chatkit = useChatKit({
    api: { url: API_URL, domainKey: DOMAIN_KEY },

    onResponseStart: () => {
      setIsResponding(true);
      lockInteraction(); // Disable map/canvas/form interactions
    },

    onResponseEnd: () => {
      setIsResponding(false);
      unlockInteraction();
    },

    // Unlock on errors too, otherwise the UI stays frozen
    onError: ({ error }) => {
      console.error("ChatKit error:", error);
      setIsResponding(false);
      unlockInteraction();
    },
  });

  return (
    <div>
      {isResponding && <LoadingOverlay />}
      <ChatKit control={chatkit.control} />
    </div>
  );
}
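If both onError and onResponseEnd can fire for the same response, the unlock path runs twice. One way to keep that harmless is an idempotent lock. The following is a minimal sketch, not part of ChatKit: `createInteractionLock` and its method names are hypothetical.

```typescript
// Hypothetical helper (not part of ChatKit): an idempotent interaction lock.
type LockListener = (locked: boolean) => void;

interface InteractionLock {
  lock(): void;
  unlock(): void;
  isLocked(): boolean;
}

function createInteractionLock(onChange: LockListener): InteractionLock {
  let locked = false;
  return {
    // Intended for onResponseStart.
    lock() {
      if (locked) return; // already locked: do not notify twice
      locked = true;
      onChange(true);
    },
    // Intended for both onResponseEnd and onError; a second call is a no-op.
    unlock() {
      if (!locked) return;
      locked = false;
      onChange(false);
    },
    isLocked() {
      return locked;
    },
  };
}
```

The listener would typically forward to whatever the app uses to enable or disable interaction, so the handlers only ever call `lock()` and `unlock()`.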

Client Effects (Fire-and-Forget)

The server sends effects to update the client UI without expecting a response:

Backend - Streaming Effects:

from chatkit.types import ClientEffectEvent

async def respond(self, thread, item, context):
    # ... agent processing ...

    # Fire client effect to update UI
    yield ClientEffectEvent(
        name="update_status",
        data={
            "state": {"energy": 80, "happiness": 90},
            "flash": "Status updated!"
        }
    )

    # Another effect
    yield ClientEffectEvent(
        name="show_notification",
        data={"message": "Task completed!"}
    )

Frontend - Handling Effects:

const chatkit = useChatKit({
  api: { url: API_URL, domainKey: DOMAIN_KEY },

  onEffect: ({ name, data }) => {
    switch (name) {
      case "update_status":
        applyStatusUpdate(data.state);
        if (data.flash) setFlashMessage(data.flash);
        break;

      case "add_marker":
        addMapMarker(data);
        break;

      case "select_mode":
        setSelectionMode(data.mode);
        break;
    }
  },
});
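As the number of effects grows, the switch above can be replaced with a lookup table. The following is a minimal sketch under stated assumptions: `createEffectDispatcher` is a hypothetical helper, not a ChatKit API, and the payload is treated as a plain object.

```typescript
// Effect names and payload fields mirror the examples above; the helper
// itself (createEffectDispatcher) is hypothetical, not a ChatKit API.
type EffectData = Record<string, unknown>;
type EffectHandler = (data: EffectData) => void;

function createEffectDispatcher(handlers: Record<string, EffectHandler>) {
  // The returned function has the same shape as the onEffect callback.
  return (effect: { name: string; data?: EffectData }): boolean => {
    // Own-property check so names like "toString" do not hit the prototype.
    const known = Object.prototype.hasOwnProperty.call(handlers, effect.name);
    const handler = known ? handlers[effect.name] : undefined;
    if (!handler) {
      // Effects are fire-and-forget, so an unknown name is skipped, not thrown.
      return false;
    }
    handler(effect.data ?? {});
    return true;
  };
}
```

The returned function can be passed as `onEffect`. Its boolean result is only there to make logging of unhandled effects easy.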

Progress Updates

Show "Searching...", "Loading...", "Analyzing..." during long operations:

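from agents import RunContextWrapper, function_tool
from chatkit.agents import AgentContext
from chatkit.types import ProgressUpdateEvent

@function_tool
async def search_articles(ctx: RunContextWrapper[AgentContext], query: str) -> str:
    """Search for articles matching the query."""
    # A tool that returns a value cannot also yield, so progress events are
    # pushed through the context (assumes AgentContext provides stream()).
    await ctx.context.stream(ProgressUpdateEvent(message="Searching articles..."))

    results = await article_store.search(query)

    await ctx.context.stream(
        ProgressUpdateEvent(message=f"Found {len(results)} articles...")
    )

    for i, article in enumerate(results):
        if i % 5 == 0:  # Report every fifth article to avoid flooding the UI
            await ctx.context.stream(
                ProgressUpdateEvent(message=f"Processing article {i+1}/{len(results)}...")
            )

    return format_results(results)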

Thread Lifecycle Events

Track thread changes for persistence and UI updates:

const chatkit = useChatKit({
  api: { url: API_URL, domainKey: DOMAIN_KEY },

  onThreadChange: ({ threadId }) => {
    setThreadId(threadId);
    if (threadId) localStorage.setItem("lastThreadId", threadId);
    clearSelections();
  },

  onThreadLoadStart: ({ threadId }) => {
    setIsLoadingThread(true);
  },

  onThreadLoadEnd: ({ threadId }) => {
    setIsLoadingThread(false);
  },
});
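The persistence step in onThreadChange can be pulled into small helpers so it is testable without a browser. The following is a minimal sketch: `KeyValueStore`, `saveLastThread`, and `loadLastThread` are hypothetical names, and clearing the key on a null id is an assumption about the desired behavior, not something ChatKit prescribes.

```typescript
// Minimal storage surface; in a browser, window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const LAST_THREAD_KEY = "lastThreadId"; // same key as in the handler above

// Hypothetical helper for onThreadChange.
function saveLastThread(store: KeyValueStore, threadId: string | null): void {
  if (threadId) {
    store.setItem(LAST_THREAD_KEY, threadId);
  } else {
    // Assumption: a null id means a new, unsaved thread, so the stale id is dropped.
    store.removeItem(LAST_THREAD_KEY);
  }
}

// Hypothetical helper for app startup.
function loadLastThread(store: KeyValueStore): string | null {
  return store.getItem(LAST_THREAD_KEY);
}
```

In the handler above, `saveLastThread(localStorage, threadId)` would replace the inline `localStorage.setItem` call.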

Client Tools (State Query)

Use a client tool when the AI needs to read client-side state to make a decision:

Backend - Defining Client Tool:

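@function_tool(name_override="get_selected_items")
async def get_selected_items(ctx: RunContextWrapper[AgentContext]) -> None:
    """Get the items currently selected on the canvas.

    This is a CLIENT TOOL - executed in browser, result comes back.
    """
    # Assumes AgentContext provides stream(); a bare yield here would turn
    # the tool into an async generator.
    await ctx.context.stream(ProgressUpdateEvent(message="Reading selection..."))
    # Actual execution happens on the client, in the onClientTool handler.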

Frontend - Handling Client Tools:

const chatkit = useChatKit({
  api: { url: API_URL, domainKey: DOMAIN_KEY },

  onClientTool: ({ name, params }) => {
    switch (name) {
      case "get_selected_items":
        return { itemIds: selectedItemIds };

      case "get_current_viewport":
        return {
          center: mapRef.current.getCenter(),
          zoom: mapRef.current.getZoom(),
        };

      case "get_form_data":
        return { values: formRef.current.getValues() };

      default:
        throw new Error(`Unknown client tool: ${name}`);
    }
  },
});
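The same routing can be expressed as a table of tool handlers. The following is a minimal sketch: `createClientToolRouter` is a hypothetical helper, not a ChatKit API, and it assumes synchronous handlers that return plain objects.

```typescript
// Hypothetical router (not a ChatKit API) for the onClientTool callback.
type ToolParams = Record<string, unknown>;
type ToolResult = Record<string, unknown>;
type ClientToolHandler = (params: ToolParams) => ToolResult;

function createClientToolRouter(tools: Record<string, ClientToolHandler>) {
  return (call: { name: string; params?: ToolParams }): ToolResult => {
    const known = Object.prototype.hasOwnProperty.call(tools, call.name);
    const tool = known ? tools[call.name] : undefined;
    if (!tool) {
      // Unlike effects, the server is waiting for a result, so fail loudly.
      throw new Error(`Unknown client tool: ${call.name}`);
    }
    return tool(call.params ?? {});
  };
}
```

Throwing for unknown names matches the `default` branch above. Silently returning nothing would leave the model reasoning over missing state.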

Client Effects vs Client Tools

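| Type          | Direction                | Response Required    | Use Case                         |
|---------------|--------------------------|----------------------|----------------------------------|
| Client Effect | Server → Client          | No (fire-and-forget) | Update UI, show notifications    |
| Client Tool   | Server → Client → Server | Yes (return value)   | Get client state for AI decision |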

Common Patterns by Use Case

Interactive Map/Canvas

onResponseStart: () => lockCanvas(),
onResponseEnd: () => unlockCanvas(),
onEffect: ({ name, data }) => {
  if (name === "add_marker") addMarker(data);
  if (name === "pan_to") panTo(data.location);
},
onClientTool: ({ name }) => {
  if (name === "get_selection") return getSelectedItems();
},

Form-Based UI

onResponseStart: () => setFormDisabled(true),
onResponseEnd: () => setFormDisabled(false),
onClientTool: ({ name }) => {
  if (name === "get_form_values") return form.getValues();
},

Game/Simulation

onResponseStart: () => pauseSimulation(),
onResponseEnd: () => resumeSimulation(),
onEffect: ({ name, data }) => {
  if (name === "update_entity") updateEntity(data);
  if (name === "show_notification") showToast(data.message);
},

Thread Title Generation

Dynamically update thread title based on conversation:

from agents import Agent, Runner

class TitleAgent:
    async def generate_title(self, first_message: str) -> str:
        result = await Runner.run(
            Agent(
                name="TitleGenerator",
                instructions="Generate a 3-5 word title.",
                model="gpt-4o-mini",  # Fast model
            ),
            input=f"First message: {first_message}",
        )
        return result.final_output

# In ChatKitServer
async def respond(self, thread, item, context):
    # Only generate a title once, for the first message of an untitled thread
    if not thread.title and item:
        title = await self.title_agent.generate_title(item.content)
        thread.title = title
        await self.store.save_thread(thread, context)

Anti-Patterns

Not locking UI during response - Leads to race conditions
Blocking in effects - Effects should be fire-and-forget
Heavy computation in onEffect - Use requestAnimationFrame for DOM updates
Missing error handling - Always handle onError to unlock UI
Not persisting thread state - Use onThreadChange to save context
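
To keep onEffect cheap, incoming effects can be queued and applied once per frame. The following is a minimal sketch: `createBatcher` and the `Schedule` type are hypothetical, and the scheduler is injected so the same code works with requestAnimationFrame in a browser or a manual queue in tests.

```typescript
// Hypothetical helper: collects items and applies them once per scheduled flush.
type Schedule = (flush: () => void) => void;

function createBatcher<T>(apply: (items: T[]) => void, schedule: Schedule) {
  let pending: T[] = [];
  let scheduled = false;
  return (item: T): void => {
    pending.push(item);
    if (scheduled) return; // a flush is already queued
    scheduled = true;
    schedule(() => {
      const batch = pending;
      pending = [];
      scheduled = false;
      apply(batch);
    });
  };
}

// In a browser the scheduler could be:
//   (flush) => requestAnimationFrame(() => flush())
```

With this in place, onEffect only enqueues the payload, and the DOM work happens in `apply` for the whole batch.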

Verification

Run: python3 scripts/verify.py

Expected: ✓ streaming-llm-responses skill ready

If Verification Fails

Check: references/ folder has streaming-patterns.md
Stop and report if still failing

References

references/streaming-patterns.md - Complete streaming configuration
