langfuse

Instrument joelclaw LLM calls with Langfuse tracing. Covers the @langfuse/tracing SDK, observation hierarchy (spans, generations, tools, agents), propagateAttributes for userId/sessionId/tags, the pi-session extension (langfuse-cost), and the system-bus OTEL integration. Use when adding Langfuse traces, debugging missing/broken traces, checking cost data, or improving observability on any LLM surface.

Install skill "langfuse" with this command: npx skills add joelhooks/joelclaw/joelhooks-joelclaw-langfuse

Langfuse Observability

Langfuse is the LLM observability layer for joelclaw. Every LLM call produces a Langfuse trace with nested hierarchy, I/O, usage, cost, and attribution.

Architecture

joelclaw has two Langfuse integration points:

1. Pi-session extension (langfuse-cost)

  • Source: pi/extensions/langfuse-cost/index.ts (canonical, git-tracked in this repo)
  • Runtime: loaded as a pi extension from the same source tree
  • What it traces: Every gateway + interactive pi session LLM call
  • How: Hooks into pi session events (session_start, message_start, message_end, tool_call, tool_result, session_shutdown)
  • Dedup: globalThis.__langfuse_cost_loaded__ guard prevents duplicate extension instances
  • Optional dependency behavior: langfuse is lazily loaded (no top-level hard import). Missing module must disable telemetry, not crash extension import. Regression test: pi/extensions/langfuse-cost/index.test.ts
  • Runtime dependency location: because the extension is loaded from pi/extensions/ at repo root instead of a workspace package, the langfuse npm package must be available from the repo root package.json. If root install drift drops it, gateway/session telemetry is disabled and only the optional-dependency warning is logged.
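
The dedup guard and the optional-dependency behavior described above can be sketched together. This is a minimal illustration, not the extension's actual code: only the __langfuse_cost_loaded__ key comes from this document, and claimExtension, loadOptional, and activate are hypothetical names.

```typescript
// Hypothetical sketch; only the guard key name is taken from the skill text.
const GUARD_KEY = "__langfuse_cost_loaded__";

/** Returns true only for the first caller; later callers should skip setup. */
function claimExtension(scope: Record<string, unknown>): boolean {
  if (scope[GUARD_KEY] === true) return false;
  scope[GUARD_KEY] = true;
  return true;
}

/** Loads an optional module lazily; a failed import yields null instead of throwing. */
async function loadOptional<T>(importer: () => Promise<T>, label: string): Promise<T | null> {
  try {
    return await importer();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    console.warn(`optional dependency "${label}" unavailable, telemetry disabled: ${reason}`);
    return null;
  }
}

/** Combines both checks: duplicate instances bail out, missing module disables telemetry. */
async function activate(
  scope: Record<string, unknown>,
  importer: () => Promise<unknown>,
): Promise<"duplicate" | "disabled" | "enabled"> {
  if (!claimExtension(scope)) return "duplicate";
  const mod = await loadOptional(importer, "langfuse");
  return mod === null ? "disabled" : "enabled";
}
```

In a real extension the scope would presumably be globalThis and the importer something like `() => import("langfuse")`; both are passed in here so the sketch runs without the package installed.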

2. System-bus OTEL bridge (langfuse.ts)

  • Source: packages/system-bus/src/lib/langfuse.ts
  • What it traces: All Inngest function LLM calls (reflect, triage, email cleanup, docs ingest)
  • How: @langfuse/otel LangfuseSpanProcessor + @langfuse/tracing startObservation()
  • Produces: joelclaw.inference traces with generation children

Current Trace Hierarchy (pi-session)

The langfuse-cost extension produces a 4-level nested span hierarchy:

joelclaw.session (trace)
  └── session (span): entire session lifetime
        ├── turn-1 (span): user message → final assistant response
        │     ├── tool:bash (span): individual tool execution
        │     ├── tool:read (span)
        │     └── llm.call (generation): the LLM API call with usage/cost
        └── turn-2 (span)
              ├── tool:edit (span)
              ├── tool:bash (span)
              └── llm.call (generation)

What each level captures

| Level | Created on | Ended on | Contains |
| --- | --- | --- | --- |
| joelclaw.session trace | session_start | session_shutdown | userId, sessionId, tags, turn count |
| session span | session_start | session_shutdown | Channel, session type, turn count |
| turn-N span | message_start[user] | message_end[assistant] with text output | User input (clean), sourceChannel metadata |
| tool:name span | tool_call event | tool_result event | Tool input, output (truncated 500 chars) |
| llm.call generation | message_end[assistant] | immediate | Model, usage, cache tokens, cost, I/O |
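
The open/close rules per level can be modeled as a small in-memory tracker. This is a sketch under assumptions: TraceNode and SessionTracker are hypothetical names, and it does not call the Langfuse SDK. It only mirrors which pi event opens or closes which node, including the toolCallId matching and the 500-character output truncation.

```typescript
// Hypothetical in-memory model of the span lifecycle; not the extension's actual code.
interface TraceNode {
  name: string;
  type: "span" | "generation";
  ended: boolean;
  output: string;
  children: TraceNode[];
}

const MAX_TOOL_OUTPUT_CHARS = 500;

function makeNode(name: string, type: TraceNode["type"], ended = false): TraceNode {
  return { name, type, ended, output: "", children: [] };
}

class SessionTracker {
  readonly session: TraceNode = makeNode("session", "span");
  private turn: TraceNode | null = null;
  private turnCount = 0;
  private readonly openTools = new Map<string, TraceNode>();

  // message_start[user]: open the next turn span.
  messageStartUser(): void {
    this.turnCount += 1;
    this.turn = makeNode(`turn-${this.turnCount}`, "span");
    this.session.children.push(this.turn);
  }

  // tool_call: open a tool span keyed by toolCallId.
  toolCall(toolCallId: string, toolName: string): void {
    if (this.turn === null) return;
    const node = makeNode(`tool:${toolName}`, "span");
    this.turn.children.push(node);
    this.openTools.set(toolCallId, node);
  }

  // tool_result: close the matching span; false means no open span had that ID.
  toolResult(toolCallId: string, output: string): boolean {
    const node = this.openTools.get(toolCallId);
    if (node === undefined) return false;
    node.output = output.slice(0, MAX_TOOL_OUTPUT_CHARS);
    node.ended = true;
    this.openTools.delete(toolCallId);
    return true;
  }

  // message_end[assistant]: record a generation that ends immediately;
  // the turn closes only when the message carries text output.
  messageEndAssistant(hasTextOutput: boolean): void {
    if (this.turn === null) return;
    this.turn.children.push(makeNode("llm.call", "generation", true));
    if (hasTextOutput) {
      this.turn.ended = true;
      this.turn = null;
    }
  }

  // session_shutdown: close the session span.
  shutdown(): void {
    this.session.ended = true;
  }
}
```

A tool_result whose ID matches no open span returns false and leaves any open span untouched, which is the failure mode to expect if the toolCallId field is ever renamed.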

Channel header stripping

User messages from Telegram arrive with a ---\nChannel:...\n--- header. The extension:

  1. Strips the header from trace input (clean user text only)
  2. Parses known keys (channel, date, platform_capabilities) into sourceChannel metadata
  3. Skips multi-line values (e.g. formatting_guide)
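
The three steps above can be sketched as a pure function. The exact header layout is an assumption: this version expects one "Key: value" pair per line between the --- delimiters, with multi-line values continuing on indented lines. stripChannelHeader and KNOWN_KEYS are hypothetical names.

```typescript
// Hypothetical sketch; the header layout is assumed, not confirmed by the skill text.
const KNOWN_KEYS = new Set(["channel", "date", "platform_capabilities"]);

interface ParsedMessage {
  text: string;                          // clean user text for trace input
  sourceChannel: Record<string, string>; // parsed known header keys
}

function stripChannelHeader(raw: string): ParsedMessage {
  // Match a leading block delimited by "---" lines.
  const match = /^---\n([\s\S]*?)\n---\n?/.exec(raw);
  if (match === null) return { text: raw, sourceChannel: {} };

  const sourceChannel: Record<string, string> = {};
  for (const line of (match[1] ?? "").split("\n")) {
    const kv = /^([A-Za-z_]+):\s*(.*)$/.exec(line);
    if (kv === null) continue; // indented continuation of a multi-line value
    const key = (kv[1] ?? "").toLowerCase();
    const value = kv[2] ?? "";
    // Unknown keys and keys with no inline value (multi-line values) are skipped.
    if (KNOWN_KEYS.has(key) && value !== "") sourceChannel[key] = value;
  }
  return { text: raw.slice(match[0].length).trim(), sourceChannel };
}
```

Messages without a header pass through unchanged, so the function is safe to apply to every user message regardless of channel.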

Credentials

Langfuse creds in agent-secrets:

  • langfuse_public_key: pk-lf-cb8b...
  • langfuse_secret_key: sk-lf-c86f...
  • langfuse_base_url: https://us.cloud.langfuse.com

The gateway gets them via env exports in gateway-start.sh. System-bus reads them from env first and falls back to a secrets lease.

Trace Conventions

Naming

  • Pi-session: joelclaw.session (trace) → session → turn-N → tool:name, llm.call
  • System-bus: joelclaw.inference (trace) → generation children

Required Attributes

Every trace MUST have:

  • userId: "joel"
  • sessionId — pi session ID for grouping
  • tags — minimum: ["joelclaw", "pi-session"]
  • Dynamic tags: provider:anthropic, model:anthropic/claude-opus-4-6, channel:central, session:central
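
A minimal sketch of assembling the required attributes, assuming the values are already known when the trace is created. TraceContext and buildTraceAttributes are hypothetical names; the tag formats follow the list above.

```typescript
// Hypothetical helper; the real extension may assemble these attributes differently.
interface TraceContext {
  sessionId: string;
  provider: string;
  model: string;
  channel: string;
  sessionType: string;
}

const BASE_TAGS = ["joelclaw", "pi-session"] as const;

function buildTraceAttributes(ctx: TraceContext): { userId: string; sessionId: string; tags: string[] } {
  return {
    userId: "joel",
    sessionId: ctx.sessionId,
    tags: [
      ...BASE_TAGS, // minimum tags required on every trace
      `provider:${ctx.provider}`,
      `model:${ctx.model}`,
      `channel:${ctx.channel}`,
      `session:${ctx.sessionType}`,
    ],
  };
}
```

Building the tags in one place keeps the provider:* and model:* tags in step with whatever values are written to metadata.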

Metadata Shape (flat, filterable)

{
  channel: "central",           // GATEWAY_ROLE env
  sessionType: "central",       // "gateway" | "interactive" | "codex" | "central"
  component: "pi-session",
  model: "anthropic/claude-opus-4-6",
  provider: "anthropic",
  stopReason: "toolUse",        // or "endTurn"
  turnCount: 5,                 // Updated on each turn
  sourceChannel: {              // Only on first user message per turn
    channel: "telegram",
    date: "...",
    platform_capabilities: "..."
  },
  tools: ["bash", "read"],      // Tool names used this turn
}

Generation usageDetails

{
  input: 1,                      // Non-cached input tokens
  output: 97,                    // Output tokens
  total: 68195,                  // Total tokens
  cache_read_input_tokens: 67877, // 90% discount
  cache_write_input_tokens: 220,  // 25% premium (NOT priced by Langfuse — known gap)
}
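
The arithmetic behind these fields can be checked with a short sketch. It assumes the cache discount and premium are applied to the base input rate, and the per-million-token rates passed in are placeholders rather than real prices. UsageDetails, Rates, sumsToTotal, and estimateCostUsd are hypothetical names.

```typescript
// Hypothetical cost sketch; rates are caller-supplied placeholders, not real prices.
interface UsageDetails {
  input: number;
  output: number;
  total: number;
  cache_read_input_tokens: number;
  cache_write_input_tokens: number;
}

interface Rates {
  inputPerMTok: number;  // USD per million non-cached input tokens
  outputPerMTok: number; // USD per million output tokens
}

const CACHE_READ_MULTIPLIER = 0.1;   // 90% discount, assumed relative to the input rate
const CACHE_WRITE_MULTIPLIER = 1.25; // 25% premium, assumed relative to the input rate

/** True when the four token buckets add up to the reported total. */
function sumsToTotal(u: UsageDetails): boolean {
  return u.input + u.output + u.cache_read_input_tokens + u.cache_write_input_tokens === u.total;
}

/** Estimates cost including cache writes, which Langfuse does not price. */
function estimateCostUsd(u: UsageDetails, rates: Rates): number {
  const inputEquivalent =
    u.input +
    u.cache_read_input_tokens * CACHE_READ_MULTIPLIER +
    u.cache_write_input_tokens * CACHE_WRITE_MULTIPLIER;
  return (inputEquivalent * rates.inputPerMTok + u.output * rates.outputPerMTok) / 1_000_000;
}
```

For the sample above, 1 + 97 + 67877 + 220 = 68195, so the buckets are consistent with total. Because this estimate includes cache writes, it can come out higher than the cost Langfuse reports for the same generation.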

Pi session guardrails (alert-only)

Long-running pi sessions can dominate Langfuse spend. The extension tracks per-session totals and warns only the first time each guardrail threshold is breached:

  • JOELCLAW_LANGFUSE_ALERT_MAX_LLM_CALLS (default: 120)
  • JOELCLAW_LANGFUSE_ALERT_MAX_TOTAL_TOKENS (default: 1200000)
  • JOELCLAW_LANGFUSE_ALERT_MAX_COST_USD (default: 20)

Behavior:

  • no automatic model switch
  • no forced compaction
  • no stop/interruption
  • emits console.warn(...) with session ID + current counters
  • records breach flags and first breach turn index in trace metadata (guardrails)
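
An alert-only guardrail of this kind can be sketched as follows. The env variable names and defaults come from the list above; everything else is an assumption, including GuardrailTracker, readLimits, and the choice to treat a breach as strictly exceeding the limit. The env map is passed in so the sketch does not depend on process.env.

```typescript
// Hypothetical sketch of alert-only guardrails; it warns and records, never interrupts.
type GuardrailKind = "llmCalls" | "totalTokens" | "costUsd";
type Totals = Record<GuardrailKind, number>;

const KINDS: GuardrailKind[] = ["llmCalls", "totalTokens", "costUsd"];

function readLimits(env: Record<string, string | undefined>): Totals {
  const read = (key: string, fallback: number): number => {
    const parsed = Number(env[key]);
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
  };
  return {
    llmCalls: read("JOELCLAW_LANGFUSE_ALERT_MAX_LLM_CALLS", 120),
    totalTokens: read("JOELCLAW_LANGFUSE_ALERT_MAX_TOTAL_TOKENS", 1_200_000),
    costUsd: read("JOELCLAW_LANGFUSE_ALERT_MAX_COST_USD", 20),
  };
}

class GuardrailTracker {
  readonly totals: Totals = { llmCalls: 0, totalTokens: 0, costUsd: 0 };
  readonly firstBreachTurn: Partial<Totals> = {}; // turn index of first breach per type
  private readonly sessionId: string;
  private readonly limits: Totals;

  constructor(sessionId: string, limits: Totals) {
    this.sessionId = sessionId;
    this.limits = limits;
  }

  /** Adds one LLM call to the totals and returns guardrails breached for the first time. */
  record(turnIndex: number, tokens: number, costUsd: number): GuardrailKind[] {
    this.totals.llmCalls += 1;
    this.totals.totalTokens += tokens;
    this.totals.costUsd += costUsd;

    const fresh: GuardrailKind[] = [];
    for (const kind of KINDS) {
      const alreadyFlagged = this.firstBreachTurn[kind] !== undefined;
      if (!alreadyFlagged && this.totals[kind] > this.limits[kind]) {
        this.firstBreachTurn[kind] = turnIndex;
        fresh.push(kind);
        console.warn(`guardrail ${kind} breached in session ${this.sessionId}`, { ...this.totals });
      }
    }
    return fresh;
  }
}
```

The firstBreachTurn map corresponds to what would be written into the guardrails trace metadata; the tracker has no way to stop or alter the session.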

Model/provider normalization

Both the pi-session extension and system-bus Langfuse bridge normalize provider/model before writing tags, trace metadata, and generation model fields. This keeps provider:* + model:* tags aligned with metadata after model switches and for provider-prefixed IDs such as:

  • anthropic/claude-opus-4-6
  • openai-codex/gpt-5.4

Normalization is fail-open: tracing continues even if normalization cannot resolve a value.
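
One way to picture fail-open normalization is the sketch below. The rules are assumptions: a provider prefix on the model ID wins over a separately reported provider, bare model IDs get the provider prefixed, and anything unresolvable becomes "unknown" rather than throwing. normalizeModelRef is a hypothetical name, and the actual rules in the extension and bridge may differ.

```typescript
// Hypothetical normalization sketch; the real resolution rules may differ.
interface ModelRef {
  provider: string;
  model: string;
}

function normalizeModelRef(provider: string | undefined, model: string | undefined): ModelRef {
  const rawModel = (model ?? "").trim();
  const rawProvider = (provider ?? "").trim().toLowerCase();

  // Split a provider-prefixed ID such as "anthropic/claude-opus-4-6".
  const slash = rawModel.indexOf("/");
  const prefix = slash > 0 ? rawModel.slice(0, slash).toLowerCase() : "";
  const bare = slash > 0 ? rawModel.slice(slash + 1) : rawModel;

  // Fail-open: fall back to "unknown" instead of throwing.
  const resolvedProvider = prefix || rawProvider || "unknown";
  const resolvedModel =
    resolvedProvider !== "unknown" && bare !== "" ? `${resolvedProvider}/${bare}` : bare || "unknown";

  return { provider: resolvedProvider, model: resolvedModel };
}
```

Running every tag, metadata field, and generation model value through one function like this is what keeps them aligned after a model switch.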

Output-contract + usage-coverage signals (2026-03-02)

System-bus inference now emits explicit coverage/output-contract metadata so low-yield calls are queryable:

  • usageCoverage: "present"|"missing"
  • usageCaptured: boolean
  • jsonRequested, jsonParsed, outputChars
  • warning OTEL event: model_router.usage_missing

For strict machine-readable paths, callers can require output contracts:

  • requireJson: true — parse failure becomes inference failure
  • requireTextOutput: true — empty text becomes inference failure

Recall rewrite traces now include rewriteReason in addition to strategy (disabled|skipped|haiku|openai|fallback) to separate deliberate skips from failure fallbacks.
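
The output-contract flags can be illustrated with a small checker. This is a sketch under assumptions: checkOutputContract and its result shape are hypothetical, and only the field names requireJson, requireTextOutput, jsonRequested, jsonParsed, and outputChars come from the text above.

```typescript
// Hypothetical contract checker; field names follow the metadata listed above.
interface OutputContract {
  jsonRequested?: boolean;     // JSON was asked for, but a parse failure is tolerated
  requireJson?: boolean;       // parse failure becomes an inference failure
  requireTextOutput?: boolean; // empty text becomes an inference failure
}

interface ContractResult {
  ok: boolean;
  jsonRequested: boolean;
  jsonParsed: boolean;
  outputChars: number;
  error: string | null;
  parsed: unknown;
}

function checkOutputContract(text: string, contract: OutputContract): ContractResult {
  const jsonRequested = contract.jsonRequested === true || contract.requireJson === true;

  let parsed: unknown = null;
  let jsonParsed = false;
  if (jsonRequested) {
    try {
      parsed = JSON.parse(text);
      jsonParsed = true;
    } catch {
      jsonParsed = false; // recorded as metadata; fatal only under requireJson
    }
  }

  let error: string | null = null;
  if (contract.requireTextOutput === true && text.trim().length === 0) {
    error = "empty text output";
  } else if (contract.requireJson === true && !jsonParsed) {
    error = "JSON parse failure";
  }

  return { ok: error === null, jsonRequested, jsonParsed, outputChars: text.length, error, parsed };
}
```

Separating jsonRequested from requireJson keeps low-yield calls queryable through jsonParsed without failing callers that can tolerate free text.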

Known Gaps

| Issue | Severity | Notes |
| --- | --- | --- |
| cache_write_input_tokens not priced | Medium | Langfuse platform limitation: no cache write rate in their pricing table |
| No completionStartTime on first turn | Low | lastAssistantStartTime not set before first message_start[assistant] |
| tool_result matching | Low | Relies on toolCallId; if pi changes the field name, spans won't close |

Debugging

Check recent traces

LF_PK=$(secrets lease langfuse_public_key --ttl 5m)
LF_SK=$(secrets lease langfuse_secret_key --ttl 5m)
curl -s -u "$LF_PK:$LF_SK" "https://us.cloud.langfuse.com/api/public/traces?limit=5" \
  | jq '[.data[] | {name, ts: .timestamp[:19], obs: (.observations | length), output: (.output // "" | tostring | .[0:60])}]'

Check nested observations on a trace

TRACE_ID="<id>"
curl -s -u "$LF_PK:$LF_SK" "https://us.cloud.langfuse.com/api/public/observations?traceId=$TRACE_ID" \
  | jq '[.data[] | {name, type, model, startTime: .startTime[:19], endTime: .endTime[:19]}]'

Common Issues

| Symptom | Cause | Fix |
| --- | --- | --- |
| Double traces | Extension loaded twice via symlink/realpath split | globalThis dedup guard (already fixed) |
| [toolUse] output instead of tool names | tool_call events not firing | Check pi version, verify toolName field on event |
| No traces at all | Langfuse creds missing | Check LANGFUSE_PUBLIC_KEY/LANGFUSE_SECRET_KEY env |
| channel:interactive on gateway | GATEWAY_ROLE not set | Must be in gateway-start.sh |
| Stale extension code | Gateway/interactive session not reloaded after change | Restart gateway and start a fresh interactive session |
| OTEL emit errors in gateway | system-bus-worker port-forward down | kubectl port-forward -n joelclaw svc/system-bus-worker 3111:3111 |

Key Files

  • Pi extension: pi/extensions/langfuse-cost/index.ts
  • Pi extension tests: pi/extensions/langfuse-cost/index.test.ts
  • System-bus bridge: packages/system-bus/src/lib/langfuse.ts
  • Gateway ops notes: docs/gateway.md

Deployment Workflow

After editing the pi extension:

  1. Commit changes in this repo (source of truth).
  2. Restart gateway so the updated extension is loaded.
  3. Start a new interactive pi session (or reload) so per-session tracing uses the new code.

ADRs

  • ADR-0146: Inference Cost Monitoring and Control — shipped
  • ADR-0147: Named Agent Profiles (trace attribution by role)
