Observability Patterns Skill

This skill provides templates and configurations for implementing observability in Google ADK agents. It covers logging, tracing, BigQuery analytics, Cloud Trace integration, and third-party observability platforms.

Overview

Google ADK supports multiple observability approaches for monitoring, debugging, and analyzing agent behavior:

Cloud Trace - Google Cloud native tracing with OpenTelemetry
BigQuery Agent Analytics - Comprehensive event logging and analysis
AgentOps - Session replays and unified tracing analytics
Phoenix (Arize) - Open-source observability with self-hosted control
Weave (W&B) - Weights & Biases platform for tracking and visualization

This skill covers production-ready observability implementations, with attention to security and scalability.

Available Scripts

Setup Cloud Trace

Script: scripts/setup-cloud-trace.sh <project-id>

Purpose: Configures Cloud Trace integration for ADK agents

Parameters:

project-id: Google Cloud project ID (required)

Usage:

Setup Cloud Trace for local development

./scripts/setup-cloud-trace.sh my-project-id

Setup with ADK CLI deployment

adk deploy agent_engine --project=my-project-id --trace_to_cloud ./agent

Environment Variables:

GOOGLE_CLOUD_PROJECT: Project ID for Cloud Trace
GOOGLE_APPLICATION_CREDENTIALS: Path to service account key

Output: Cloud Trace enabled, traces visible in console.cloud.google.com

Setup BigQuery Agent Analytics

Script: scripts/setup-bigquery-analytics.sh <project-id> <dataset-id> [bucket-name]

Purpose: Configures BigQuery Agent Analytics plugin for comprehensive event logging

Parameters:

project-id: Google Cloud project ID (required)
dataset-id: BigQuery dataset name (required)
bucket-name: GCS bucket for multimodal content (optional)

Usage:

Setup basic BigQuery analytics

./scripts/setup-bigquery-analytics.sh my-project agent_analytics

Setup with GCS for multimodal content

./scripts/setup-bigquery-analytics.sh my-project agent_analytics my-content-bucket

Create dataset and table

bq mk --dataset my-project:agent_analytics
bq mk --table agent_analytics.agent_events_v2 templates/bigquery-schema.json

IAM Requirements:

roles/bigquery.jobUser: Required for BigQuery operations
roles/bigquery.dataEditor: Required for writing data
roles/storage.objectCreator: Required if using GCS offloading

Output: BigQuery table created, events streaming to dataset

Setup AgentOps

Script: scripts/setup-agentops.sh

Purpose: Configures AgentOps integration for session replays and metrics

Usage:

Install AgentOps

pip install -U agentops

Setup with API key

AGENTOPS_API_KEY=your_api_key_here ./scripts/setup-agentops.sh

Verify setup

python -c "import agentops; agentops.init(); print('AgentOps ready')"

Environment Variables:

AGENTOPS_API_KEY: AgentOps API key from app.agentops.ai/settings/projects

Output: AgentOps initialized, sessions visible in dashboard

Setup Phoenix

Script: scripts/setup-phoenix.sh

Purpose: Configures Phoenix (Arize) integration for open-source observability

Usage:

Install Phoenix packages

pip install openinference-instrumentation-google-adk arize-phoenix-otel

Setup Phoenix with API key

export PHOENIX_API_KEY=your_key_here
export PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/s/your-space
./scripts/setup-phoenix.sh

Verify Phoenix connection

python scripts/verify-phoenix.py

Environment Variables:

PHOENIX_API_KEY: Phoenix API key from phoenix.arize.com
PHOENIX_COLLECTOR_ENDPOINT: Phoenix collector endpoint URL

Output: Phoenix tracer initialized, traces visible in Phoenix dashboard

Setup Weave

Script: scripts/setup-weave.sh <entity> <project>

Purpose: Configures Weave (W&B) integration for observability

Parameters:

entity: W&B entity name (visible in Teams sidebar)
project: W&B project name

Usage:

Install Weave dependencies

pip install opentelemetry-sdk opentelemetry-exporter-otlp-proto-http

Setup Weave with API key

WANDB_API_KEY=your_wandb_key_here ./scripts/setup-weave.sh my-team my-project

Verify Weave connection

python scripts/verify-weave.py

Environment Variables:

WANDB_API_KEY: W&B API key from wandb.ai/authorize

Output: Weave tracer initialized, traces visible in Weave dashboard

Validate Observability Setup

Script: scripts/validate-observability.sh

Purpose: Validates observability configuration and connectivity

Checks:

Cloud Trace connectivity
BigQuery dataset and table existence
AgentOps initialization
Phoenix endpoint reachability
Weave endpoint reachability
IAM permissions
Environment variables set

Usage:

Validate all observability configurations

./scripts/validate-observability.sh

Validate specific tool

./scripts/validate-observability.sh --tool=bigquery
./scripts/validate-observability.sh --tool=cloud-trace
./scripts/validate-observability.sh --tool=agentops

Exit Codes:

0: All checks passed
1: Configuration missing
2: Connectivity failed
3: Permission issues
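
The exit codes above make the validator easy to wrap in automation such as a CI step. The following is a minimal sketch, assuming the script lives at ./scripts/validate-observability.sh and returns the codes exactly as listed; the helper names (describe_exit_code, build_command, validate) are hypothetical and not part of this skill.

```python
import subprocess
from typing import List, Optional

# Exit codes as documented for scripts/validate-observability.sh.
EXIT_CODE_MEANINGS = {
    0: "All checks passed",
    1: "Configuration missing",
    2: "Connectivity failed",
    3: "Permission issues",
}


def describe_exit_code(code: int) -> str:
    """Translate a validator exit code into its documented meaning."""
    return EXIT_CODE_MEANINGS.get(code, f"Unknown exit code {code}")


def build_command(tool: Optional[str] = None) -> List[str]:
    """Build the validator command line, optionally scoped to one tool."""
    command = ["./scripts/validate-observability.sh"]
    if tool is not None:
        command.append(f"--tool={tool}")
    return command


def validate(tool: Optional[str] = None) -> str:
    """Run the validator (hypothetical wrapper) and describe the outcome."""
    result = subprocess.run(build_command(tool), check=False)
    return describe_exit_code(result.returncode)
```

Calling validate("bigquery") from the repository root would run the BigQuery checks only and return the matching description.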

Available Templates

Cloud Trace Configuration

Template: templates/cloud-trace-config.py

Purpose: Cloud Trace integration for ADK agents

Features:

OpenTelemetry configuration
Automatic span creation for agent runs
LLM and tool call tracing
Error and latency tracking

Usage:

Enable Cloud Trace via ADK CLI

adk deploy agent_engine --project=$GOOGLE_CLOUD_PROJECT --trace_to_cloud ./agent

Or via Python SDK

from vertexai.preview.reasoning_engines import AdkApp

app = AdkApp(agent=my_agent, enable_tracing=True)

Span Labels:

invocation: Top-level agent invocation
agent_run: Individual agent execution
call_llm: LLM API calls
execute_tool: Tool executions
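
To make the span labels concrete, here is a small standard-library sketch that renders a flattened list of spans as an indented tree. The sample data, the durations, and the assumption that call_llm and execute_tool nest under agent_run are illustrative only; real traces may be shaped differently.

```python
from collections import defaultdict

# Hypothetical flattened spans: (span_id, parent_id, label, duration_ms).
SAMPLE_SPANS = [
    ("a", None, "invocation", 1200),
    ("b", "a", "agent_run", 1150),
    ("c", "b", "call_llm", 700),
    ("d", "b", "execute_tool", 300),
]


def render_span_tree(spans):
    """Return indented lines showing how the span labels nest."""
    children = defaultdict(list)
    for span_id, parent_id, label, duration_ms in spans:
        children[parent_id].append((span_id, label, duration_ms))

    lines = []

    def walk(parent_id, depth):
        # Visit each child of parent_id, then recurse into its own children.
        for span_id, label, duration_ms in children[parent_id]:
            lines.append(f"{'  ' * depth}{label} ({duration_ms} ms)")
            walk(span_id, depth + 1)

    walk(None, 0)
    return lines


if __name__ == "__main__":
    print("\n".join(render_span_tree(SAMPLE_SPANS)))
```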

BigQuery Analytics Configuration

Template: templates/bigquery-analytics-config.py

Purpose: Complete BigQuery Agent Analytics plugin configuration

Features:

Asynchronous event logging
Multimodal content with GCS offloading
OpenTelemetry-style tracing (trace_id, span_id)
Event filtering and batching
Custom content formatting

Usage:

from google.adk.apps import App
from google.adk.plugins.bigquery_agent_analytics_plugin import (
    BigQueryAgentAnalyticsPlugin,
    BigQueryLoggerConfig,
)

bq_config = BigQueryLoggerConfig(
    enabled=True,
    gcs_bucket_name="your-bucket-name",
    max_content_length=500 * 1024,  # 500KB inline limit
    batch_size=1,  # Low latency
    event_allowlist=["LLM_RESPONSE", "TOOL_COMPLETED"],
)

plugin = BigQueryAgentAnalyticsPlugin(
    project_id="your-project-id",
    dataset_id="your-dataset-id",
    config=bq_config,
)

app = App(name="my_agent_app", root_agent=agent, plugins=[plugin])

Configuration Options:

enabled: Toggle logging on/off
gcs_bucket_name: GCS bucket for large content
max_content_length: Inline text limit (default 500KB)
batch_size: Events per write (default 1)
event_allowlist: Log only the listed event types
event_denylist: Skip the listed event types
content_formatter: Custom formatting function
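
The interaction between event_allowlist and event_denylist can be pictured with a small filter function. This is a sketch of one plausible rule, not the plugin's documented behavior: it assumes a denylist entry always wins and that an allowlist, when present, must contain the event type. The function name should_log is hypothetical.

```python
from typing import Iterable, Optional


def should_log(
    event_type: str,
    allowlist: Optional[Iterable[str]] = None,
    denylist: Optional[Iterable[str]] = None,
) -> bool:
    """Decide whether an event would be written under the assumed rules.

    Assumption: denylist takes precedence; with no lists, everything is logged.
    """
    if denylist is not None and event_type in set(denylist):
        return False
    if allowlist is not None:
        return event_type in set(allowlist)
    return True
```

Under these assumed rules, the configuration shown earlier (an allowlist of LLM_RESPONSE and TOOL_COMPLETED) would drop LLM_REQUEST and TOOL_STARTING events, which is the main lever for reducing BigQuery writes.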

BigQuery Schema

Template: templates/bigquery-schema.json

Purpose: BigQuery table schema for agent_events_v2

Schema Fields:

timestamp: Event recording time
event_type: Event category (LLM_REQUEST, TOOL_STARTING, etc.)
content: Event-specific JSON payload
content_parts: Structured multimodal data
trace_id: OpenTelemetry trace ID
span_id: OpenTelemetry span ID
agent: Agent name
user_id: User identifier

Partitioning: By DATE(timestamp) for cost optimization

Clustering: By event_type, agent, user_id for query performance
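
The partitioning and clustering settings above correspond to two clauses in BigQuery DDL. The sketch below builds such a statement as a string. The column types are assumptions made for illustration (templates/bigquery-schema.json remains the source of truth), and the nested content_parts field is left out because its structure is not described here.

```python
# Assumed column types; check templates/bigquery-schema.json for the real ones.
ASSUMED_COLUMNS = [
    ("timestamp", "TIMESTAMP"),
    ("event_type", "STRING"),
    ("content", "JSON"),
    ("trace_id", "STRING"),
    ("span_id", "STRING"),
    ("agent", "STRING"),
    ("user_id", "STRING"),
]


def build_create_table_ddl(table: str) -> str:
    """Build a CREATE TABLE statement with daily partitions and clustering."""
    columns = ",\n".join(f"  {name} {col_type}" for name, col_type in ASSUMED_COLUMNS)
    return (
        f"CREATE TABLE IF NOT EXISTS `{table}` (\n{columns}\n)\n"
        "PARTITION BY DATE(timestamp)\n"
        "CLUSTER BY event_type, agent, user_id;"
    )


if __name__ == "__main__":
    print(build_create_table_ddl("my-project.agent_analytics.agent_events_v2"))
```

Queries that filter on DATE(timestamp) scan only the matching partitions, and filters on event_type, agent, or user_id benefit from the clustering order.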

AgentOps Configuration

Template: templates/agentops-config.py

Purpose: AgentOps integration for session replays

Features:

Minimal two-line integration
Hierarchical span visualization
LLM call tracking with prompts and completions
Token count and latency metrics
Cost tracking

Usage:

import agentops

Initialize AgentOps (before ADK imports)

agentops.init()

Your ADK agent code

from google.adk.apps import App

app = App(name="my_agent_app", root_agent=my_agent)

Span Hierarchy:

Agent spans: Named adk.agent.{AgentName}
LLM spans: Capture prompts, completions, tokens
Tool spans: Record parameters and results

Phoenix Configuration

Template: templates/phoenix-config.py

Purpose: Phoenix (Arize) integration for open-source observability

Features:

Self-hosted data control
OpenInference instrumentation
Trace evaluation
Performance debugging
Custom evaluators

Usage:

import os

from phoenix.otel import register

Set Phoenix credentials

os.environ["PHOENIX_API_KEY"] = "your_api_key_here"
os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "https://app.phoenix.arize.com/s/your-space"

Register Phoenix tracer

tracer_provider = register(
    project_name="my-adk-agent",
    auto_instrument=True,
)

Your ADK agent code (Phoenix auto-captures traces)

from google.adk.apps import App

app = App(name="my_agent_app", root_agent=my_agent)

Auto-Instrumentation: Phoenix automatically traces all ADK operations

Weave Configuration

Template: templates/weave-config.py

Purpose: Weave (W&B) integration for observability

Features:

Timeline of agent calls
Tool invocation tracking
Reasoning process analysis
Span hierarchy visualization
Dashboard integration

Usage:

import base64
import os

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

Setup Weave exporter

wandb_api_key = os.environ["WANDB_API_KEY"]
entity = "your-entity"
project = "your-project"

auth_string = f"api:{wandb_api_key}"
encoded_auth = base64.b64encode(auth_string.encode()).decode()

exporter = OTLPSpanExporter(
    endpoint="https://trace.wandb.ai/otel/v1/traces",
    headers={
        "Authorization": f"Basic {encoded_auth}",
        "project_id": f"{entity}/{project}",
    },
)

Configure tracer provider (BEFORE ADK imports)

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)

Your ADK agent code

from google.adk.apps import App

app = App(name="my_agent_app", root_agent=my_agent)

Critical: Set tracer provider before importing ADK components
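
The header construction in the Weave setup can be isolated into a small helper, which makes it easy to unit test without sending any spans. This is a minimal sketch using only the standard library; the function name weave_otlp_headers is hypothetical, and the header layout simply mirrors the configuration shown above.

```python
import base64
from typing import Dict


def weave_otlp_headers(api_key: str, entity: str, project: str) -> Dict[str, str]:
    """Build the OTLP headers used above: Basic auth plus entity/project."""
    if not api_key:
        raise ValueError("WANDB_API_KEY is empty")
    # Basic auth with the literal user name "api" and the key as password.
    token = base64.b64encode(f"api:{api_key}".encode()).decode()
    return {
        "Authorization": f"Basic {token}",
        "project_id": f"{entity}/{project}",
    }
```

The returned dictionary could be passed as the headers argument of the exporter, with the key read from the WANDB_API_KEY environment variable rather than written into code.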

Available Examples

Complete Observability Setup

Example: examples/complete-observability.md

Covers:

Multi-tool observability setup
Cloud Trace + BigQuery combination
Third-party tool integration
Production deployment patterns
Cost optimization strategies

Step-by-Step Guide:

Enable Cloud Trace for distributed tracing
Configure BigQuery for event logging
Add AgentOps for session replays
Optional: Phoenix or Weave for additional insights
Validate all configurations
Deploy to production

Production Checklist:

Cloud Trace enabled in production
BigQuery dataset created with proper IAM
GCS bucket configured for multimodal content
Event filtering configured to control costs
Alert rules defined for error rates
Dashboard created for key metrics
Retention policies set for cost control

BigQuery Analytics Queries

Example: examples/bigquery-queries.md

Covers:

Conversation trace retrieval
Token usage analysis
Error rate tracking
Tool usage statistics
Performance metrics
Cost analysis

Query Examples:

-- Retrieve conversation traces
SELECT timestamp, event_type, JSON_VALUE(content, '$.response')
FROM agent_events_v2
WHERE trace_id = 'your-trace-id'
ORDER BY timestamp ASC;

-- Token usage by agent
SELECT
  agent,
  AVG(CAST(JSON_VALUE(content, '$.usage.total') AS INT64)) AS avg_tokens,
  SUM(CAST(JSON_VALUE(content, '$.usage.total') AS INT64)) AS total_tokens
FROM agent_events_v2
WHERE event_type = 'LLM_RESPONSE'
GROUP BY agent;

-- Error rate by event type
SELECT event_type, COUNT(*) AS error_count, DATE(timestamp) AS day
FROM agent_events_v2
WHERE event_type LIKE '%ERROR%'
GROUP BY event_type, day
ORDER BY day DESC, error_count DESC;

-- Tool usage frequency
SELECT JSON_VALUE(content, '$.tool_name') AS tool, COUNT(*) AS usage_count
FROM agent_events_v2
WHERE event_type = 'TOOL_COMPLETED'
GROUP BY tool
ORDER BY usage_count DESC;

-- Access multimodal content from GCS
SELECT part.mime_type, part.object_ref.uri AS gcs_uri
FROM agent_events_v2, UNNEST(content_parts) AS part
WHERE part.storage_mode = 'GCS_REFERENCE';
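
When the trace lookup is run from application code, the trace ID is better bound as a query parameter than pasted into the SQL string. The sketch below assumes the google-cloud-bigquery client is installed and that Application Default Credentials are available; the helper names and the table ID in the usage note are hypothetical. The client import is deferred so the query builder can be used on its own.

```python
import re

# Table names cannot be bound as query parameters, so the sketch validates
# a fully qualified project.dataset.table ID before interpolating it.
_TABLE_ID = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+$")


def build_trace_query(table_id: str) -> str:
    """Return the conversation-trace query with a @trace_id placeholder."""
    if not _TABLE_ID.match(table_id):
        raise ValueError(f"Unexpected table id: {table_id!r}")
    return (
        "SELECT timestamp, event_type, JSON_VALUE(content, '$.response') AS response\n"
        f"FROM `{table_id}`\n"
        "WHERE trace_id = @trace_id\n"
        "ORDER BY timestamp ASC"
    )


def fetch_trace(table_id: str, trace_id: str):
    """Run the query and return the rows for one trace."""
    # Imported lazily so build_trace_query works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("trace_id", "STRING", trace_id)
        ]
    )
    job = client.query(build_trace_query(table_id), job_config=job_config)
    return list(job.result())
```

A call such as fetch_trace("my-project.agent_analytics.agent_events_v2", "your-trace-id") would return the events for that trace in time order.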

Multi-Tool Integration

Example: examples/multi-tool-integration.md

Covers:

Using multiple observability tools together
Cloud Trace + BigQuery + AgentOps
Data correlation across platforms
Tool selection criteria
Cost vs. insight tradeoffs

Integration Patterns:

Pattern 1: Google Cloud Native

Cloud Trace for distributed tracing
BigQuery for detailed event analysis
Best for: GCP-centric deployments

Pattern 2: Comprehensive Monitoring

Cloud Trace for infrastructure tracing
AgentOps for session replays
BigQuery for analytics
Best for: Production monitoring with detailed debugging

Pattern 3: Open Source

Phoenix for self-hosted observability
BigQuery for long-term storage
Best for: Data sovereignty requirements

Pattern 4: ML-Focused

Weave for experiment tracking
BigQuery for analytics
Best for: Research and experimentation

Production Deployment

Example: examples/production-deployment.md

Covers:

Production-ready observability configuration
IAM role setup
Cost optimization
Alert configuration
Dashboard creation
Incident response

Production Setup:

IAM Configuration:

Service account with minimal permissions
Separate dev/staging/prod credentials
Workload Identity for GKE deployments

Cost Controls:

Event filtering to reduce BigQuery writes
GCS lifecycle policies for multimodal content
Table partitioning and clustering
Retention policies (30-90 days)

Monitoring:

Cloud Monitoring alerts for error rates
BigQuery query dashboard in Looker Studio
AgentOps session replay for debugging
Trace analysis for performance issues

Security:

No credentials in code (environment variables only)
VPC Service Controls for data protection
Customer-managed encryption keys (CMEK)
Audit logging for compliance

Security Compliance

CRITICAL: This skill follows strict security rules:

❌ NEVER hardcode:

API keys (AgentOps, Phoenix, Weave, W&B)
Google Cloud credentials
Service account keys
OAuth tokens
BigQuery connection strings

✅ ALWAYS:

Use environment variables for secrets
Generate .env.example with placeholders
Add .env* to .gitignore
Use Google Application Default Credentials
Document credential acquisition process
Use IAM roles instead of service account keys when possible

Placeholder format:

.env.example

GOOGLE_CLOUD_PROJECT=your-project-id
AGENTOPS_API_KEY=your_agentops_key_here
PHOENIX_API_KEY=your_phoenix_key_here
PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/s/your-space
WANDB_API_KEY=your_wandb_key_here
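
On the application side, the same variables can be read once at startup so that a missing value fails fast instead of surfacing later as a silent tracing gap. This is a minimal sketch; treating only GOOGLE_CLOUD_PROJECT as required is an assumption, and load_settings is a hypothetical helper name.

```python
import os
from typing import Dict, Mapping, Optional

# Assumption: only the project ID is mandatory; tool keys depend on what you use.
REQUIRED_VARS = ("GOOGLE_CLOUD_PROJECT",)
OPTIONAL_VARS = (
    "AGENTOPS_API_KEY",
    "PHOENIX_API_KEY",
    "PHOENIX_COLLECTOR_ENDPOINT",
    "WANDB_API_KEY",
)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect observability settings from the environment, failing on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {name: env[name] for name in REQUIRED_VARS}
    # Optional keys are included only when they are set and non-empty.
    settings.update({name: env[name] for name in OPTIONAL_VARS if env.get(name)})
    return settings
```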

Progressive Disclosure

This skill provides immediate setup guidance with references to detailed documentation:

Quick Start: Use setup scripts for immediate configuration
Production: Reference production-deployment.md for complete guide
Analytics: Use bigquery-queries.md for query templates
Integration: Reference multi-tool-integration.md for advanced patterns

Load additional files only when specific customization is needed.

Common Workflows

Local Development Setup

Enable Cloud Trace for local debugging

export GOOGLE_CLOUD_PROJECT=your-project-id
./scripts/setup-cloud-trace.sh your-project-id

Start agent with tracing

python my_agent.py

View traces at console.cloud.google.com/traces

Production Deployment with BigQuery

1. Create BigQuery dataset

bq mk --dataset my-project:agent_analytics

2. Create events table

bq mk --table agent_analytics.agent_events_v2 templates/bigquery-schema.json

3. Create GCS bucket for multimodal content

gsutil mb gs://my-agent-content/

4. Setup BigQuery analytics

./scripts/setup-bigquery-analytics.sh my-project agent_analytics my-agent-content

5. Deploy agent

adk deploy agent_engine --project=my-project ./agent

6. Validate setup

./scripts/validate-observability.sh --tool=bigquery

Multi-Tool Integration

1. Setup Cloud Trace

export GOOGLE_CLOUD_PROJECT=your-project-id
./scripts/setup-cloud-trace.sh your-project-id

2. Setup BigQuery Analytics

./scripts/setup-bigquery-analytics.sh your-project agent_analytics my-bucket

3. Setup AgentOps

export AGENTOPS_API_KEY=your_key_here
./scripts/setup-agentops.sh

4. Validate all configurations

./scripts/validate-observability.sh

Troubleshooting

Cloud Trace Not Showing Traces

Check:

GOOGLE_CLOUD_PROJECT environment variable is set
Cloud Trace API is enabled
Service account has roles/cloudtrace.agent
Tracer initialized before ADK imports

Debug:

Check Cloud Trace API status

gcloud services list --enabled | grep cloudtrace

Enable Cloud Trace API

gcloud services enable cloudtrace.googleapis.com

Test trace export

python scripts/test-cloud-trace.py

BigQuery Events Not Appearing

Check:

Dataset and table exist
Service account has correct IAM roles
BigQuery API is enabled
Plugin configuration is correct
No event filtering blocking events

Debug:

Check dataset exists

bq ls my-project:

Check table schema

bq show --schema agent_analytics.agent_events_v2

Check IAM permissions

gcloud projects get-iam-policy my-project \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:YOUR_SA_EMAIL"

Test plugin manually

python scripts/test-bigquery-plugin.py
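
The IAM check can also be scripted. The sketch below assumes the policy was exported as JSON with gcloud projects get-iam-policy my-project --format=json (without the --flatten flag), which yields a bindings list of role and members entries. The helper name missing_roles is hypothetical, and roles granted at the dataset level would not appear in a project-level policy.

```python
from typing import Iterable, Set

# Roles listed under IAM Requirements; storage.objectCreator applies only
# when GCS offloading is used, so it is not included by default.
REQUIRED_ROLES = frozenset({"roles/bigquery.jobUser", "roles/bigquery.dataEditor"})


def missing_roles(
    policy: dict, member: str, required: Iterable[str] = REQUIRED_ROLES
) -> Set[str]:
    """Return the required roles that the member does not hold in the policy."""
    granted = {
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    }
    return set(required) - granted
```

An empty result means the service account holds every required role at the project level.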

AgentOps Not Capturing Traces

Check:

AgentOps initialized before ADK imports
API key is valid
Network connectivity to app.agentops.ai
AgentOps package is up to date

Fix:

Update AgentOps

pip install -U agentops

Test initialization

python -c "import agentops; agentops.init(); print('Success')"

Check for conflicts with other tracers

Ensure AgentOps is initialized first

Phoenix Connection Failed

Check:

Phoenix API key is valid
Collector endpoint URL is correct
Network access to Phoenix endpoint
Required packages installed

Debug:

Test Phoenix endpoint

curl -H "Authorization: Bearer YOUR_KEY" \
  https://app.phoenix.arize.com/s/YOUR_SPACE

Verify package versions

pip list | grep -E "(openinference|phoenix)"

Run verification script

python scripts/verify-phoenix.py
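
The curl check can be reproduced in Python when a shell is not available, for example inside a container health check. This is a standard-library sketch that mirrors the Bearer header used in the curl command above; the function names are hypothetical, and probe performs a real network request when called.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_phoenix_probe(endpoint: str, api_key: str) -> urllib.request.Request:
    """Build a GET request carrying the same Bearer header as the curl check."""
    scheme = urllib.parse.urlsplit(endpoint).scheme
    if scheme not in ("http", "https"):
        raise ValueError(f"Unsupported endpoint scheme: {scheme!r}")
    return urllib.request.Request(
        endpoint,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def probe(endpoint: str, api_key: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    request = build_phoenix_probe(endpoint, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses still prove the endpoint is reachable.
        return error.code
```

A 401 or 403 status points at the API key, while a connection error (raised as urllib.error.URLError) points at the endpoint URL or network access.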

Weave Traces Not Appearing

Check:

Tracer provider set BEFORE ADK imports
W&B API key is valid
Entity and project names are correct
OTEL exporter configured properly

Fix:

Verify initialization order

1. Import OTEL packages

2. Configure and set tracer provider

3. THEN import ADK

Correct order:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())  # FIRST

from google.adk.apps import App  # THEN
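
An ordering mistake is easy to introduce when imports are reorganized, so a small guard can help catch it early. The sketch below inspects sys.modules to see whether any google.adk module has already been loaded; the function names are hypothetical, and the check is meant to be called immediately before the tracer provider is set.

```python
import sys
from typing import Iterable, Optional


def adk_already_imported(modules: Optional[Iterable[str]] = None) -> bool:
    """Report whether google.adk (or a submodule) has been imported."""
    # Copy the names so concurrent imports cannot change the dict mid-iteration.
    names = list(sys.modules if modules is None else modules)
    return any(
        name == "google.adk" or name.startswith("google.adk.") for name in names
    )


def assert_tracer_first(modules: Optional[Iterable[str]] = None) -> None:
    """Raise if ADK was imported before the tracer provider was configured."""
    if adk_already_imported(modules):
        raise RuntimeError(
            "google.adk was imported before the tracer provider was configured"
        )
```

Calling assert_tracer_first() just before trace.set_tracer_provider(...) turns a silent loss of traces into an immediate, explicit error.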

Dependencies

Required:

google-adk>=1.21.0: ADK framework (version 1.21.0+ for full BigQuery features)

Optional (Google Cloud):

google-cloud-trace>=1.13.0: Cloud Trace client
google-cloud-bigquery>=3.0.0: BigQuery client

Optional (Third-party tools):

agentops>=0.3.0: AgentOps integration
openinference-instrumentation-google-adk>=0.1.0: Phoenix instrumentation
arize-phoenix-otel>=0.1.0: Phoenix OTEL exporter
opentelemetry-sdk>=1.20.0: OpenTelemetry SDK for Weave
opentelemetry-exporter-otlp-proto-http>=1.20.0: OTLP exporter for Weave

Installation:

Core ADK with Cloud Trace

pip install google-adk google-cloud-trace

With BigQuery Analytics

pip install google-adk google-cloud-bigquery

With AgentOps

pip install google-adk agentops

With Phoenix

pip install google-adk openinference-instrumentation-google-adk arize-phoenix-otel

With Weave

pip install google-adk opentelemetry-sdk opentelemetry-exporter-otlp-proto-http

All observability tools

pip install google-adk google-cloud-trace google-cloud-bigquery agentops \
  openinference-instrumentation-google-adk arize-phoenix-otel \
  opentelemetry-sdk opentelemetry-exporter-otlp-proto-http

Best Practices

Multi-Layer Observability: Use Cloud Trace for infrastructure, BigQuery for analytics, and AgentOps for debugging
Cost Control: Implement event filtering and retention policies to manage BigQuery costs
Security: Never hardcode credentials; use environment variables and IAM roles
Progressive Rollout: Start with Cloud Trace, then add BigQuery when analytics are needed
Tool Selection: Choose tools based on requirements (open-source vs. managed, cost vs. features)
Data Correlation: Use trace_id across all tools for unified debugging
Alert Configuration: Set up alerts for error rates, latency spikes, and cost anomalies
Dashboard Creation: Build custom dashboards in Looker Studio, Grafana, or tool-native UIs
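
Correlating by trace_id only works if every tool sees the ID in the same textual form. OpenTelemetry defines a trace ID as 16 bytes and a span ID as 8 bytes, conventionally written as lowercase hex. The sketch below converts integer IDs to that form; whether a given backend stores exactly this representation is an assumption to verify against your own data.

```python
def format_trace_id(trace_id: int) -> str:
    """Render a 128-bit trace ID as 32 lowercase hex characters."""
    if not 0 < trace_id < 2**128:
        raise ValueError("trace_id must be a non-zero 128-bit integer")
    return format(trace_id, "032x")


def format_span_id(span_id: int) -> str:
    """Render a 64-bit span ID as 16 lowercase hex characters."""
    if not 0 < span_id < 2**64:
        raise ValueError("span_id must be a non-zero 64-bit integer")
    return format(span_id, "016x")
```

The resulting string can be used as the trace_id filter in the BigQuery queries and as the search key in a tracing UI.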

Additional Resources

Cloud Trace: https://cloud.google.com/trace/docs
BigQuery Agent Analytics: https://google.github.io/adk-docs/observability/bigquery-agent-analytics/
AgentOps: https://app.agentops.ai/
Phoenix (Arize): https://arize.com/docs/phoenix/
Weave (W&B): https://docs.wandb.ai/weave/
ADK Observability Guide: https://google.github.io/adk-docs/observability/
OpenTelemetry: https://opentelemetry.io/docs/

Tool Comparison

Feature           | Cloud Trace            | BigQuery          | AgentOps         | Phoenix              | Weave
Hosting           | Google Cloud           | Google Cloud      | SaaS             | SaaS/Self-hosted     | SaaS
Cost              | Free tier + usage      | Storage + queries | Free tier + paid | Free tier + paid     | Free tier + paid
Setup Complexity  | Low                    | Medium            | Very Low         | Low                  | Medium
Data Control      | Google Cloud           | Google Cloud      | Third-party      | Self-host option     | Third-party
Query Flexibility | Low                    | Very High         | Medium           | High                 | Medium
Real-time         | Yes                    | Near real-time    | Yes              | Yes                  | Yes
Custom Dashboards | Limited                | Full (Looker)     | Built-in         | Built-in             | Built-in
Best For          | Infrastructure tracing | Deep analytics    | Quick debugging  | Open-source, control | ML experiments
