

Install skill "observability-lgtm" with this command: npx skills add nissan/observability-lgtm

observability-lgtm

Set up a full local observability stack (Loki + Grafana + Tempo + Prometheus + Alloy) for FastAPI apps on macOS (Apple Silicon) or Linux. One command to start, one import to instrument any app. Logs → Loki, metrics → Prometheus, traces → Tempo, all unified in Grafana.

When to use

  • User is building a FastAPI web app and wants logs, metrics, and traces
  • User wants a local Grafana dashboard without setting up ELK (too heavy)
  • User wants to correlate logs ↔ traces ↔ metrics in one UI
  • User has multiple local apps and wants universal observability

When NOT to use

  • Production cloud deployments (use managed Grafana Cloud or Datadog instead)
  • Non-FastAPI apps (the bundled Python lib instruments FastAPI only; the stack itself is language-agnostic and will accept OTLP from anything)
  • When Docker is not available

Prerequisites

  • Docker + Docker Compose v2 installed
  • Python 3.10+ (for the instrumentation lib)
  • FastAPI app to instrument

What gets installed

| Service     | Port  | Purpose                                          |
|-------------|-------|--------------------------------------------------|
| Grafana     | 3000  | Dashboards (no login in dev mode)                |
| Prometheus  | 9091  | Metrics scraping (avoids 9090 if MinIO running)  |
| Loki        | 3300  | Log storage (avoids 3100 if Langfuse running)    |
| Tempo gRPC  | 4317  | OTLP trace receiver                              |
| Tempo HTTP  | 4318  | OTLP HTTP alternative                            |
| Alloy UI    | 12345 | Agent status                                     |

Steps

Step 1 — Check for port conflicts

lsof -iTCP -sTCP:LISTEN -n -P 2>/dev/null | grep -E ":(3000|3300|9091|4317|4318|12345)" | awk '{print $9, $1}'

If any of the ports above are in use, update the relevant port in docker-compose.yml and the matching url: in config/grafana/provisioning/datasources/datasources.yml. Common conflicts: Langfuse on 3100, MinIO on 9090.
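If lsof is unavailable (or you want a scriptable check), the same test can be done from the stdlib. The port list below mirrors the table above; adjust it if you have already remapped ports:

```python
import socket

# Ports used by the stack (from the table above).
STACK_PORTS = [3000, 3300, 9091, 4317, 4318, 12345]

def ports_in_use(ports, host="127.0.0.1"):
    """Return the subset of `ports` that already has a listener on `host`."""
    busy = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(0.5)
            # connect_ex returns 0 when something accepts the connection
            if s.connect_ex((host, port)) == 0:
                busy.append(port)
    return busy

print("conflicting ports:", ports_in_use(STACK_PORTS))
```

Any port printed here needs a new value in docker-compose.yml and datasources.yml before Step 3.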

Step 2 — Copy the stack

Copy these files from the skill directory into a projects/observability/ folder in the workspace:

  • assets/docker-compose.yml
  • assets/config/ (entire directory tree)
  • assets/lib/observability.py
  • assets/scripts/register_app.sh
mkdir -p projects/observability
cp -r SKILL_DIR/assets/* projects/observability/
mkdir -p projects/observability/logs
touch projects/observability/logs/.gitkeep
chmod +x projects/observability/scripts/register_app.sh

Step 3 — Start the stack

cd projects/observability
docker compose up -d

Wait ~15 seconds for all services to start, then verify:

curl -s -o /dev/null -w "Grafana: %{http_code}\n"    http://localhost:3000/api/health
curl -s -o /dev/null -w "Prometheus: %{http_code}\n" http://localhost:9091/-/healthy
curl -s -o /dev/null -w "Loki: %{http_code}\n"       http://localhost:3300/ready
curl -s -o /dev/null -w "Tempo: %{http_code}\n"      http://localhost:4318/ready

All should return 200. If Loki or Tempo return 503, wait 10 more seconds and retry (they have a slower startup than Grafana/Prometheus).
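If you would rather script the wait than retry by hand, a minimal stdlib poller works too. The URLs match the health endpoints above; the timeout/interval values are arbitrary defaults:

```python
import time
import urllib.error
import urllib.request

# Health endpoints from the curl checks above.
ENDPOINTS = {
    "Grafana":    "http://localhost:3000/api/health",
    "Prometheus": "http://localhost:9091/-/healthy",
    "Loki":       "http://localhost:3300/ready",
    "Tempo":      "http://localhost:4318/ready",
}

def wait_for(url, timeout=60.0, interval=5.0):
    """Poll `url` until it returns HTTP 200; False if `timeout` elapses first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet; keep polling
        time.sleep(interval)
    return False

# Usage (with the stack running):
# for name, url in ENDPOINTS.items():
#     print(name, "ready" if wait_for(url) else "NOT ready")
```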

Step 4 — Install Python deps for the app

pip install \
  "prometheus-fastapi-instrumentator>=7.0.0" \
  "opentelemetry-sdk>=1.25.0" \
  "opentelemetry-exporter-otlp-proto-grpc>=1.25.0" \
  "opentelemetry-instrumentation-fastapi>=0.46b0" \
  "python-json-logger>=2.0.7"

Step 5 — Instrument the FastAPI app

Add to the app's app.py (or main.py), just after app = FastAPI(...):

import sys
sys.path.insert(0, "path/to/projects/observability/lib")
from observability import setup_observability
logger = setup_observability(app, service_name="my-service-name")

That's it. The app now:

  • Exposes /metrics for Prometheus
  • Writes JSON logs to projects/observability/logs/my-service-name/app.log
  • Sends traces to Tempo on localhost:4317

Step 6 — Register with Prometheus

cd projects/observability
./scripts/register_app.sh my-service-name <port>
# e.g.: ./scripts/register_app.sh image-gen-studio 7860
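register_app.sh is the source of truth for what actually gets written. If it registers targets via Prometheus file-based service discovery (an assumption here, not confirmed by this page), the entry would follow the standard file_sd shape, with the hostname depending on how Prometheus reaches the host from inside Docker:

```json
[
  {
    "targets": ["host.docker.internal:7860"],
    "labels": { "service": "image-gen-studio" }
  }
]
```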

Prometheus hot-reloads the target within 30 seconds. Verify:

curl -s "http://localhost:9091/api/v1/targets" | python3 -c "
import json, sys
data = json.load(sys.stdin)
for t in data['data']['activeTargets']:
    svc = t['labels'].get('service', '')
    print(svc, '->', t['health'])
"

Step 7 — Open Grafana

Open http://localhost:3000

The FastAPI — App Overview dashboard is pre-loaded. Select your service from the dropdown at the top. You'll see:

  • Request rate (req/s)
  • Error rate (%)
  • Latency p50/p95/p99
  • Requests by endpoint
  • HTTP status codes
  • Live log panel (Loki)

To jump from a log line to its trace: click the trace_id link in the log detail panel. It opens the full trace in Tempo automatically (datasource pre-wired).
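The panels above map to PromQL along these lines. The metric names assume the prometheus-fastapi-instrumentator defaults and a service label attached at scrape time; check what your /metrics endpoint actually exposes and adjust accordingly:

```promql
# Request rate (req/s), by endpoint
sum by (handler) (rate(http_requests_total{service="my-service-name"}[5m]))

# Error rate (%): share of 5xx responses
100 * sum(rate(http_requests_total{service="my-service-name", status=~"5.."}[5m]))
    / sum(rate(http_requests_total{service="my-service-name"}[5m]))

# p95 latency
histogram_quantile(0.95,
  sum by (le) (rate(http_request_duration_seconds_bucket{service="my-service-name"}[5m])))
```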

Step 8 — Import additional dashboards (optional)

In Grafana → Dashboards → Import:

  • 16110 — FastAPI Observability (richer alternative to the built-in)
  • 13407 — Loki Logs Overview
  • 16112 — Tempo Service Graph (service dependency map)

Useful commands

# Reload Prometheus config after registering a new app:
curl -s -X POST http://localhost:9091/-/reload

# Restart a single service without losing data:
docker compose -f projects/observability/docker-compose.yml restart grafana

# Stop everything (data volumes preserved):
docker compose -f projects/observability/docker-compose.yml down

# Nuclear reset (wipes all stored data):
docker compose -f projects/observability/docker-compose.yml down -v

# Check Alloy log shipping status (macOS; on Linux use xdg-open or a browser):
open http://localhost:12345

Manual tracing (optional)

# Reuse the same sys.path setup as in Step 5, then:
from observability import get_tracer

tracer = get_tracer(__name__)

@app.get("/expensive-endpoint")
async def handler():
    # Wrap the hot section in a named span; attributes show up in Tempo
    with tracer.start_as_current_span("db-query") as span:
        span.set_attribute("db.table", "users")
        result = await db.query(...)  # `db` stands in for your data layer
    return result

Log/trace correlation

The OTel instrumentation injects trace_id into every log record. Grafana Loki is pre-configured with a derived field that turns "trace_id":"abc123" into a clickable link to the Tempo trace.
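For reference, that pre-wired derived field in the Loki datasource provisioning looks roughly like this. The names and the Tempo datasource UID below are assumptions; config/grafana/provisioning/datasources/datasources.yml holds the real values:

```yaml
datasources:
  - name: Loki
    type: loki
    url: http://loki:3100
    jsonData:
      derivedFields:
        - name: trace_id
          matcherRegex: '"trace_id":"(\w+)"'
          # ${__value.raw} expands to the captured trace id
          url: '$${__value.raw}'
          datasourceUid: tempo
          urlDisplayLabel: View trace
```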

To manually include trace context in your own log calls:

from opentelemetry import trace

def trace_ctx() -> dict:
    ctx = trace.get_current_span().get_span_context()
    return {"trace_id": format(ctx.trace_id, "032x")} if ctx.is_valid else {}

logger.info("Processing request", extra=trace_ctx())

Notes

  • Logs are written to projects/observability/logs/<service>/app.log as JSON. Alloy tails these files and ships to Loki — no code changes needed beyond setup_observability().
  • All observability is local — no data leaves the machine.
  • data_classification: LOCAL_ONLY is the default for all traces/logs.
  • The Alloy config drops DEBUG-level logs by default. Edit config/alloy/config.alloy to remove the stage.drop block if you need debug logs.
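The drop stage mentioned in the last note sits inside a loki.process component. A sketch of the relevant block follows; the component labels are assumptions and the shipped config/alloy/config.alloy is authoritative:

```alloy
loki.process "apps" {
  // Pull the level field out of the JSON log line.
  stage.json {
    expressions = { level = "level" }
  }

  // Delete this block to keep DEBUG logs.
  stage.drop {
    source = "level"
    value  = "DEBUG"
  }

  forward_to = [loki.write.default.receiver]
}
```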

