observability-control

Manage observability stack lifecycle (start, stop, backup, restore, upgrade). Use when controlling the LGTM stack for Claude Code monitoring.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "observability-control" with this command: npx skills add adaptationio/skrillz/adaptationio-skrillz-observability-control

Observability Control

Manage the lifecycle of the observability stack for Claude Code telemetry.

Stack Locations

| Environment | Docker Compose Path |
|---|---|
| Primary Stack | /mnt/c/data/github/botaniqal-medtech/botaniqal-medtech/observability/docker-compose.yml |
| Skill-based Stack | /mnt/c/data/github/.observability/docker-compose.yml |
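
The two locations above can be wrapped in a small helper so later commands do not repeat the full path. This is a minimal sketch: the function name `compose_file` and the keywords `primary` and `skill` are hypothetical and not part of the skill's own scripts.

```shell
# Hypothetical helper: map a stack keyword to its docker-compose.yml path.
# Defaults to the primary stack when no argument is given.
compose_file() {
  case "${1:-primary}" in
    primary) echo "/mnt/c/data/github/botaniqal-medtech/botaniqal-medtech/observability/docker-compose.yml" ;;
    skill)   echo "/mnt/c/data/github/.observability/docker-compose.yml" ;;
    *)       echo "unknown stack: $1" >&2; return 1 ;;
  esac
}

# Example: docker compose -f "$(compose_file skill)" up -d
```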

Components

| Service | Port | Purpose |
|---|---|---|
| Grafana | 3000 | Dashboards and visualization |
| Prometheus | 9090 | Metrics storage |
| Loki | 3100 | Log aggregation |
| Tempo | 3200 | Distributed tracing |
| OTEL Collector | 4317/4318 | Telemetry receiver |
| Promtail | - | Log shipping |

Operations

start

Start the observability stack in the background.

docker compose -f /mnt/c/data/github/botaniqal-medtech/botaniqal-medtech/observability/docker-compose.yml up -d

stop

Stop the stack gracefully. Containers are removed but named volumes are kept, so stored data is preserved.

docker compose -f /mnt/c/data/github/botaniqal-medtech/botaniqal-medtech/observability/docker-compose.yml down

restart [service]

Restart a specific service or all services.

# Restart all
docker compose -f /path/docker-compose.yml restart

# Restart specific
docker restart loki

status

Check the container status of all components.

docker ps --format "table {{.Names}}\t{{.Status}}" | grep -E "(otel|loki|grafana|prometheus|tempo)"

Output: Running services, health status.
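
The grep above only shows containers that exist; a service that never started simply does not appear. One way to surface that is to compare the running names against an expected list. This is a sketch under assumptions: `missing_services` is a hypothetical name, and the container names `prometheus` and `promtail` are guesses (the others appear in commands elsewhere in this document).

```shell
# Hypothetical helper: read container names on stdin (one per line)
# and print every expected name that is not among them.
# The expected names are assumptions; adjust them to your docker-compose.yml.
missing_services() {
  local running name
  running="$(cat)"
  for name in grafana prometheus loki tempo otel-collector promtail; do
    # -x matches whole lines, -F treats the name as a literal string
    printf '%s\n' "$running" | grep -xF "$name" >/dev/null || echo "$name"
  done
}

# Example: docker ps --format '{{.Names}}' | missing_services
```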

health

Verify service endpoints.

curl -s http://localhost:3000/api/health  # Grafana
curl -s http://localhost:9090/-/healthy   # Prometheus
curl -s http://localhost:3100/ready       # Loki
curl -s http://localhost:3200/ready       # Tempo
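
The four probes above can be combined into one pass that prints a line per service and returns a non-zero status if any probe fails. This is a minimal sketch; `check_health` is a hypothetical name, and whether scripts/health-check.sh works this way is not confirmed.

```shell
# Probe each health endpoint listed above and print one line per service.
# Returns 1 if any endpoint is unreachable or answers with an HTTP error.
check_health() {
  local failed=0 entry name url
  for entry in \
    "grafana=http://localhost:3000/api/health" \
    "prometheus=http://localhost:9090/-/healthy" \
    "loki=http://localhost:3100/ready" \
    "tempo=http://localhost:3200/ready"; do
    name="${entry%%=*}"
    url="${entry#*=}"
    # -f: non-zero exit on HTTP errors; --max-time: do not hang on a stuck service
    if curl -sf --max-time 5 -o /dev/null "$url"; then
      echo "ok   $name"
    else
      echo "FAIL $name"
      failed=1
    fi
  done
  return "$failed"
}

# Example: check_health || echo "at least one service is not ready"
```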

backup

Export dashboards and configurations.

# Backup dashboards (one JSON file per dashboard UID; folders are skipped)
mkdir -p backup
curl -s "http://localhost:3000/api/search?type=dash-db" -u admin:admin | \
  jq -r '.[].uid' | \
  xargs -I {} sh -c 'curl -s http://localhost:3000/api/dashboards/uid/{} -u admin:admin > backup/{}.json'

Output: .observability/backups/YYYYMMDD_HHMMSS/
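
The output path above implies one timestamped directory per backup run. A sketch of creating such a directory follows; `new_backup_dir` and its optional base-directory argument are hypothetical, and this is not necessarily how scripts/backup-dashboards.sh does it.

```shell
# Hypothetical helper: create a backup directory named YYYYMMDD_HHMMSS
# under the given base (default: .observability/backups) and print its path.
new_backup_dir() {
  local base dir
  base="${1:-.observability/backups}"
  dir="$base/$(date +%Y%m%d_%H%M%S)"
  mkdir -p "$dir" && echo "$dir"
}

# Example: dir="$(new_backup_dir)" and then write each exported dashboard into "$dir"
```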

restore <backup-path>

Restore from backup.

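# Wrap each exported dashboard in the payload the import API expects, then POST it
jq -c '{dashboard: (.dashboard | .id = null), overwrite: true}' backup/*.json | \
  while IFS= read -r payload; do
    printf '%s' "$payload" | curl -s -X POST http://localhost:3000/api/dashboards/db \
      -H "Content-Type: application/json" -u admin:admin -d @-
  done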

logs [service]

View logs from stack components.

docker logs loki --tail 100
docker logs otel-collector --tail 100
docker logs grafana --tail 100

fix-permissions

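Fix volume permission issues (common with Tempo). Recreating the volume deletes all stored traces.

# Remove the container first; Docker refuses to delete a volume that a container still references
docker rm -f tempo
docker volume rm observability_tempo-data
docker volume create observability_tempo-data
docker run --rm -v observability_tempo-data:/tempo alpine chown -R 10001:10001 /tempo
# Recreate the container (assumes the Compose service is named "tempo")
docker compose -f /path/docker-compose.yml up -d tempo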

Quick Commands

# Check all services status
docker ps | grep -E "(otel|loki|grafana|prometheus|tempo|promtail)"

# View recent logs for issues
docker logs otel-collector --tail 50 2>&1 | grep -i error

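# Test that the OTLP gRPC port is reachable (curl cannot speak gRPC, so expect a protocol error rather than "connection refused")
curl -v http://localhost:4317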

# Query Loki for recent data
curl -s "http://localhost:3100/loki/api/v1/labels"

# List Grafana dashboards
curl -s http://localhost:3000/api/search -u admin:admin | python3 -c "import sys,json; [print(d['title']) for d in json.load(sys.stdin)]"

Troubleshooting

OTEL Collector Unhealthy

docker logs otel-collector --tail 30
# Common fix: Ensure Prometheus has --web.enable-remote-write-receiver

Loki Unhealthy

docker logs loki --tail 30
# Common fix: Disable frontend_worker for single-node mode

Tempo Permission Denied

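# Fix volume permissions (this deletes all stored traces)
# Remove the container first; Docker refuses to delete a volume that a container still references
docker rm -f tempo
docker volume rm observability_tempo-data
docker volume create observability_tempo-data
docker run --rm -v observability_tempo-data:/tempo alpine chown -R 10001:10001 /tempo
# Recreate the container (assumes the Compose service is named "tempo")
docker compose -f /path/docker-compose.yml up -d tempo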

No Data in Grafana

  1. Check telemetry env vars: env | grep OTEL
  2. Check hooks configured: cat .claude/settings.json
  3. Verify Loki receiving: curl "http://localhost:3100/loki/api/v1/labels"
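
Step 1 can be made explicit by checking for specific variables instead of reading the grep output by eye. A minimal sketch follows. `require_env` is a hypothetical name. `OTEL_EXPORTER_OTLP_ENDPOINT` is a standard OpenTelemetry variable, but whether this stack requires it is an assumption.

```shell
# Hypothetical helper: report which of the named environment variables
# are unset or empty. Returns 1 if any is missing.
require_env() {
  local missing=0 var value
  for var in "$@"; do
    # printenv only sees exported variables, matching what `env | grep OTEL` shows
    value="$(printenv "$var" || true)"
    if [ -z "$value" ]; then
      echo "missing: $var"
      missing=1
    fi
  done
  return "$missing"
}

# Example (variable name is an assumption): require_env OTEL_EXPORTER_OTLP_ENDPOINT
```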

Access Points

| Service | URL | Credentials |
|---|---|---|
| Grafana | http://localhost:3000 | admin/admin |
| Prometheus | http://localhost:9090 | - |
| Loki | http://localhost:3100 | - |
| OTLP gRPC | localhost:4317 | - |
| OTLP HTTP | localhost:4318 | - |

Scripts

  • scripts/start-stack.sh - Start observability stack
  • scripts/stop-stack.sh - Stop stack gracefully
  • scripts/health-check.sh - Check all service health
  • scripts/backup-dashboards.sh - Export Grafana dashboards
  • scripts/restore-dashboards.sh - Import dashboards from backup
