observability


Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "observability" with this command: npx skills add dralgorhythm/claude-agentic-framework/dralgorhythm-claude-agentic-framework-observability

Observability

Three Pillars

  1. Logs

Discrete events with context.

{
  "timestamp": "2024-01-01T12:00:00Z",
  "level": "error",
  "message": "Failed to process order",
  "orderId": "123",
  "error": "Payment declined",
  "traceId": "abc123"
}
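A minimal sketch of how such an entry could be produced, assuming a hypothetical `formatLogEntry` helper (not part of any logging library). The context fields mirror the sample above. Note that `toISOString()` includes milliseconds, so the timestamp reads `2024-01-01T12:00:00.000Z`.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

// Free-form context attached to an event (orderId, traceId, ...).
interface LogContext {
  [key: string]: string | number | boolean;
}

// Hypothetical helper: builds one JSON log line.
// Fixed fields are written first, then the context is copied in, so a
// context key named like a fixed field would overwrite it.
function formatLogEntry(
  level: LogLevel,
  message: string,
  context: LogContext,
  now: Date
): string {
  const entry: { [key: string]: string | number | boolean } = {
    timestamp: now.toISOString(),
    level: level,
    message: message,
  };
  for (const key of Object.keys(context)) {
    entry[key] = context[key];
  }
  return JSON.stringify(entry);
}

const logLine = formatLogEntry(
  "error",
  "Failed to process order",
  { orderId: "123", error: "Payment declined", traceId: "abc123" },
  new Date("2024-01-01T12:00:00Z")
);
```

Carrying `traceId` in every entry is what lets a log line be joined to the trace it belongs to.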

  2. Metrics

Numeric measurements over time.

http_requests_total{method="GET", status="200"} 1234
http_request_duration_seconds{quantile="0.95"} 0.23
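To show where lines like these come from, here is a small sketch of an in-memory counter that renders its series in the same text style. `RequestCounter` and `formatLabels` are hypothetical names for illustration; a real service would normally use a metrics client library instead.

```typescript
type Labels = { [labelName: string]: string };

// Escapes backslash, double quote and newline inside a label value.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Renders a label set as {a="1",b="2"} with names sorted, or "" when empty.
function formatLabels(labels: Labels): string {
  const names = Object.keys(labels).sort();
  if (names.length === 0) {
    return "";
  }
  const pairs = names.map((n) => n + '="' + escapeLabelValue(labels[n]) + '"');
  return "{" + pairs.join(",") + "}";
}

// Hypothetical counter: one increasing value per distinct label set.
class RequestCounter {
  private metricName: string;
  private series: { [labelText: string]: number } = {};

  constructor(metricName: string) {
    this.metricName = metricName;
  }

  inc(labels: Labels): void {
    const key = formatLabels(labels);
    this.series[key] = (this.series[key] || 0) + 1;
  }

  // One text line per series, sorted by label text.
  render(): string[] {
    return Object.keys(this.series)
      .sort()
      .map((key) => this.metricName + key + " " + this.series[key]);
  }
}

const requestCounter = new RequestCounter("http_requests_total");
requestCounter.inc({ method: "GET", status: "200" });
requestCounter.inc({ status: "200", method: "GET" }); // same series, different key order
requestCounter.inc({ method: "POST", status: "500" });
const renderedSeries = requestCounter.render();
```

Sorting label names before building the key means the same label set always maps to the same series, regardless of the order the caller passes them in.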

  3. Traces

Request flow through services.

Trace: abc123
├── API Gateway (50ms)
│   ├── Auth Service (10ms)
│   └── Order Service (35ms)
│       └── Database (20ms)
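One thing a trace like this makes visible is where the time is actually spent. The sketch below computes each span's self time (its duration minus its direct children) for the example trace. `SpanRecord` and the span ids are hypothetical, and the calculation assumes child spans run one after another; overlapping children would need real start and end timestamps.

```typescript
interface SpanRecord {
  spanId: string;
  parentId: string | null; // null marks the root span
  service: string;
  durationMs: number;
}

// Self time = span duration minus the durations of its direct children.
function selfTimes(spans: SpanRecord[]): { [spanId: string]: number } {
  const result: { [spanId: string]: number } = {};
  for (const span of spans) {
    result[span.spanId] = span.durationMs;
  }
  for (const span of spans) {
    // Spans whose parent is missing from the list are left untouched.
    if (span.parentId !== null && result[span.parentId] !== undefined) {
      result[span.parentId] -= span.durationMs;
    }
  }
  return result;
}

// The example trace abc123, flattened into parent/child records.
const traceAbc123: SpanRecord[] = [
  { spanId: "gateway", parentId: null, service: "API Gateway", durationMs: 50 },
  { spanId: "auth", parentId: "gateway", service: "Auth Service", durationMs: 10 },
  { spanId: "order", parentId: "gateway", service: "Order Service", durationMs: 35 },
  { spanId: "db", parentId: "order", service: "Database", durationMs: 20 },
];

const selfTimeBySpan = selfTimes(traceAbc123);
// gateway: 5, auth: 10, order: 15, db: 20
```

Here the database call is the largest single contributor (20ms of the 50ms total), which is the kind of conclusion that logs and metrics alone rarely give you.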

OpenTelemetry Setup

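import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

const sdk = new NodeSDK({
  // Export spans over OTLP/HTTP to the collector's traces endpoint.
  traceExporter: new OTLPTraceExporter({
    url: 'http://collector:4318/v1/traces',
  }),
  // Name under which this service's telemetry is reported.
  serviceName: 'my-service',
});

// Start the SDK once, at process startup.
sdk.start();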

Key Metrics

RED Method (Request-focused)

  • Rate: Requests per second

  • Errors: Failed requests per second

  • Duration: Request latency
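As a rough sketch, the three RED numbers can be derived from a window of request records. `RequestRecord` and `summarizeRed` are hypothetical names, and the mean is used for duration only to keep the example short; percentiles are usually more informative.

```typescript
interface RequestRecord {
  durationMs: number;
  failed: boolean;
}

interface RedSummary {
  ratePerSec: number;     // Rate
  errorsPerSec: number;   // Errors
  meanDurationMs: number; // Duration
}

// Summarizes the requests observed during a window of the given length.
function summarizeRed(requests: RequestRecord[], windowSeconds: number): RedSummary {
  if (windowSeconds <= 0) {
    throw new Error("windowSeconds must be positive");
  }
  let failedCount = 0;
  let totalDurationMs = 0;
  for (const r of requests) {
    if (r.failed) {
      failedCount += 1;
    }
    totalDurationMs += r.durationMs;
  }
  return {
    ratePerSec: requests.length / windowSeconds,
    errorsPerSec: failedCount / windowSeconds,
    meanDurationMs: requests.length === 0 ? 0 : totalDurationMs / requests.length,
  };
}

// Four requests seen in a 2-second window, one of them failed.
const redSummary = summarizeRed(
  [
    { durationMs: 100, failed: false },
    { durationMs: 200, failed: false },
    { durationMs: 300, failed: true },
    { durationMs: 400, failed: false },
  ],
  2
);
// ratePerSec: 2, errorsPerSec: 0.5, meanDurationMs: 250
```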

USE Method (Resource-focused)

  • Utilization: Percentage of time the resource is busy

  • Saturation: Queue depth

  • Errors: Error count
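The USE numbers can be sketched the same way from periodic samples of a single resource. `ResourceSample` and `summarizeUse` are hypothetical, and reporting saturation as the maximum queue depth is one choice among several (a mean or a percentile would also be reasonable).

```typescript
interface ResourceSample {
  busy: boolean;      // was the resource doing work when sampled?
  queueDepth: number; // work items waiting at sample time
  errorCount: number; // errors observed since the previous sample
}

interface UseSummary {
  utilizationPercent: number; // Utilization
  maxQueueDepth: number;      // Saturation
  totalErrors: number;        // Errors
}

function summarizeUse(samples: ResourceSample[]): UseSummary {
  if (samples.length === 0) {
    throw new Error("at least one sample is required");
  }
  let busyCount = 0;
  let maxQueueDepth = 0;
  let totalErrors = 0;
  for (const s of samples) {
    if (s.busy) {
      busyCount += 1;
    }
    if (s.queueDepth > maxQueueDepth) {
      maxQueueDepth = s.queueDepth;
    }
    totalErrors += s.errorCount;
  }
  return {
    utilizationPercent: (busyCount / samples.length) * 100,
    maxQueueDepth: maxQueueDepth,
    totalErrors: totalErrors,
  };
}

const useSummary = summarizeUse([
  { busy: true, queueDepth: 0, errorCount: 0 },
  { busy: true, queueDepth: 2, errorCount: 0 },
  { busy: true, queueDepth: 5, errorCount: 1 },
  { busy: false, queueDepth: 1, errorCount: 0 },
]);
// utilizationPercent: 75, maxQueueDepth: 5, totalErrors: 1
```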

Alerting

Good Alerts

  • Actionable: Something can be done

  • Urgent: Needs immediate attention

  • Specific: Clear what's wrong

Alert Template

alert: HighErrorRate
expr: |
  sum by (service) (rate(http_requests_total{status=~"5.."}[5m]))
    /
  sum by (service) (rate(http_requests_total[5m]))
    > 0.01
for: 5m
labels:
  severity: critical
annotations:
  summary: "High error rate on {{ $labels.service }}"
  description: "Error rate is {{ $value | humanizePercentage }}"
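The `for` clause is what keeps a brief spike from paging anyone: the condition has to hold continuously for the whole duration before the alert fires. The sketch below is a simplified model of that behaviour, not Prometheus's actual implementation; `Evaluation` and `alertStateAfter` are hypothetical names.

```typescript
interface Evaluation {
  timeSec: number; // when the expression was evaluated
  value: number;   // value of the alert expression at that time
}

type AlertState = "inactive" | "pending" | "firing";

// Replays evaluations in time order and returns the state after the last one.
// The alert is pending while the condition holds, and firing once it has
// held without interruption for at least forSeconds.
function alertStateAfter(
  evaluations: Evaluation[],
  threshold: number,
  forSeconds: number
): AlertState {
  let breachStart: number | null = null;
  let state: AlertState = "inactive";
  for (const e of evaluations) {
    if (e.value > threshold) {
      if (breachStart === null) {
        breachStart = e.timeSec;
      }
      state = e.timeSec - breachStart >= forSeconds ? "firing" : "pending";
    } else {
      // Any evaluation below the threshold resets the timer.
      breachStart = null;
      state = "inactive";
    }
  }
  return state;
}

// Evaluated once a minute with a 0.01 threshold and a 5-minute hold.
const steadyBreach: Evaluation[] = [0, 60, 120, 180, 240, 300].map((t) => ({
  timeSec: t,
  value: 0.02,
}));
const briefRecovery: Evaluation[] = steadyBreach.map((e) =>
  e.timeSec === 120 ? { timeSec: e.timeSec, value: 0.001 } : e
);

const steadyState = alertStateAfter(steadyBreach, 0.01, 300);   // "firing"
const flappingState = alertStateAfter(briefRecovery, 0.01, 300); // "pending"
```

In the second series the single recovery at 120s restarts the clock, so after 300s the condition has only held for 120s and the alert is still pending.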

Dashboards

Essential panels:

  • Request rate

  • Error rate

  • Latency (P50, P95, P99)

  • Saturation (CPU, memory)

  • Active alerts
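For the latency panel, P50, P95 and P99 can be sketched with a nearest-rank percentile over raw samples. This is an illustration only: monitoring backends usually estimate percentiles from histogram buckets rather than sorting every sample, and the `percentile` helper and sample data here are made up.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error("no samples");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in (0, 100]");
  }
  // Sort a copy numerically so the caller's array is left untouched.
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Twenty made-up latency samples: 10ms, 20ms, ..., 200ms.
const latenciesMs: number[] = [];
for (let i = 1; i <= 20; i++) {
  latenciesMs.push(i * 10);
}

const p50 = percentile(latenciesMs, 50); // 100
const p95 = percentile(latenciesMs, 95); // 190
const p99 = percentile(latenciesMs, 99); // 200
```

Showing all three together matters because a healthy P50 can hide a slow tail that only appears at P95 or P99.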

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
