monitoring-expert

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "monitoring-expert" with this command: npx skills add paulund/skills/paulund-skills-monitoring-expert

Monitoring Expert

Core Workflow

  • Analysis: Understand the monitoring requirements for the application or infrastructure.

  • Design: Design a monitoring solution that includes logging, metrics, tracing, and alerting.

  • Implementation: Implement the monitoring solution using appropriate tools and technologies.

  • Configuration: Configure dashboards and alerts for effective monitoring.

  • Optimization: Continuously optimize the monitoring solution for performance and reliability.

  • Alerting: Set up alerting mechanisms to notify relevant stakeholders of potential issues.

Reference Guide

Load the detailed guidance based on context:

| Topic | Reference | Load When |
|---|---|---|
| Alerting Rules | references/alerting-rules.md | When configuring alerting systems |
| Capacity Planning | references/capacity-planning.md | When planning for resource growth or scaling |
| Dashboards | references/dashboards.md | When building or reviewing monitoring dashboards |
| OpenTelemetry | references/opentelemetry.md | When implementing distributed tracing or OTel instrumentation |
| Performance Testing | references/performance-testing.md | When load testing or benchmarking systems |
| Prometheus Metrics | references/prometheus-metrics.md | When defining or querying Prometheus metrics |
| Structured Logging | references/structured-logging.md | When implementing application logging |

Constraints

MUST DO

  • Use structured JSON logging so that logs can be parsed, filtered, and aggregated by machine.

  • Include request IDs in logs for traceability.

  • Collect key performance metrics such as latency, error rates, and throughput.

  • Set up alerts for critical paths.

  • Use appropriate metrics aggregation methods (e.g., rate, histogram) based on the metric type.

  • Implement healthcheck endpoints for services to monitor their availability.
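The first two items above (structured JSON logging and request IDs) can be sketched with the Python standard library alone. This is a minimal illustration, not the skill's prescribed implementation: the names `JsonFormatter`, `request_id_var`, `build_logger`, and `handle_request` are hypothetical, and the field set in the JSON payload is an assumption.

```python
import contextvars
import io
import json
import logging
import uuid

# Hypothetical context variable holding the ID of the request being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Every line carries the current request ID for traceability.
            "request_id": request_id_var.get(),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    """Create a logger named "app" that writes JSON lines to the given stream."""
    logger = logging.getLogger("app")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger) -> str:
    """Simulate one request: assign an ID, log twice, then restore the context."""
    request_id = str(uuid.uuid4())
    token = request_id_var.set(request_id)
    try:
        logger.info("request started")
        logger.info("request finished")
    finally:
        request_id_var.reset(token)
    return request_id
```

Storing the ID in a `contextvars.ContextVar` means call sites do not need to pass it explicitly, and the value stays isolated per thread or asyncio task. In a real service the ID would more likely come from an incoming header than from `uuid.uuid4()`.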

MUST NOT DO

  • Log sensitive information such as passwords or personal data.

  • Set up alerts for non-critical issues; excess alerts lead to alert fatigue.

  • Use default configurations without customizing them for the specific application or infrastructure.

  • Ignore monitoring data when troubleshooting issues.

  • Over-instrument code to the point that it adds performance overhead.
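One way to keep sensitive values out of logs is to mask them before an event reaches the logger. The sketch below assumes log events are plain dictionaries and lists. The `SENSITIVE_KEYS` deny-list and the `redact` helper are hypothetical, so a real service would tailor the key set to its own data.

```python
from typing import Any

# Hypothetical deny-list of field names whose values must never be logged.
SENSITIVE_KEYS = {"password", "token", "authorization", "email", "ssn"}
REDACTED = "[REDACTED]"


def redact(value: Any) -> Any:
    """Return a copy of value with sensitive dict keys masked, recursively."""
    if isinstance(value, dict):
        return {
            key: REDACTED if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    # Scalars (strings, numbers, None) are returned unchanged.
    return value
```

Calling `redact(event)` just before serializing a log event leaves the original object untouched and masks matching keys at any nesting depth. Key-based masking does not catch secrets embedded in free-text messages, so it complements careful message construction and does not replace it.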
