# Prometheus Monitoring

## Table of Contents

- [Overview](#overview)
- [When to Use](#when-to-use)
- [Quick Start](#quick-start)
- [Reference Guides](#reference-guides)
- [Best Practices](#best-practices)

## Overview
Implement comprehensive Prometheus monitoring infrastructure for collecting, storing, and querying time-series metrics from applications and infrastructure.
## When to Use
- Setting up metrics collection
- Creating custom application metrics
- Configuring scraping targets
- Implementing service discovery
- Building monitoring infrastructure
## Quick Start
Minimal working example:
```yaml
# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    cluster: production

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

rule_files:
  - "/etc/prometheus/alert_rules.yml"

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]

  - job_name: "api-service"
    # ... (see reference guides for full implementation)
```
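You can sanity-check the file before (re)loading it with `promtool check config /etc/prometheus/prometheus.yml`.

To show what a target like the `api-service` job might actually expose, here is a minimal sketch using the official Python `prometheus_client` library. The metric names, labels, and port 8000 are illustrative assumptions, not part of the configuration above.

```python
# minimal_metrics.py -- illustrative sketch; metric names and port 8000 are assumptions
from prometheus_client import Counter, Histogram, start_http_server
import random
import time

# Counter: a monotonically increasing total, suffixed _total per naming conventions
REQUESTS = Counter(
    "api_requests_total",
    "Total number of API requests handled",
    ["method", "status"],  # keep label values bounded (no user IDs, no raw URLs)
)

# Histogram: request latency in seconds, using the base unit per conventions
LATENCY = Histogram(
    "api_request_duration_seconds",
    "API request latency in seconds",
)

def handle_request():
    with LATENCY.time():                        # observe how long the work takes
        time.sleep(random.uniform(0.01, 0.1))   # stand-in for real work
    REQUESTS.labels(method="GET", status="200").inc()

if __name__ == "__main__":
    start_http_server(8000)  # exposes /metrics for Prometheus to scrape
    while True:
        handle_request()
```

With this running, pointing the `api-service` job's `targets` at `localhost:8000` lets Prometheus scrape the `/metrics` endpoint it exposes.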
## Reference Guides
Detailed implementations in the references/ directory:
- Prometheus Configuration
- Node.js Metrics Implementation
- Python Prometheus Integration
- Alert Rules
- Docker Compose Setup
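To try the Quick Start configuration locally before diving into the guides, a Compose file along these lines is enough to bring Prometheus up; the image tag, retention value, and volume name are assumptions, and the Docker Compose Setup guide covers the full stack:

```yaml
# docker-compose.yml -- minimal sketch; image tag and volume name are assumptions
services:
  prometheus:
    image: prom/prometheus:v2.53.0
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.retention.time=15d"   # example retention policy
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus            # default data directory in the image

volumes:
  prometheus-data:
```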
## Best Practices

### ✅ DO
- Use consistent metric naming conventions
- Add comprehensive labels for filtering
- Set appropriate scrape intervals (10-60s)
- Implement retention policies
- Monitor Prometheus itself
- Test alert rules before deployment (see the rule file sketch after this list)
- Document metric meanings
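As an example of a rule file to test, here is a minimal sketch matching the `rule_files` path from the Quick Start; the threshold, labels, and runbook URL are illustrative assumptions:

```yaml
# /etc/prometheus/alert_rules.yml -- illustrative sketch; thresholds are assumptions
groups:
  - name: example
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m  # require sustained failure before firing
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          runbook_url: "https://example.com/runbooks/instance-down"  # hypothetical runbook link
```

Validate it with `promtool check rules /etc/prometheus/alert_rules.yml` before rolling it out.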
### ❌ DON'T
- Add unbounded cardinality labels
- Scrape too frequently (< 10s)
- Ignore metric naming conventions
- Create alerts without runbooks
- Store raw event data in Prometheus
- Use counters for gauge-like values (see the sketch below)
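To make the last point concrete: a counter is cumulative and only ever increases, while a gauge can move in both directions. A minimal sketch with `prometheus_client` (metric names are assumptions):

```python
# counter_vs_gauge.py -- illustrative sketch; metric names are assumptions
from prometheus_client import Counter, Gauge

# Counter: cumulative and monotonically increasing; read it with rate() in PromQL
JOBS_PROCESSED = Counter("jobs_processed_total", "Jobs processed since start")

# Gauge: a value that rises and falls; wrong as a Counter, right as a Gauge
JOBS_IN_FLIGHT = Gauge("jobs_in_flight", "Jobs currently being processed")

def process_job(job):
    JOBS_IN_FLIGHT.inc()       # gauge goes up when work starts...
    try:
        job()
    finally:
        JOBS_IN_FLIGHT.dec()   # ...and back down when it finishes
        JOBS_PROCESSED.inc()   # counter only ever increases
```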