ELK Stack

Centralize and analyze logs with Elasticsearch, Logstash, and Kibana.

When to Use This Skill

Use this skill when:

Centralizing logs from multiple sources
Building log search and analytics platforms
Creating log-based dashboards and alerts
Implementing full-text search for logs
Processing and transforming log data

Prerequisites

Docker or server infrastructure
Sufficient disk space for log storage
Network access from log sources

Docker Deployment

    # docker-compose.yml
    version: '3.8'

    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
        environment:
          - discovery.type=single-node
          - xpack.security.enabled=false
          - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
        ports:
          - "9200:9200"
        volumes:
          - elasticsearch-data:/usr/share/elasticsearch/data

      logstash:
        image: docker.elastic.co/logstash/logstash:8.11.0
        volumes:
          - ./logstash/pipeline:/usr/share/logstash/pipeline
          - ./logstash/config:/usr/share/logstash/config
        ports:
          - "5044:5044"
          - "5000:5000"
        depends_on:
          - elasticsearch

      kibana:
        image: docker.elastic.co/kibana/kibana:8.11.0
        ports:
          - "5601:5601"
        environment:
          - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
        depends_on:
          - elasticsearch

      filebeat:
        image: docker.elastic.co/beats/filebeat:8.11.0
        user: root
        volumes:
          - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
          - /var/lib/docker/containers:/var/lib/docker/containers:ro
          - /var/run/docker.sock:/var/run/docker.sock:ro
        depends_on:
          - logstash

    volumes:
      elasticsearch-data:
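
After the stack starts, it can help to confirm that Elasticsearch is answering before sending logs to it. The following is a minimal sketch, assuming Elasticsearch is published on localhost:9200 as in the compose file above; the function names `fetch_health` and `is_ready` are illustrative and not part of any Elastic client.

```python
import json
import urllib.request


def fetch_health(base_url: str = "http://localhost:9200", timeout: float = 5.0) -> dict:
    """GET /_cluster/health and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def is_ready(health: dict) -> bool:
    """Treat yellow as ready: a single node cannot allocate replica shards."""
    return health.get("status") in {"yellow", "green"}
```

Once the containers are up, `is_ready(fetch_health())` should return True. A yellow status is expected on a single-node cluster whenever an index asks for replicas.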

Elasticsearch Configuration

Index Templates

    PUT _index_template/logs-template
    {
      "index_patterns": ["logs-*"],
      "template": {
        "settings": {
          "number_of_shards": 1,
          "number_of_replicas": 1,
          "index.lifecycle.name": "logs-policy"
        },
        "mappings": {
          "properties": {
            "@timestamp": { "type": "date" },
            "message": { "type": "text" },
            "level": { "type": "keyword" },
            "service": { "type": "keyword" },
            "host": { "type": "keyword" },
            "trace_id": { "type": "keyword" }
          }
        }
      }
    }

Index Lifecycle Management

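    PUT _ilm/policy/logs-policy
    {
      "policy": {
        "phases": {
          "hot": {
            "min_age": "0ms",
            "actions": {
              "rollover": { "max_size": "50GB", "max_age": "1d" }
            }
          },
          "warm": {
            "min_age": "7d",
            "actions": {
              "shrink": { "number_of_shards": 1 },
              "forcemerge": { "max_num_segments": 1 }
            }
          },
          "cold": {
            "min_age": "30d",
            "actions": {
              "readonly": {}
            }
          },
          "delete": {
            "min_age": "90d",
            "actions": {
              "delete": {}
            }
          }
        }
      }
    }

The cold phase uses readonly rather than freeze, because the freeze action is a no-op in Elasticsearch 8.x.

The rollover action only works on indices written through a data stream or a write alias (index.lifecycle.rollover_alias). Date-named indices such as logs-2024.01.15 that Logstash creates directly have neither, so ILM reports an error at the rollover step and the index stays in the hot phase. Either write to a data stream or remove the rollover action and rely on the daily index names.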
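
To make the phase timings concrete, here is a small sketch that maps an index age to the phase this policy would place it in. It only mirrors the min_age thresholds from logs-policy; the names `PHASE_MIN_AGE_DAYS` and `phase_for_age` are made up for illustration, and real ILM measures age from rollover time for rolled-over indices and moves indices on its own polling schedule.

```python
# Thresholds copied from the min_age values in logs-policy, in days,
# ordered from the latest phase to the earliest.
PHASE_MIN_AGE_DAYS = [("delete", 90), ("cold", 30), ("warm", 7), ("hot", 0)]


def phase_for_age(age_days: float) -> str:
    """Return the ILM phase an index of the given age should have reached."""
    for phase, min_age in PHASE_MIN_AGE_DAYS:
        if age_days >= min_age:
            return phase
    raise ValueError("age_days must be non-negative")
```

For example, an index that is 10 days old falls in the warm phase, and one that is 90 days old is due for deletion.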

Logstash Pipeline

Basic Pipeline

    # logstash/pipeline/main.conf
    input {
      beats {
        port => 5044
      }

      tcp {
        port => 5000
        codec => json_lines
      }
    }

    filter {
      # Parse JSON logs
      if [message] =~ /^\{/ {
        json {
          source => "message"
        }
      }

      # Grok pattern for nginx logs. This runs before the date filter so the
      # extracted timestamp can be parsed. Events shipped by Filebeat carry
      # the "nginx" tag, so the tag is checked as well as the type field.
      if [type] == "nginx" or "nginx" in [tags] {
        grok {
          match => {
            "message" => '%{IPORHOST:client_ip} - %{USER:user} \[%{HTTPDATE:timestamp}\] "%{WORD:method} %{URIPATHPARAM:request} HTTP/%{NUMBER:http_version}" %{NUMBER:status} %{NUMBER:bytes}'
          }
        }
      }

      # Parse timestamp (the last format matches nginx HTTPDATE values)
      date {
        match => ["timestamp", "ISO8601", "yyyy-MM-dd HH:mm:ss", "dd/MMM/yyyy:HH:mm:ss Z"]
        target => "@timestamp"
      }

      # Add environment tag
      mutate {
        add_field => { "environment" => "production" }
      }
    }

    output {
      elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => "logs-%{+YYYY.MM.dd}"
      }
    }
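
For readers unfamiliar with grok, the sketch below shows roughly what the nginx pattern extracts, using a plain Python regular expression. It is a simplified stand-in and not how Logstash implements grok: the character classes are looser than IPORHOST, USER and URIPATHPARAM, and the sample line and the names `NGINX_RE`, `parse_nginx` and `SAMPLE` are invented for illustration.

```python
import re
from datetime import datetime
from typing import Optional

# Named groups mirror the field names used in the grok pattern.
NGINX_RE = re.compile(
    r'(?P<client_ip>\S+) - (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<request>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<status>\d+) (?P<bytes>\d+)'
)


def parse_nginx(line: str) -> Optional[dict]:
    """Extract fields from an access-log line, or return None if it does not match."""
    match = NGINX_RE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Equivalent of the date filter: HTTPDATE text becomes an ISO 8601 @timestamp.
    parsed = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    event["@timestamp"] = parsed.isoformat()
    return event


# Made-up sample line in the common nginx access-log layout.
SAMPLE = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612'
```

Calling `parse_nginx(SAMPLE)` yields a dictionary with client_ip, method, request, status and bytes as strings, plus the parsed @timestamp. Lines that do not match return None, which is loosely analogous to the _grokparsefailure tag Logstash adds.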

Advanced Filtering

    filter {
      # Parse application logs
      grok {
        match => {
          "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:service}\] %{GREEDYDATA:log_message}"
        }
      }

      # Extract trace ID from message
      if [log_message] =~ /trace_id=/ {
        grok {
          match => {
            "log_message" => "trace_id=%{UUID:trace_id}"
          }
        }
      }

      # GeoIP lookup
      if [client_ip] {
        geoip {
          source => "client_ip"
          target => "geoip"
        }
      }

      # Drop debug logs in production
      if [level] == "DEBUG" and [environment] == "production" {
        drop {}
      }

      # Enrich with lookup (source/target replace the deprecated
      # field/destination option names)
      translate {
        source => "status"
        target => "status_description"
        dictionary => {
          "200" => "OK"
          "404" => "Not Found"
          "500" => "Internal Server Error"
        }
      }
    }
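
The order of these filters matters: fields must be extracted before later conditions can test them. The sketch below walks one line through the same steps (parse, extract the trace ID, drop DEBUG in production) in plain Python. It is an approximation for illustration only; the regular expressions are simplified stand-ins for the grok patterns, and `APP_RE`, `TRACE_RE` and `process` are hypothetical names.

```python
import re
from typing import Optional

# Simplified stand-in for: TIMESTAMP_ISO8601 LOGLEVEL [service] GREEDYDATA
APP_RE = re.compile(
    r'(?P<timestamp>\S+) (?P<level>[A-Za-z]+) \[(?P<service>[^\]]*)\] (?P<log_message>.*)'
)
# Simplified stand-in for the UUID grok pattern.
TRACE_RE = re.compile(
    r'trace_id=(?P<trace_id>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})'
)


def process(line: str, environment: str = "production") -> Optional[dict]:
    """Return the enriched event, or None when the event would be dropped."""
    event = {"message": line, "environment": environment}
    match = APP_RE.match(line)
    if match is None:
        # Logstash keeps unparsed events and tags them instead of discarding them.
        event["tags"] = ["_grokparsefailure"]
        return event
    event.update(match.groupdict())
    trace = TRACE_RE.search(event["log_message"])
    if trace:
        event["trace_id"] = trace.group("trace_id")
    if event["level"] == "DEBUG" and environment == "production":
        return None
    return event
```

A DEBUG line is only dropped when the environment is production, so the same function keeps it when called with `environment="staging"`.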

Filebeat Configuration

    # filebeat/filebeat.yml
    filebeat.inputs:
      - type: container
        paths:
          - '/var/lib/docker/containers/*/*.log'
        processors:
          - add_docker_metadata:
              host: "unix:///var/run/docker.sock"

      - type: log
        enabled: true
        paths:
          - /var/log/nginx/*.log
        tags: ["nginx"]
        fields:
          type: nginx
        # Put custom fields at the event root so Logstash sees [type]
        # instead of [fields][type]
        fields_under_root: true

    output.logstash:
      hosts: ["logstash:5044"]

    logging.level: info
    logging.to_files: true
    logging.files:
      path: /var/log/filebeat
      name: filebeat
      keepfiles: 7

Elasticsearch Queries

Basic Queries

    // Search all logs
    GET logs-*/_search
    {
      "query": {
        "match_all": {}
      }
    }

    // Search by keyword
    GET logs-*/_search
    {
      "query": {
        "match": {
          "message": "error"
        }
      }
    }

    // Filter by field
    GET logs-*/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "level": "ERROR" } },
            { "range": { "@timestamp": { "gte": "now-1h" } } }
          ],
          "filter": [
            { "term": { "service": "api-gateway" } }
          ]
        }
      }
    }
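
When queries like these are issued from a script rather than the Kibana console, the request body is ordinary JSON sent to the _search endpoint. The sketch below builds a body similar to the filtered query above and posts it with the standard library. It assumes Elasticsearch on localhost:9200 with security disabled, and `build_log_query` and `search` are illustrative names. Unlike the console example, it places every clause in filter context, which is a deliberate variation: exact-match and range clauses do not need relevance scoring.

```python
import json
import urllib.request


def build_log_query(service: str, level: str = "ERROR", since: str = "now-1h") -> dict:
    """Build a bool query whose clauses all run in (non-scoring) filter context."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"level": level}},
                    {"term": {"service": service}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        }
    }


def search(body: dict, index: str = "logs-*", base_url: str = "http://localhost:9200") -> dict:
    """POST the query body to /<index>/_search and return the decoded response."""
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

A call such as `search(build_log_query("api-gateway"))` would return the same kind of response the console shows, with matching documents under hits.hits.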

Aggregations

    // Count by log level
    GET logs-*/_search
    {
      "size": 0,
      "aggs": {
        "log_levels": {
          "terms": { "field": "level" }
        }
      }
    }

    // Error rate over time
    GET logs-*/_search
    {
      "size": 0,
      "aggs": {
        "errors_over_time": {
          "date_histogram": {
            "field": "@timestamp",
            "fixed_interval": "5m"
          },
          "aggs": {
            "error_count": {
              "filter": { "term": { "level": "ERROR" } }
            }
          }
        }
      }
    }
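
The second aggregation returns counts, not a rate: each 5-minute bucket has a total doc_count and a nested error_count.doc_count. A rough sketch of turning that response into an error rate per bucket follows. The function name `error_rates` is hypothetical, and the code assumes the response shape Elasticsearch returns for a date_histogram with a filter sub-aggregation.

```python
def error_rates(response: dict) -> list:
    """Return (bucket start, errors / total) pairs from the aggregation response."""
    rates = []
    for bucket in response["aggregations"]["errors_over_time"]["buckets"]:
        total = bucket["doc_count"]
        errors = bucket["error_count"]["doc_count"]
        # Empty buckets would divide by zero, so report them as a rate of 0.0.
        rate = errors / total if total else 0.0
        rates.append((bucket["key_as_string"], rate))
    return rates
```

The resulting list can be plotted directly or compared against a threshold to decide whether a bucket counts as unhealthy.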

Kibana Setup

Index Patterns

Go to Stack Management → Data Views (named Index Patterns before Kibana 8.0)
Create a data view with the index pattern: logs-*
Set the time field: @timestamp

Saved Searches

Create saved searches for common queries:

level:ERROR - All errors
service:api-gateway AND level:ERROR - API gateway errors
response_time:>1000 - Slow requests

Visualizations

Common visualization types:

Line Chart: Error rate over time
Pie Chart: Distribution by log level
Data Table: Top error messages
Metric: Total error count

Dashboard Example

Create a dashboard with:

Total log count (Metric)
Error rate trend (Line chart)
Logs by service (Pie chart)
Recent errors (Data table)
Log stream (Discover panel)

Alerting

Watcher (X-Pack)

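Watcher requires a paid or trial license; it is not included in the Basic license.

    PUT _watcher/watch/error_alert
    {
      "trigger": {
        "schedule": { "interval": "5m" }
      },
      "input": {
        "search": {
          "request": {
            "indices": ["logs-*"],
            "body": {
              "query": {
                "bool": {
                  "must": [
                    { "match": { "level": "ERROR" } },
                    { "range": { "@timestamp": { "gte": "now-5m" } } }
                  ]
                }
              }
            }
          }
        }
      },
      "condition": {
        "compare": { "ctx.payload.hits.total": { "gt": 100 } }
      },
      "actions": {
        "notify_slack": {
          "webhook": {
            "scheme": "https",
            "host": "hooks.slack.com",
            "port": 443,
            "method": "post",
            "path": "/services/xxx",
            "body": "{\"text\": \"High error rate detected: {{ctx.payload.hits.total}} errors in last 5 minutes\"}"
          }
        }
      }
    }

The inner quotes in the webhook body are escaped so that the watch is valid JSON. The condition reads ctx.payload.hits.total because the Watcher search input exposes the hit count as a plain number rather than an object with a value field.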
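
Where Watcher is not available, the same check can be approximated from a scheduled script. The sketch below shows only the decision and the message formatting, assuming a regular _search response in which the hit count sits at hits.total.value. The names `ERROR_THRESHOLD`, `should_alert` and `slack_body` are illustrative, and actually sending the message to a webhook is left out.

```python
import json

ERROR_THRESHOLD = 100  # same threshold as the watch condition


def should_alert(search_response: dict, threshold: int = ERROR_THRESHOLD) -> bool:
    """Mirror the watch condition: alert when the hit count exceeds the threshold."""
    return search_response["hits"]["total"]["value"] > threshold


def slack_body(search_response: dict) -> str:
    """Build the JSON payload; json.dumps takes care of quoting and escaping."""
    total = search_response["hits"]["total"]["value"]
    return json.dumps(
        {"text": f"High error rate detected: {total} errors in last 5 minutes"}
    )
```

Building the payload with `json.dumps` avoids the manual quote escaping that an inline JSON string needs.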

Common Issues

Issue: High Disk Usage

Problem: Elasticsearch is consuming too much disk space.
Solution: Implement ILM policies and reduce the retention period.
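
A back-of-the-envelope estimate shows how strongly retention drives disk usage. The sketch below is a deliberately rough model: it multiplies daily volume by retention and by the number of copies, and ignores compression, mapping overhead and free-space headroom. The function name and the figures in the example are hypothetical.

```python
def estimated_storage_gb(daily_ingest_gb: int, retention_days: int, replicas: int = 1) -> int:
    """Rough index footprint: every day of data is stored once per copy."""
    copies = 1 + replicas  # one primary plus its replicas
    return daily_ingest_gb * retention_days * copies
```

With a hypothetical 10 GB of logs per day and one replica, 90 days of retention comes to about 1800 GB, while 30 days comes to about 600 GB.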

Issue: Slow Searches

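Problem: Queries are taking too long.
Solution: Narrow the time range and index pattern, put exact-match and range clauses in filter context so they can be cached, and keep shard sizes reasonable (add shards only when individual shards have grown too large).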

Issue: Log Parsing Failures

Problem: Logs are not parsed correctly.
Solution: Test grok patterns against sample lines, look for events tagged _grokparsefailure, and check for log format changes.

Issue: Memory Pressure

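Problem: Elasticsearch fails with out-of-memory (OOM) errors.
Solution: Increase the heap size (no more than 50% of RAM, and below the roughly 32 GB compressed-pointer threshold) and limit fielddata usage.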
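
The heap guidance can be written as a small calculation. This is a sketch under stated assumptions: 31 GB is used as a conservative ceiling below the compressed-pointer threshold, whole gigabytes are assumed, and `recommended_heap_gb` and `es_java_opts` are illustrative names, not Elasticsearch settings.

```python
COMPRESSED_OOPS_CAP_GB = 31  # assumed safe ceiling below the ~32 GB threshold


def recommended_heap_gb(ram_gb: int) -> int:
    """Half of RAM, capped so the JVM can keep using compressed object pointers."""
    if ram_gb < 2:
        raise ValueError("this sketch assumes at least 2 GB of RAM")
    return min(ram_gb // 2, COMPRESSED_OOPS_CAP_GB)


def es_java_opts(ram_gb: int) -> str:
    """Format the value for ES_JAVA_OPTS with equal minimum and maximum heap."""
    heap = recommended_heap_gb(ram_gb)
    return f"-Xms{heap}g -Xmx{heap}g"
```

On a host with 8 GB of RAM this gives a 4 GB heap. The remaining memory is left to the operating system's filesystem cache, which Elasticsearch relies on for search performance.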

Best Practices

Implement index lifecycle management
Use index templates for consistent mappings
Parse logs at ingestion time
Limit stored fields to reduce storage
Use data streams for time-series data
Monitor cluster health
Implement proper security (X-Pack)
Perform regular index maintenance

Related Skills

loki-logging - Alternative logging stack
prometheus-grafana - Metrics monitoring
audit-logging - Compliance logging
