
Stream Processing

Patterns and technologies for real-time data processing, event streaming, and stream analytics.

When to Use This Skill

  • Designing real-time data pipelines

  • Choosing stream processing frameworks

  • Implementing event-driven architectures

  • Building real-time analytics

  • Understanding streaming vs batch trade-offs

Batch vs Streaming

Comparison

Aspect Batch Streaming

Latency Minutes to hours Milliseconds to seconds

Data Bounded (finite) Unbounded (infinite)

Processing Process all at once Process as it arrives

State Recompute each run Maintain continuously

Complexity Lower Higher

Cost Often lower Often higher

When to Use Streaming

Use streaming when:

  • Real-time responses required (<1 minute)
  • Events need immediate action (fraud, alerts)
  • Data arrives continuously
  • Users expect live updates
  • Time-sensitive business decisions

Use batch when:

  • Daily/hourly reports sufficient
  • Complex transformations needed
  • Cost optimization priority
  • Historical analysis
  • One-time processing

Stream Processing Concepts

Event Time vs Processing Time

Event Time: when the event actually occurred.
Processing Time: when the event is processed by the system.

Example:

```
┌──────────────────────────────────────────┐
│ Event: Purchase at 10:00:00 (event time) │
│ Network delay: 5 seconds                 │
│ Processing: 10:00:05 (processing time)   │
└──────────────────────────────────────────┘
```

Why it matters:

  • Late events need handling
  • Ordering not guaranteed
  • Watermarks track progress

Watermarks

Watermark = "All events before this time have arrived"

Event stream: ──[10:01]──[10:02]──[10:00]──[10:03]──[Watermark: 10:00]──

Allows system to:

  • Know when window is complete
  • Handle late events
  • Balance latency vs completeness

Windows

Tumbling Window (fixed, non-overlapping):

```
|─────|─────|─────|
0     5     10    15  (seconds)
```

Sliding Window (fixed, overlapping):

```
|─────|
  |─────|
    |─────|
Size: 5s, Slide: 2s
```

Session Window (activity-based):

```
|──────|   |───────────|   |───|
```

User activity with gaps defines the windows.

Count Window: process every N events.
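
The window shapes above can be sketched as assignment functions: given an event timestamp, which window(s) does the event belong to? A minimal Python sketch, assuming integer epoch-second timestamps and half-open [start, end) windows (function names are illustrative, not any framework's API):

```python
def tumbling_window(ts, size):
    """Return the single [start, end) tumbling window containing ts."""
    start = (ts // size) * size
    return (start, start + size)

def sliding_windows(ts, size, slide):
    """Return every [start, end) sliding window containing ts.

    Windows are aligned to multiples of `slide`, as in the diagram above.
    """
    # The earliest aligned window that still contains ts starts at the
    # first multiple of `slide` strictly greater than ts - size.
    start = ((ts - size) // slide + 1) * slide
    windows = []
    while start <= ts:
        windows.append((start, start + size))
        start += slide
    return windows

print(tumbling_window(7, 5))       # (5, 10)
print(sliding_windows(7, 5, 2))    # [(4, 9), (6, 11)]
```

With size 5 and slide 2, an event at t=7 falls into two overlapping windows, which is exactly why sliding windows emit each event more than once.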

State Management

Stateful operations require maintained state:

  • Aggregations (sum, count, avg)
  • Joins between streams
  • Pattern detection
  • Deduplication

State backends:

  • In-memory (fast, limited)
  • RocksDB (larger, persistent)
  • External (Redis, database)

Stream Processing Frameworks

Apache Kafka Streams

Characteristics:

  • Library (not a cluster)
  • Exactly-once semantics
  • Kafka-native
  • Java/Scala

Best for:

  • Kafka-centric architectures
  • Simpler transformations
  • Microservices

Example topology: source → filter → map → aggregate → sink
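
Kafka Streams itself is a Java/Scala library, but the shape of that topology can be illustrated in plain Python (the event fields and the 1000 threshold are made-up examples, not part of any real API):

```python
def run_topology(source_events):
    """source -> filter -> map -> aggregate -> sink, as a plain loop."""
    totals = {}                           # aggregate state, keyed by user
    for event in source_events:           # source
        if event["amount"] <= 1000:       # filter: keep large amounts only
            continue
        key = event["user"]               # map: extract the aggregation key
        totals[key] = totals.get(key, 0) + event["amount"]  # aggregate
    return totals                         # sink: emit the final state

events = [
    {"user": "a", "amount": 1500},
    {"user": "a", "amount": 500},     # dropped by the filter
    {"user": "b", "amount": 2000},
]
print(run_topology(events))  # {'a': 1500, 'b': 2000}
```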

Apache Flink

Characteristics:

  • Distributed cluster
  • True streaming (not micro-batch)
  • Advanced state management
  • SQL support

Best for:

  • Complex event processing
  • Large-scale streaming
  • Low-latency requirements

Example:

```java
DataStream<Event> events = env.addSource(kafkaSource);
events
    .keyBy(e -> e.getUserId())
    .window(TumblingEventTimeWindows.of(Time.minutes(5)))
    .aggregate(new CountAggregator())
    .addSink(sink);
```

Apache Spark Streaming

Characteristics:

  • Micro-batch processing
  • Unified batch + streaming API
  • Wide ecosystem
  • Python, Scala, Java, R

Best for:

  • Teams with Spark experience
  • Batch + streaming unified
  • Machine learning integration

Latency: Seconds (micro-batch)

Kafka Streams vs Flink vs Spark

| Factor | Kafka Streams | Flink | Spark Streaming |
|---|---|---|---|
| Deployment | Library | Cluster | Cluster |
| Latency | Low | Lowest | Medium |
| State | Good | Excellent | Good |
| Exactly-once | Yes | Yes | Yes |
| Complexity | Low | High | Medium |
| Scaling | With Kafka | Independent | Independent |
| SQL | Limited | Yes | Yes |
| ML integration | Limited | Limited | Excellent |

Stream Processing Patterns

Filtering

Input: all events
Output: events matching the criteria

Example: Only process events where amount > 1000

Mapping/Transformation

Input: event type A
Output: event type B

Example: Enrich order events with customer data

Aggregation

Input: multiple events
Output: a single aggregated result

Examples:

  • Count events per window
  • Sum amounts per user
  • Average latency per endpoint
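
An aggregation such as "count events per user per window" can be sketched by combining key extraction with tumbling-window assignment (the (user, timestamp) event shape is an illustrative assumption):

```python
from collections import defaultdict

def count_per_window(events, window_size):
    """Count events per (user, tumbling-window-start) pair.

    Events are (user, timestamp) tuples; window_size in seconds.
    """
    counts = defaultdict(int)
    for user, ts in events:
        window_start = (ts // window_size) * window_size
        counts[(user, window_start)] += 1
    return dict(counts)

events = [("a", 1), ("a", 3), ("a", 7), ("b", 2)]
print(count_per_window(events, 5))
# {('a', 0): 2, ('a', 5): 1, ('b', 0): 1}
```

A real streaming engine does the same thing incrementally, emitting each window's count when the watermark passes the window end instead of after reading all input.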

Join Patterns

Stream-Stream Join:

```
┌─────────────┐
│   Orders    │ ──►  ┌──────────────┐
└─────────────┘      │     Join     │
┌─────────────┐      │ (by order_id │
│  Shipments  │ ──►  │  in window)  │
└─────────────┘      └──────────────┘
```

Stream-Table Join (Enrichment):

```
┌─────────────┐
│   Events    │ ──►  ┌──────────────┐
└─────────────┘      │     Join     │
┌─────────────┐      │  (lookup by  │
│  Customer   │      │   customer)  │
│   Table     │ ──►  └──────────────┘
└─────────────┘
```

Deduplication

Problem: Duplicate events from at-least-once delivery

Solution:

  1. Track seen IDs in state (with TTL)
  2. If seen, drop
  3. If new, process and store ID

State: {event_id: timestamp}
TTL: based on the expected duplicate window
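
A minimal sketch of that dedup state, using an in-memory dict with TTL-based eviction (a production job would keep this in a state backend such as RocksDB or Redis):

```python
import time

class Deduplicator:
    """Drop events whose ID was already seen within the TTL window."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.seen = {}                    # event_id -> first-seen timestamp

    def is_duplicate(self, event_id, now=None):
        now = time.time() if now is None else now
        # Evict expired IDs so the state stays bounded.
        self.seen = {eid: ts for eid, ts in self.seen.items()
                     if now - ts < self.ttl}
        if event_id in self.seen:
            return True                   # seen before: drop
        self.seen[event_id] = now         # new: record ID and process
        return False

dedup = Deduplicator(ttl_seconds=60)
assert dedup.is_duplicate("e1", now=0) is False
assert dedup.is_duplicate("e1", now=10) is True    # within TTL: duplicate
assert dedup.is_duplicate("e1", now=100) is False  # TTL expired: treated as new
```

The TTL is the trade-off knob: too short and duplicates arriving late slip through; too long and the state grows without bound.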

Event Delivery Guarantees

At-Most-Once

May lose events, never duplicates.

Commit → Process → (if failure after commit, event lost)

Use when: Loss acceptable, simplicity preferred

At-Least-Once

Never loses events, may produce duplicates.

Process → Commit → (if failure before commit, event is reprocessed)

Use when: No loss acceptable, handle duplicates downstream

Exactly-Once

Never loses, never duplicates.

Requires:

  • Idempotent operations, OR
  • Transactional processing

How it works:

  1. Read from source transactionally
  2. Process and update state
  3. Write output and commit together

Flink: checkpointing + two-phase commit
Kafka Streams: transactional producer + EOS (exactly-once semantics)
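
One way to get exactly-once *effects* without transactions is an idempotent sink, sketched here: writes are keyed upserts, so a redelivered event overwrites the same row instead of duplicating it (the dict stands in for a real keyed store):

```python
def write_idempotent(table, event):
    """Upsert keyed on the event ID: replaying an event is a no-op
    in effect, because it rewrites the same row."""
    table[event["id"]] = event["value"]

table = {}
for event in [{"id": "e1", "value": 10},
              {"id": "e1", "value": 10},   # redelivered after a crash
              {"id": "e2", "value": 20}]:
    write_idempotent(table, event)
print(table)  # {'e1': 10, 'e2': 20}
```

This turns at-least-once delivery into exactly-once results, which is why "keep processing idempotent" appears in the best practices below.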

Late Event Handling

Strategies

  1. Drop late events: simple, but may lose data

  2. Allow late events (allowed lateness): process if within the lateness threshold

  3. Side-output late events: the main stream processes on-time events; a side stream handles late events separately

  4. Reprocess historically: a batch job corrects the impact of late data
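
Strategies 2 and 3 can be sketched as a routing function that compares each event's timestamp against the current watermark (the timestamps and allowed-lateness value are illustrative):

```python
def route_events(events, watermark, allowed_lateness):
    """Split events into on-time, late-but-allowed, and side-output.

    Each event is an (id, event_time) tuple; times in seconds.
    """
    on_time, allowed_late, side_output = [], [], []
    for event_id, ts in events:
        if ts >= watermark:
            on_time.append(event_id)
        elif ts >= watermark - allowed_lateness:
            allowed_late.append(event_id)     # strategy 2: within threshold
        else:
            side_output.append(event_id)      # strategy 3: handled separately
    return on_time, allowed_late, side_output

events = [("e1", 105), ("e2", 98), ("e3", 80)]
print(route_events(events, watermark=100, allowed_lateness=5))
# (['e1'], ['e2'], ['e3'])
```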

Watermark Strategies

Bounded Out-of-Orderness: watermark = max_event_time - max_lateness

Example:

```
max_event_time = 10:00:00
max_lateness   = 5 seconds
watermark      = 09:59:55
```

Events before 09:59:55 considered complete
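
That strategy can be sketched as a small watermark generator: track the maximum event time seen and subtract the lateness bound. Timestamps below are seconds since midnight, so 36000 is 10:00:00 (the class name is illustrative, not Flink's API):

```python
class BoundedOutOfOrdernessWatermark:
    """watermark = max_event_time - max_lateness; never moves backwards,
    because max_event_time only increases."""

    def __init__(self, max_lateness):
        self.max_lateness = max_lateness
        self.max_event_time = float("-inf")

    def on_event(self, event_time):
        self.max_event_time = max(self.max_event_time, event_time)
        return self.max_event_time - self.max_lateness

wm = BoundedOutOfOrdernessWatermark(max_lateness=5)
assert wm.on_event(36000) == 35995   # 10:00:00 -> watermark 09:59:55
assert wm.on_event(35990) == 35995   # out-of-order event: watermark holds
```

Note the trade-off from the Watermarks section: a larger `max_lateness` admits more late events but delays window results by the same amount.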

Scalability Patterns

Partitioning

Partition by key for parallel processing:

```
┌─────────────────────────────────────────────────────┐
│ Kafka Topic (3 partitions)                          │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐     │
│ │ Partition 0 │ │ Partition 1 │ │ Partition 2 │     │
│ │ user_a, b   │ │ user_c, d   │ │ user_e, f   │     │
│ └─────────────┘ └─────────────┘ └─────────────┘     │
└─────────────────────────────────────────────────────┘
        │               │               │
        ▼               ▼               ▼
   ┌─────────┐     ┌─────────┐     ┌─────────┐
   │Worker 0 │     │Worker 1 │     │Worker 2 │
   └─────────┘     └─────────┘     └─────────┘
```
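
The key-to-partition assignment is typically a stable hash modulo the partition count. A sketch (real clients use their own hash function, e.g. Kafka's murmur2, so actual placements will differ):

```python
import hashlib

def partition_for(key, num_partitions):
    """Stable hash partitioning: the same key always lands on the same
    partition, so all of that key's state lives on one worker."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

for user in ["user_a", "user_b", "user_c"]:
    print(user, "->", "partition", partition_for(user, 3))
```

Stability is the important property: if events for `user_a` could land on different partitions, a per-user aggregation would be split across workers and produce wrong counts.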

Backpressure

When downstream can't keep up:

  1. Buffer (risk: OOM)
  2. Drop (risk: data loss)
  3. Backpressure (slow down source)

Flink: backpressure propagates automatically
Kafka: consumer lag indicates backpressure
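
Option 3 can be sketched with a bounded buffer: when the consumer falls behind, the full queue blocks the producer's `put()`, which is backpressure in miniature:

```python
import queue
import threading

# Bounded buffer between producer and consumer. A full queue blocks
# the producer, slowing the source to the consumer's pace.
buf = queue.Queue(maxsize=10)
consumed = []

def producer():
    for i in range(100):
        buf.put(i)        # blocks while the buffer is full: backpressure
    buf.put(None)         # sentinel marking end of stream

def consumer():
    while True:
        item = buf.get()
        if item is None:
            break
        consumed.append(item)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(len(consumed))  # 100
```

Contrast with the other two options: an unbounded buffer risks OOM, and dropping on a full buffer risks data loss; blocking trades throughput for safety.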

Monitoring Streaming Applications

Key Metrics

Throughput:

  • Events per second
  • Bytes per second

Latency:

  • Processing latency
  • End-to-end latency

Health:

  • Consumer lag
  • Checkpoint duration
  • Backpressure rate
  • Error rate

Consumer Lag

Lag = Latest offset - Consumer offset

High lag indicates:

  • Processing too slow
  • Need more parallelism
  • Downstream bottleneck

Monitor lag continuously and set alerting thresholds.
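
The lag formula above, applied per partition, as a sketch:

```python
def consumer_lag(latest_offsets, committed_offsets):
    """Per-partition lag = latest offset - committed consumer offset.

    A partition with no committed offset counts its full length as lag.
    """
    return {p: latest_offsets[p] - committed_offsets.get(p, 0)
            for p in latest_offsets}

latest = {0: 1000, 1: 800}
committed = {0: 950, 1: 780}
lag = consumer_lag(latest, committed)
print(lag)                        # {0: 50, 1: 20}
assert max(lag.values()) < 100    # example alert threshold
```

In practice these offsets come from the broker (e.g. via Kafka's consumer-group tooling); the alert threshold is an illustrative number you would tune per workload.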

Best Practices

  1. Design for exactly-once when needed
  2. Handle late events explicitly
  3. Use event time, not processing time
  4. Monitor consumer lag closely
  5. Plan for state recovery
  6. Test with realistic data volumes
  7. Implement backpressure handling
  8. Keep processing idempotent when possible

Related Skills

  • message-queues: messaging patterns

  • data-architecture: data platform design

  • etl-elt-patterns: data pipeline patterns
