5/8
Stream Processing & Real-Time Data Β· Page 1 of 1

Processing Streams with Windows

Stream Processing

The Stream Problem

Data arrives continuously:
sensor β†’ stream β†’ processing β†’ output

Can't store all data (infinite)
Need constant memory

Sliding Window for Streams

Process data in windows as it arrives:

Incoming: [1, 2, 3, 4, 5, 6, 7, 8, ...]

Window 1: [1, 2, 3] β†’ Process β†’ Output avg
Window 2: [2, 3, 4] β†’ Process β†’ Output avg
Window 3: [3, 4, 5] β†’ Process β†’ Output avg
...

Only keep window in memory!

Use Cases

1. Moving average (stock prices)
2. Traffic analysis (packets/sec)
3. Anomaly detection
4. Real-time metrics
5. Time-series analysis

Implementation Pattern

from collections import deque

class StreamProcessor:
    def __init__(self, window_size):
        self.window = deque(maxlen=window_size)
    
    def process_item(self, item):
        self.window.append(item)  # Auto-removes oldest if full
        
        if len(self.window) == window_size:
            return self.compute_metric()
    
    def compute_metric(self):
        return sum(self.window) / len(self.window)

Time-Based Windows

Event-time: When did event occur?
Processing-time: When received?

Example:
Event at 1:00 but received at 1:05
Include in 1:00 window or 1:05?

Complex Window Operations

Aggregation: sum, avg, min, max
Stateful: Count distinct, percentiles
Multiple windows: Overlapping windows
Triggers: Emit on size, time, or condition
main.py
Loading...
OUTPUT
β–ΆClick "Run Code" to execute…