
Windowing in Stream Processing: Tumbling, Sliding, and Session Windows

Master the art of real-time analytics. Learn the technical differences between Tumbling, Sliding, and Session windows in Kafka Streams and Flink.

Sachin Sarawgi | April 20, 2026 | 2 min read

Windowing in Stream Processing: Timing is Everything

In stream processing (Kafka Streams, Flink, Spark Streaming), you rarely want to aggregate data from the beginning of time. Instead, you want to perform calculations over specific time intervals. This is achieved through Windowing.

1. Tumbling Windows (Fixed-Size, No Overlap)

A Tumbling Window is a fixed-size, non-overlapping, and contiguous time interval.

  • How it works: With a 1-minute tumbling window, events timestamped from 12:00:00 up to (but not including) 12:01:00 land in one bucket, and an event at 12:01:00 starts a completely new bucket. Every event belongs to exactly one window.
  • Best for: Reporting hourly sales, counting daily active users, or any discrete periodic reporting.
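As a rough illustration of the bucketing rule above, here is a minimal Python sketch that assigns event timestamps (epoch milliseconds) to tumbling windows. It is not the Kafka Streams or Flink API; the function names `tumbling_window` and `count_per_window` are hypothetical, and windows are assumed to be aligned to the epoch and half-open, `[start, end)`.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tumbling_window(ts_ms: int, size_ms: int) -> Tuple[int, int]:
    """Return the [start, end) tumbling window that contains ts_ms.

    Assumption: windows are aligned to the epoch (start is a multiple of size_ms).
    """
    start = ts_ms - (ts_ms % size_ms)
    return (start, start + size_ms)


def count_per_window(timestamps_ms: Iterable[int], size_ms: int) -> Dict[Tuple[int, int], int]:
    """Count events per tumbling window; each event lands in exactly one window."""
    return dict(Counter(tumbling_window(ts, size_ms) for ts in timestamps_ms))


if __name__ == "__main__":
    one_minute = 60_000
    # Two events in the first minute, one exactly on the boundary of the second.
    print(count_per_window([0, 59_999, 60_000], one_minute))
```

Because the window is half-open, an event that falls exactly on a boundary belongs to the newer window, never to both.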

2. Sliding Windows (Overlapping)

A Sliding Window has a fixed size but "slides" forward by a smaller increment (the "slide").

  • How it works: With a 1-hour window and a 5-minute slide, you get a result every 5 minutes covering the last 60 minutes of data. Because the windows overlap, each event is counted in 12 of them (60 / 5).
  • Best for: Calculating a moving average of stock prices, or detecting a spike in error rates over the last 15 minutes, re-evaluated every minute.
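To make the overlap concrete, the sketch below lists every sliding window that a single event falls into. It is plain Python rather than any framework's API; `sliding_windows` is a hypothetical helper, and it assumes window starts are aligned to multiples of the slide and that windows are half-open, `[start, end)`.

```python
from typing import List, Tuple


def sliding_windows(ts_ms: int, size_ms: int, slide_ms: int) -> List[Tuple[int, int]]:
    """Return all [start, end) windows of length size_ms that contain ts_ms.

    Assumption: window starts are multiples of slide_ms, and slide_ms <= size_ms.
    """
    if slide_ms <= 0 or slide_ms > size_ms:
        raise ValueError("slide_ms must be in the range (0, size_ms]")
    # The most recent window start at or before the event.
    start = ts_ms - (ts_ms % slide_ms)
    windows = []
    # Walk backwards while the window still covers the event.
    while start > ts_ms - size_ms:
        windows.append((start, start + size_ms))
        start -= slide_ms
    windows.reverse()  # oldest window first
    return windows


if __name__ == "__main__":
    minute = 60_000
    hour = 60 * minute
    event = 10 * hour + 7 * minute  # 10:07 in epoch-relative time
    for start, end in sliding_windows(event, hour, 5 * minute):
        print(start // minute, end // minute)
```

When the slide equals the window size, the loop yields exactly one window per event, which is the tumbling case.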

3. Session Windows (Activity-Based)

Unlike fixed windows, Session Windows are defined by periods of activity followed by periods of inactivity (the "gap").

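  • How it works: A session window starts when an event arrives for a given key (for example, a user ID). It stays open as long as new events for that key arrive within the "session gap" (e.g., 30 minutes). If no events arrive for 30 minutes, the window closes. Session lengths therefore vary from one session to the next.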
  • Best for: Website session analysis (tracking what a user does in one visit) or grouping together related sensor readings.
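The gap rule can be sketched in a few lines of Python. This is a simplified batch version for one key's events, not a streaming implementation; `sessionize` is a hypothetical name, and treating a pause of exactly the gap length as still belonging to the same session is an assumption (check how your framework handles that boundary).

```python
from typing import Iterable, List


def sessionize(timestamps_ms: Iterable[int], gap_ms: int) -> List[List[int]]:
    """Group one key's event timestamps into sessions.

    A new session starts whenever the pause since the previous event
    exceeds gap_ms. Assumption: a pause of exactly gap_ms stays in-session.
    """
    sessions: List[List[int]] = []
    for ts in sorted(timestamps_ms):
        if sessions and ts - sessions[-1][-1] <= gap_ms:
            sessions[-1].append(ts)  # still within the gap: extend the session
        else:
            sessions.append([ts])  # gap exceeded (or first event): new session
    return sessions


if __name__ == "__main__":
    minute = 60_000
    clicks = [0, 10 * minute, 35 * minute, 70 * minute, 75 * minute]
    for session in sessionize(clicks, 30 * minute):
        print([ts // minute for ts in session])
```

In a real stream processor the same idea is applied incrementally, and two open sessions are merged when a late event bridges the gap between them.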

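4. Hopping Windows (Kafka Streams' Name for Sliding)

Kafka Streams uses the name Hopping Windows for the fixed-size, overlapping windows described above. The "hop" (the advance interval) plays the role of the slide, and it must be no larger than the window size. When the hop equals the window size, the result is simply a tumbling window. Note that Kafka Streams also has a separate window type that it calls Sliding Windows, which is defined by the maximum time difference between records rather than by a fixed advance interval, so the two names are not interchangeable there.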

5. Handling Late Data: Watermarks

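In a distributed system, data can arrive out of order. A watermark is the stream processor's estimate of how far event time has progressed: a watermark of T asserts that no more events with timestamps earlier than T are expected. Once the watermark passes the end of a window (plus any configured allowed lateness), the window is "closed" and the final result is emitted; events for that window that arrive afterwards are considered late. Flink exposes watermarks directly, while Kafka Streams achieves a similar effect with a per-window grace period measured against stream time.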
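One common way to generate watermarks is to assume a bounded amount of out-of-orderness. The Python sketch below is a toy model of that idea, not Flink's implementation: the watermark is taken to be the largest timestamp seen so far minus a fixed delay, events are counted into tumbling windows, and an event is dropped as late if its window has already been passed by the watermark. The names `count_with_watermark` and `max_delay_ms` are hypothetical, and allowed lateness is assumed to be zero.

```python
from typing import Dict, Iterable, List, Tuple


def count_with_watermark(
    arrivals_ms: Iterable[int], size_ms: int, max_delay_ms: int
) -> Tuple[Dict[int, int], List[int]]:
    """Count events per tumbling window, dropping events that arrive too late.

    arrivals_ms is in arrival order (not timestamp order).
    Assumption: watermark = max timestamp seen so far - max_delay_ms.
    Returns (counts keyed by window start, list of dropped late timestamps).
    """
    counts: Dict[int, int] = {}
    late: List[int] = []
    watermark = float("-inf")
    for ts in arrivals_ms:
        start = ts - (ts % size_ms)
        end = start + size_ms
        if end <= watermark:
            late.append(ts)  # the window was already closed by the watermark
        else:
            counts[start] = counts.get(start, 0) + 1
        # Advance the watermark; it never moves backwards.
        watermark = max(watermark, ts - max_delay_ms)
    return counts, late


if __name__ == "__main__":
    arrivals = [10_000, 50_000, 62_000, 58_000, 70_000, 59_000]
    print(count_with_watermark(arrivals, size_ms=60_000, max_delay_ms=5_000))
```

In this run, the event at 58,000 arrives out of order but is still accepted because the watermark (57,000) has not yet passed the window end at 60,000. The event at 59,000 arrives after the watermark has moved to 65,000, so it is dropped. A larger delay tolerates more disorder at the cost of emitting results later.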

Summary

Choosing the right window type is critical for the accuracy of your real-time analytics. Use Tumbling for discrete periods, Sliding for continuous monitoring, and Session for user-centric activity tracking.

