System Design: Designing an Ad Click Aggregator

How does Google or Facebook aggregate billions of ad clicks for billing? A technical deep dive into Write-Heavy Scaling, Exactly-Once Processing, and Real-time Aggregation.

Sachin Sarawgi • April 20, 2026 • 3 min read

#system-design #ad-aggregator #analytics #write-heavy #kafka #flink #scalability

Ad click aggregation is a massive-scale data problem. Users generate billions of ad clicks across the web, and those clicks must be deduplicated, aggregated, and stored for both real-time analytics (advertiser dashboards) and accurate billing.

1. Core Requirements

High Throughput: Handle billions of clicks per day (tens of thousands per second on average, with higher peaks).
Accuracy: Billing requires that each click is counted exactly once. We cannot charge an advertiser twice for the same click, so duplicates must be removed (deduplication).
Latency: Real-time dashboards should update within seconds.
Resilience: No click should ever be lost, even if a data center goes down.
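
To put the throughput requirement in numbers, here is a rough back-of-the-envelope calculation. The daily volumes and the 5x peak factor are illustrative assumptions, not measurements from a real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_clicks_per_second(clicks_per_day: int) -> float:
    """Average rate if clicks were spread evenly across the day."""
    return clicks_per_day / SECONDS_PER_DAY


def peak_clicks_per_second(clicks_per_day: int, peak_factor: float = 5.0) -> float:
    """Rough peak estimate; peak_factor is an assumed multiplier."""
    return average_clicks_per_second(clicks_per_day) * peak_factor


if __name__ == "__main__":
    for daily in (1_000_000_000, 5_000_000_000):
        avg = round(average_clicks_per_second(daily))
        peak = round(peak_clicks_per_second(daily))
        print(f"{daily:,} clicks/day -> avg {avg:,}/s, assumed peak {peak:,}/s")
```

One billion clicks per day works out to roughly 11,600 per second on average, so capacity planning should be driven by the peak rate rather than the daily average.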

2. The Data Path

Click Event: A user clicks an ad. The browser sends a request to our Click Tracking Server.
Raw Log Ingestion: The Tracking Server immediately pushes the raw click event into Apache Kafka.
- Why Kafka? It acts as a durable, replayable log that absorbs traffic spikes and decouples ingestion from processing.
Aggregation Engine: Apache Flink or Spark Streaming consumes from Kafka.
- Deduplication: Uses a Redis cache or Flink keyed state to filter out duplicate clicks (keyed on click_id and user_id).
- Windowing: Clicks are aggregated in 1-minute tumbling windows.
Storage:
- Real-time: Aggregated counts are stored in Cassandra or Redis for the advertiser dashboard.
- Historical: Raw clicks are stored in Amazon S3 (Parquet) for long-term auditing and fraud detection.
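
The deduplication and windowing steps above can be sketched in plain Python. This is a minimal in-memory illustration, not Flink code: the event field names (click_id, user_id, ad_id, ts) and the helper names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

WINDOW_SECONDS = 60  # 1-minute tumbling windows


def window_start(event_time: int) -> int:
    """Align an epoch-second timestamp to the start of its tumbling window."""
    return event_time - (event_time % WINDOW_SECONDS)


def aggregate_clicks(events: Iterable[dict]) -> Dict[Tuple[str, int], int]:
    """Count unique clicks per (ad_id, window_start).

    Each event is assumed to be a dict with click_id, user_id, ad_id and ts.
    """
    seen: Set[Tuple[str, str]] = set()  # dedup keys already counted
    counts: Dict[Tuple[str, int], int] = defaultdict(int)
    for event in events:
        dedup_key = (event["click_id"], event["user_id"])
        if dedup_key in seen:
            continue  # duplicate delivery of the same click: skip it
        seen.add(dedup_key)
        counts[(event["ad_id"], window_start(event["ts"]))] += 1
    return dict(counts)
```

In a real job the seen set cannot grow forever. A Redis key with an expiry or Flink state with a time-to-live would typically bound it, at the cost of missing duplicates that arrive after the retention period.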

3. Dealing with "Exactly-Once" Semantics

Billing systems cannot tolerate duplicates.

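Kafka Idempotency: Producers are configured with enable.idempotence=true, which stops producer retries from writing the same message to a partition twice. It does not remove duplicate clicks that arrive as separate requests, which is why the deduplication step is still needed.
Checkpointing: Flink uses distributed snapshots (checkpoints) that capture operator state together with the Kafka offsets. If a worker fails, the job restores the last checkpoint and replays the log from those offsets, so each event affects the aggregated state exactly once even though some events are read again. For end-to-end exactly-once results, the sink must also be transactional or idempotent.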
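
The checkpoint-and-replay idea can be illustrated with a toy operator. This is a simplified single-process sketch, not Flink's actual mechanism: the class name, the list standing in for a Kafka partition, and the in-memory "durable" snapshot are all assumptions for illustration.

```python
import copy
from typing import Dict, List, Tuple


class CheckpointedCounter:
    """Toy operator that counts clicks per ad and snapshots offset and state together."""

    def __init__(self) -> None:
        self.offset = 0  # next log position to read
        self.counts: Dict[str, int] = {}
        self._checkpoint: Tuple[int, Dict[str, int]] = (0, {})

    def process(self, log: List[str], upto: int) -> None:
        """Consume log entries (ad ids) from the current offset up to `upto`."""
        while self.offset < upto:
            ad_id = log[self.offset]
            self.counts[ad_id] = self.counts.get(ad_id, 0) + 1
            self.offset += 1

    def checkpoint(self) -> None:
        """Snapshot the offset and the counts as one consistent unit."""
        self._checkpoint = (self.offset, copy.deepcopy(self.counts))

    def recover(self) -> None:
        """Simulate a crash: roll back to the last snapshot."""
        offset, counts = self._checkpoint
        self.offset = offset
        self.counts = copy.deepcopy(counts)
```

Because the offset and the counts are restored together, events read after the last checkpoint are replayed against state that has not yet seen them. The replay repeats work, but the final counts match a run with no failure.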

4. Scaling the Write Volume

The biggest bottleneck is the write volume to the database.

Pre-aggregation: Never write every single click to the serving database. Aggregate clicks in memory (in Flink state) and write only the per-window summary (e.g., "Ad 123 got 500 clicks in the last minute") to the database once.
Sharding: Shard the database by ad_id to distribute the aggregation load across multiple nodes.
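
A minimal sketch of both ideas, assuming a fixed shard count of 8 and hypothetical names (PreAggregator, shard_for). A stable hash from hashlib is used because Python's built-in hash() for strings varies between processes.

```python
import hashlib
from collections import Counter
from typing import List, Tuple

NUM_SHARDS = 8  # assumed shard count for illustration


def shard_for(ad_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an ad_id to a shard using a hash that is stable across processes."""
    digest = hashlib.sha256(ad_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class PreAggregator:
    """Buffers click counts in memory and emits one summary row per ad on flush."""

    def __init__(self) -> None:
        self._buffer: Counter = Counter()

    def add(self, ad_id: str) -> None:
        self._buffer[ad_id] += 1

    def flush(self) -> List[Tuple[int, str, int]]:
        """Return (shard, ad_id, count) rows and reset the buffer."""
        rows = [
            (shard_for(ad_id), ad_id, count)
            for ad_id, count in sorted(self._buffer.items())
        ]
        self._buffer.clear()
        return rows
```

With this shape, 500 clicks on one ad within a window become a single row routed to a single shard, so the database write rate depends on the number of active ads per window rather than the number of clicks.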

5. Fraud Detection

Ad fraud (bots clicking ads) is a major concern.

Real-time Filter: Use ML models or rule-based filters (e.g., "more than 10 clicks from same IP in 1 second") to flag and filter fraudulent clicks before they reach the billing layer.
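
The example rule above can be sketched as a sliding-window counter per IP. This is an illustrative single-process version with hypothetical names. It assumes timestamps arrive in non-decreasing order for each IP, which a real stream would not guarantee.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class IpRateRule:
    """Flags a click when its IP exceeds max_clicks within window_seconds."""

    def __init__(self, max_clicks: int = 10, window_seconds: float = 1.0) -> None:
        self.max_clicks = max_clicks
        self.window_seconds = window_seconds
        self._recent: Dict[str, Deque[float]] = defaultdict(deque)

    def is_suspicious(self, ip: str, ts: float) -> bool:
        """Record the click and report whether the IP is now over the limit."""
        timestamps = self._recent[ip]
        # Drop clicks that have fallen out of the sliding window.
        while timestamps and ts - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        timestamps.append(ts)
        return len(timestamps) > self.max_clicks
```

Flagged clicks would be routed away from the billing path. Since the raw events are also kept in S3, flagged traffic can be reviewed later.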

Summary

Engineering an ad click aggregator is a balance between write throughput and data integrity. By using Kafka for ingestion and a stateful stream processor like Flink for deduplication and pre-aggregation, you can build a system that processes billions of events per day, counts each click once for billing, and keeps advertiser dashboards within seconds of real time.

Written by

Sachin Sarawgi

Engineering Manager and backend engineer with 10+ years building distributed systems across fintech, enterprise SaaS, and startups. CodeSprintPro is where I write practical guides on system design, Java, Kafka, databases, AI infrastructure, and production reliability.
