# 3.1 Kafka Architecture
Deep dive into Kafka's distributed system architecture.
## Overview
Kafka is designed as a distributed system that spreads data across multiple brokers, ensuring high availability and fault tolerance. The architecture consists of three main components: producers, brokers, and consumers, coordinated by ZooKeeper (or, in newer versions, KRaft). Each component plays a key role in handling massive amounts of data.
### Basic Data Flow
- Producers send data to brokers
- Brokers replicate data to other brokers
- Consumers read data from brokers
This replication is crucial for fault tolerance - if one broker goes down, others can still provide access to the data.
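The flow above can be sketched as a minimal in-memory model (illustrative only; the `Broker` class, `produce`/`consume` helpers, and record values are hypothetical, not a real Kafka client API):

```python
class Broker:
    """A toy broker holding its local copy of the data."""
    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []

    def append(self, record):
        self.log.append(record)

def produce(record, leader, followers):
    # Producer sends data to a broker (the leader) ...
    leader.append(record)
    # ... and the leader replicates it to the other brokers.
    for follower in followers:
        follower.append(record)

def consume(broker):
    # Consumer reads data from a broker's log.
    return list(broker.log)

b1, b2, b3 = Broker(1), Broker(2), Broker(3)
produce("order-42", leader=b1, followers=[b2, b3])

# If Broker 1 goes down, replicas still provide access to the data:
print(consume(b2))  # ['order-42']
```

Because every broker holds a full copy of the record, losing any single broker leaves the data readable.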
## Partitions and Leaders
### Understanding Brokers and Partitions
Kafka brokers are the core of its distributed system. Each broker is responsible for managing partitions of topics, ensuring redundancy and fault tolerance across the system.
#### Single Partition Example
Consider a Kafka cluster with three brokers and two topics, each with one partition:
- Broker 1: Leads the single partition of the "Purchases" topic
- Broker 2: Leads the single partition of the "Notifications" topic
- Broker 3: Holds no partition assignments in this setup
- ZooKeeper (or a KRaft controller in newer Kafka versions): Coordinates the cluster and tracks broker and partition metadata
In this architecture:
- Purchases producers/consumers interact with Broker 1
- Notifications producers/consumers interact with Broker 2
- Each broker handles its designated topic independently
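With only one partition per topic, this routing amounts to a lookup from topic to leading broker. A sketch, using hypothetical broker names matching the example:

```python
# Hypothetical leader assignments from the single-partition example above.
leaders = {
    ("Purchases", 0): "broker-1",
    ("Notifications", 0): "broker-2",
}

def route(topic, partition=0):
    """Clients send to / fetch from whichever broker leads the partition."""
    return leaders[(topic, partition)]

print(route("Purchases"))      # broker-1
print(route("Notifications"))  # broker-2
```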
### Multiple Partitions and Scalability
Kafka divides topics into partitions to achieve parallelism and scalability. Each partition is assigned a:
- Leader broker: Handles all read and write requests
- Follower brokers: Replicate the leader's data for redundancy
#### Partition Example: User-Based Distribution
For a purchases topic split into two partitions, one possible key-based assignment is:
- Partition 0: Handles purchases from odd-numbered users (1, 3, 5...)
- Partition 1: Handles purchases from even-numbered users (2, 4, 6...)
This approach maintains message order within each partition while distributing the workload across multiple brokers, allowing concurrent processing. (In practice, Kafka's default partitioner assigns records to partitions by hashing the record key; the odd/even split here is just an illustration.)
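The odd/even split can be expressed as a custom partitioner function; this is a sketch of the example mapping, not Kafka's default behavior:

```python
def user_partitioner(user_id: int, num_partitions: int = 2) -> int:
    # Odd user IDs -> partition 0, even user IDs -> partition 1,
    # mirroring the user-based distribution example above.
    return 0 if user_id % 2 == 1 else 1

# Users 1-6 map to alternating partitions, preserving per-partition order.
assignments = {uid: user_partitioner(uid) for uid in range(1, 7)}
print(assignments)  # {1: 0, 2: 1, 3: 0, 4: 1, 5: 0, 6: 1}
```

Any deterministic function of the record key works, as long as the same key always lands on the same partition, which is what preserves per-key ordering.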
#### Multi-Partition Cluster Configuration
In a cluster with multiple partitions:
Broker 1:
- Partition 1 Leader
- Partition 2 Follower
Broker 2:
- Partition 1 Follower
- Partition 2 Leader
Broker 3:
- No partitions assigned
#### Data Flow for Partitions
- Producers send data to the leader of each partition
- Leaders replicate data to their followers
- Consumers read from partition leaders by default, ensuring consistency
- Each partition operates independently for parallel processing
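Combining the configuration and flow above into one sketch (the broker names and two-partition layout are assumptions carried over from the example):

```python
# Leader/follower layout from the example configuration above.
assignment = {
    1: {"leader": "broker-1", "followers": ["broker-2"]},  # Partition 1
    2: {"leader": "broker-2", "followers": ["broker-1"]},  # Partition 2
}

# Each broker keeps a separate log per partition it hosts.
logs = {b: {p: [] for p in assignment} for b in ("broker-1", "broker-2")}

def produce(partition, record):
    meta = assignment[partition]
    logs[meta["leader"]][partition].append(record)  # write goes to the leader
    for follower in meta["followers"]:              # leader replicates
        logs[follower][partition].append(record)

def consume(partition):
    # Reads go to the partition's leader.
    return logs[assignment[partition]["leader"]][partition]

produce(1, "record-a")  # handled by broker-1
produce(2, "record-b")  # handled by broker-2, concurrently
print(consume(1), consume(2))  # ['record-a'] ['record-b']
```

Because the two partitions have different leaders, the two writes land on different brokers and can proceed in parallel.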
## Replication and Fault Tolerance
To ensure data durability, Kafka uses replication. Each partition's leader replicates data to its followers, which can take over if the leader fails.
### Failover Process
#### Normal Operation
- Broker 1 is active, serving as leader for Partition 1
- Broker 2 maintains a follower replica for fault tolerance
- Producer sends data to the leader on Broker 1
- Consumer reads from the same leader
#### When Broker 1 Fails
- Failure Detection: The cluster controller detects that Broker 1 is down
- Leader Election: The controller (coordinated through ZooKeeper or KRaft) promotes Broker 2's in-sync follower to leader
- Traffic Redirection: Producers and consumers refresh their metadata and interact with the new leader on Broker 2
- Continuity: Service resumes with minimal interruption
This automatic failover ensures:
- High availability
- Data consistency
- No loss of acknowledged data (when producers use acks=all and enough replicas stay in sync)
- Seamless recovery
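The failover steps above can be simulated in-process. Real Kafka delegates failure detection and election to the controller (via ZooKeeper or KRaft), so this is only a sketch of the outcome, with hypothetical broker names:

```python
partition_state = {"leader": "broker-1", "followers": ["broker-2"]}
alive = {"broker-1": True, "broker-2": True}

def elect_new_leader():
    # Promote the first live follower (greatly simplified election).
    for follower in partition_state["followers"]:
        if alive[follower]:
            partition_state["followers"].remove(follower)
            partition_state["leader"] = follower
            return
    raise RuntimeError("no live replica available")

def current_leader():
    # Clients resolve the current leader before producing or consuming.
    if not alive[partition_state["leader"]]:
        elect_new_leader()     # failure detected -> leader election
    return partition_state["leader"]

print(current_leader())    # broker-1 (normal operation)
alive["broker-1"] = False  # Broker 1 fails
print(current_leader())    # broker-2 (follower promoted, traffic redirected)
```

The key property is that clients never hard-code a leader: they re-resolve it, so promotion of a replica is all that is needed to restore service.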
### Multi-Partition Failover
When a broker fails in a multi-partition setup:
Before Failure:
- Partition 1 Leader on Broker 1
- Partition 2 Leader on Broker 2
- Followers on alternate brokers
After Broker 1 Failure:
- Partition 1 follower on Broker 2 becomes new leader
- Partition 2 continues normal operation
- Clients refresh their metadata and redirect to the new leaders automatically
- Disruption is limited to a brief leader-election window for the affected partitions
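A sketch of the same scenario, showing that only partitions led by the failed broker elect new leaders (broker names and the follower layout are assumptions from the example):

```python
# Per-partition leader/follower state before the failure.
state = {
    1: {"leader": "broker-1", "followers": ["broker-2"]},
    2: {"leader": "broker-2", "followers": ["broker-3"]},
}

def fail_broker(broker):
    for partition, meta in state.items():
        if meta["leader"] == broker:
            # Promote a follower only where the failed broker was leader.
            meta["leader"] = meta["followers"].pop(0)

fail_broker("broker-1")
print(state[1]["leader"])  # broker-2 (new leader for Partition 1)
print(state[2]["leader"])  # broker-2 (Partition 2 unchanged)
```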
## Key Architecture Benefits
1. High Availability
- Multiple replicas ensure data access even during failures
- Automatic failover maintains service continuity
2. Scalability
- Partitions enable parallel processing
- Workload distributes across brokers
- Easy to add more brokers for capacity
3. Fault Tolerance
- Data replication prevents data loss
- Leader election ensures business continuity
- Redundancy built into the architecture
4. Performance
- Parallel processing through partitions
- Load balancing across brokers
- Optimized for high throughput
## Summary
Kafka's distributed architecture provides:
- Producers that send data to partition leaders
- Brokers that manage partitions and replicate data
- Consumers that read from partition leaders
- Zookeeper/KRaft that coordinates the cluster
- Partitions that enable parallelism and scalability
- Replication that ensures fault tolerance and high availability
This design allows Kafka to handle massive data volumes while maintaining reliability, making it ideal for mission-critical streaming applications.