3.4 Real-Time vs Batch Processing
Understanding the differences between real-time and batch processing in Kafka.
Overview
This lesson explores two common patterns for interacting with Kafka: real-time processing and batch processing. Each has its strengths and challenges, particularly when dealing with retention windows and error handling.
Real-Time Processing
In real-time processing, producers send data to Kafka immediately as events occur, and consumers read this data almost simultaneously.
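As a rough illustration, here is a minimal sketch of that flow using the kafka-python client. The broker address (localhost:9092) and the "transactions" topic name are assumptions made for the example.

```python
# A minimal real-time flow with the kafka-python client.
# Assumptions: a broker at localhost:9092 and a topic named "transactions".
import json
from kafka import KafkaProducer, KafkaConsumer

# Producer: publish each event the moment it occurs.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("transactions", {"id": 1, "amount": 42.50})
producer.flush()  # block until the broker acknowledges the event

# Consumer: read events continuously, as soon as they arrive.
consumer = KafkaConsumer(
    "transactions",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="latest",   # real-time: only new events matter
)
for message in consumer:
    print(f"processing {message.value} immediately")
```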
Use Cases
- Fraud detection requiring immediate action
- Real-time analytics and monitoring
- Alert systems
- Live dashboards
- Transaction processing
Advantages
- Immediate data availability
- Quick response to events
- Lower latency
- Continuous data flow
Batch Processing
Batch processing involves collecting data over a specific interval and processing it all at once at a later time.
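Below is a minimal sketch of a batch-style consumer, again with kafka-python: it is meant to be run on a schedule (for example every 45 minutes), drain whatever has accumulated, then exit. The broker address, topic, and consumer group name are assumptions.

```python
# A minimal batch-style consumer with kafka-python: run it on a schedule (for
# example every 45 minutes via cron), drain whatever has accumulated, then exit.
# Assumptions: broker at localhost:9092, topic "transactions", group "batch-etl".
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "transactions",
    bootstrap_servers="localhost:9092",
    group_id="batch-etl",
    enable_auto_commit=False,            # commit only after the batch is processed
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

def process_batch(items):
    # Placeholder for the real bulk transformation / load step.
    print(f"processed {len(items)} records")

batch = []
while True:
    # The first poll may return nothing while the group is still rebalancing.
    records = consumer.poll(timeout_ms=5000)   # dict: TopicPartition -> [records]
    if not records:
        break                                  # nothing left in this run
    for partition_records in records.values():
        batch.extend(r.value for r in partition_records)

process_batch(batch)
consumer.commit()   # mark the whole batch as consumed
consumer.close()
```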
Key Considerations
- Balance batch size with processing time
- Ensure no data loss within retention window
- Optimize for throughput over latency
Use Cases
- ETL operations
- Report generation
- Bulk data transformations
- Scheduled analytics
Bad Data: A Common Challenge
Bad data is a universal challenge for both processing modes, but each handles it differently.
Real-Time Processing Error Handling
Characteristics:
- Data flows continuously from producer to consumer
- Immediate action possible on bad data
- Can retry, skip, or push to a Dead Letter Queue (DLQ), as sketched below
- Flexible error handling without heavily impacting future data flow
Challenge:
- If bad data isn't handled quickly, it can delay subsequent messages
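As a sketch, those three options can look roughly like this in a real-time consumer loop. The topic names, retry count, and the specific exceptions caught are assumptions for illustration.

```python
# Sketch of per-message error handling in a real-time consumer: skip records that
# can never be parsed, retry transient failures a bounded number of times, and
# dead-letter whatever still fails, so the stream keeps moving.
# Assumptions: topics "transactions" and "transactions.dlq", broker at localhost:9092.
import json
import logging
from kafka import KafkaConsumer, KafkaProducer

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("realtime-consumer")

consumer = KafkaConsumer("transactions", bootstrap_servers="localhost:9092")
dlq_producer = KafkaProducer(bootstrap_servers="localhost:9092")

def process(record):
    # Placeholder for real processing; may raise on transient failures.
    log.info("processed %s", record)

for message in consumer:
    try:
        record = json.loads(message.value.decode("utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        log.warning("skipping unparseable record at offset %d", message.offset)
        continue                                   # option: skip
    for attempt in range(3):                       # option: bounded retry
        try:
            process(record)
            break
        except Exception as exc:
            if attempt == 2:                       # option: push to the DLQ
                dlq_producer.send("transactions.dlq", message.value)
                log.error("dead-lettered offset %d: %s", message.offset, exc)
```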
Batch Processing Error Handling
Characteristics:
- Data accumulates over an interval before processing
- Error handling must fit within retention window
- Delayed processing can cause message expiration
Challenge:
- If retention is 1 hour and processing is delayed, messages might expire before being processed
Batch Processing Example: Payment System
Let's examine a concrete example with these parameters:
- Transaction Rate: 1 transaction per second (TPS)
- Retention Window: 1 hour
- Batch Interval: 45 minutes
Calculation
Producer Side (45-minute interval):
- Rate: 1 TPS
- Total transactions: 45 min × 60 sec × 1 TPS = 2,700 transactions
Consumer Side (15-minute processing window):
- Time available before the earliest messages expire: 60 min retention - 45 min batch interval = 15 minutes
- Transactions to process: 2,700
- Required rate: 2,700 ÷ (15 × 60 sec) = 3 TPS
| Time Interval | Rate | Transactions |
|---------------|------|--------------|
| Producer (0-45 mins) | 1 TPS | 2,700 |
| Consumer (45-60 mins) | 3 TPS | 2,700 |
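For reference, the same back-of-the-envelope arithmetic written out as a short script:

```python
# The same back-of-the-envelope numbers, computed explicitly.
produce_rate_tps = 1
batch_interval_s = 45 * 60                       # producer accumulates for 45 minutes
retention_s      = 60 * 60                       # messages expire after 1 hour

produced = produce_rate_tps * batch_interval_s   # 2,700 transactions
time_left_s = retention_s - batch_interval_s     # 900 s (15 minutes)
required_consumer_tps = produced / time_left_s   # 3.0 TPS

print(produced, time_left_s, required_consumer_tps)   # 2700 900 3.0
```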
Error Handling Challenges
Single Error Impact
When the consumer encounters bad data requiring 1 minute to handle:
- Original time available: 15 minutes (900 seconds)
- Time remaining after the error: 14 minutes (840 seconds)
- Remaining transactions: 2,699
- Required rate rises from 3 TPS to 2,699 ÷ 840 ≈ 3.2 TPS
Multiple Errors Impact
Assuming 5% bad data rate:
- Total transactions: 2,700
- Bad transactions: 2,700 × 0.05 = 135 transactions
- Error handling time (handled sequentially): 135 transactions × 1 minute = 135 minutes (2 hours 15 minutes)
Critical Problem:
- Error handling time (135 min) > Available window (15 min)
- Error handling time (135 min) > Retention window (60 min)
- Result: Data expires before it can be processed
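Written out as a script, assuming errors are handled one at a time at 1 minute each:

```python
# Extending the calculation with a 5% bad-data rate and 1 minute of handling per error.
produced             = 2_700
error_rate           = 0.05
handling_per_error_s = 60

bad = int(produced * error_rate)                 # 135 bad transactions
error_handling_s = bad * handling_per_error_s    # 8,100 s = 135 minutes

available_window_s = 15 * 60                     # 900 s left after the batch interval
retention_s        = 60 * 60                     # 3,600 s total retention

print(error_handling_s > available_window_s)     # True: exceeds the 15-minute window
print(error_handling_s > retention_s)            # True: exceeds the retention itself
```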
This highlights the importance of:
- Minimizing error rates
- Optimizing error handling strategies
- Proper data retention configuration
Solutions to Error Handling Challenges
Solution 1: Increase Data Retention
Approach:
- Extend retention from 1 hour to 2+ hours
- Provides more time to process all transactions, even with errors
Advantages:
- Most reliable for critical data
- Ensures no data loss
- Handles error spikes
Trade-offs:
- Requires more storage (increased costs)
- Larger on-disk logs can slow broker recovery and partition rebalancing when the cluster is busy
- Higher infrastructure requirements
Best for: Critical data where loss is unacceptable
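One way to size the new retention is to add up the batch interval, the healthy processing time, and the worst-case error-handling time from the example above, as in the sketch below (it assumes errors are handled sequentially at 1 minute each). The chosen value would then be applied through the topic's retention.ms configuration.

```python
# Rough sizing of the retention needed so the whole batch, errors included,
# can be processed before anything expires (sequential error handling assumed).
batch_interval_s     = 45 * 60      # accumulation phase: 2,700 s
healthy_processing_s = 2_700 / 3    # 2,700 transactions at 3 TPS = 900 s
error_handling_s     = 135 * 60     # 135 bad records x 1 minute each = 8,100 s

required_retention_s = batch_interval_s + healthy_processing_s + error_handling_s
print(required_retention_s / 3600)  # 3.25 hours under these worst-case assumptions
```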
Solution 2: Reduce Error Handling Time
Approach:
- Send bad transactions to Dead Letter Queue (DLQ)
- Consumer skips errors and processes healthy transactions first
- Failed data isolated for later analysis
Advantages:
- Consumer remains efficient
- Doesn't get stuck on errors
- Failed data available for debugging
- Smooth processing continues
Trade-offs:
- Additional infrastructure required
- DLQ needs monitoring
- Complexity in error recovery process
Best for: Systems with frequent but manageable errors
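A sketch of the DLQ pattern applied to the batch consumer follows. The DLQ topic name (transactions.dlq) and the metadata attached to each dead-lettered record are assumptions; the key idea is that the consumer never stalls on a bad record and the original payload is preserved for later analysis.

```python
# Sketch of the DLQ pattern inside the batch consumer: bad records are published to
# a separate topic with enough context to debug them later, and the batch keeps moving.
# Assumptions: DLQ topic "transactions.dlq", broker at localhost:9092.
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "transactions",
    bootstrap_servers="localhost:9092",
    group_id="batch-etl",
    enable_auto_commit=False,
)
dlq = KafkaProducer(bootstrap_servers="localhost:9092")

def process(record_bytes):
    record = json.loads(record_bytes.decode("utf-8"))  # raises on bad data
    # ... real processing of the healthy record here ...

records = consumer.poll(timeout_ms=5000)  # one poll's worth; a real job loops until drained
for tp, partition_records in records.items():
    for msg in partition_records:
        try:
            process(msg.value)
        except Exception as exc:
            # Preserve the original payload plus where it came from and why it failed.
            dlq.send("transactions.dlq", json.dumps({
                "source_topic": msg.topic,
                "partition": msg.partition,
                "offset": msg.offset,
                "error": str(exc),
                "payload": msg.value.decode("utf-8", errors="replace"),
            }).encode("utf-8"))

dlq.flush()
consumer.commit()  # healthy and dead-lettered records are both accounted for
consumer.close()
```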
Solution 3: Skip Bad Data
Approach:
- Configure consumer to log and skip bad data entirely
- Process only healthy messages
- No retry or DLQ
Advantages:
- Simplest implementation
- Keeps system running without interruptions
- No additional infrastructure
Trade-offs:
- Potential data loss
- Skipped data needs review later
- Operational overhead for investigation
Best for: Systems where:
- Errors are rare
- Some data loss is tolerable
- Simplicity is prioritized
Solution Comparison
| Solution | Processing Time | Storage Cost | Complexity | Data Loss Risk |
|----------|--------------|--------------|------------|----------------|
| Increase Retention | More | Higher | Low | None |
| Use DLQ | Normal | Normal | Higher | None |
| Skip Bad Data | Less | Lower | Low | Some |
Choosing the Right Solution
Consider these factors:
- Data criticality: How important is every transaction?
- Error frequency: How often do errors occur?
- Budget constraints: What are the storage costs?
- Operational capacity: Can you manage complex error handling?
- System priorities: Throughput vs. reliability vs. cost?
Decision Matrix
High-value, critical data:
- Solution 1 (Increase Retention) + Solution 2 (DLQ)
Moderate importance, manageable error rate:
- Solution 2 (DLQ)
Low criticality, rare errors:
- Solution 3 (Skip) with logging
Summary
Both real-time and batch processing have their place in Kafka architectures:
Real-Time Processing:
- Immediate action on data
- Flexible error handling
- Lower latency
- Ideal for time-sensitive operations
Batch Processing:
- Efficient for bulk operations
- Must carefully manage retention windows
- Error handling impacts processing time
- Requires thoughtful strategy for reliability
The key to successful batch processing is balancing:
- Batch size and frequency
- Retention configuration
- Error handling strategy
- Infrastructure costs
- Data criticality
By understanding these trade-offs, you can design a Kafka-based system that meets your specific requirements for performance, reliability, and cost.