Exactly-Once Semantics
At a Glance
| Aspect | Details |
|---|---|
| Topic | At-least/at-most/exactly-once, transactions, isolation levels |
| Complexity | Advanced |
| Prerequisites | Parts 8-10 (Consumer sections) |
| Time | 90 minutes |
| Spring Kafka | Transactional listeners, ChainedTransactionManager |
What You'll Learn
After completing this article, you will be able to:
- Distinguish between at-most-once, at-least-once, and exactly-once semantics
- Implement consume-transform-produce patterns with transactions
- Configure isolation levels for reading committed data only
- Build end-to-end exactly-once pipelines with Spring Kafka
- Understand when exactly-once is necessary vs overkill
Production Story: The Double-Charged Customers
The Incident
Our payment processing system had a subtle but devastating bug. Customers were being charged twice for single orders. The pattern was random - maybe 0.1% of transactions - but with 100,000 daily transactions, that meant 100 double charges per day. Customer trust was eroding fast.
The Investigation
The timeline of a double-charge:
- T=0: The payment processor reads the request for Order-123 ($100) from the payment-requests topic.
- T=1: It charges the customer's card: paymentGateway.charge() → SUCCESS ($100 charged).
- T=2: It sends the Order-123 SUCCESS result to the payment-results topic.
- T=3: NETWORK GLITCH! The producer send times out (but the message actually reached the broker).
- T=4: An exception is thrown, so there is no acknowledgment and the consumer offset is NOT committed.
- T=5: The consumer restarts from the last committed offset and reads Order-123 ($100) AGAIN.
- T=6: paymentGateway.charge() → SUCCESS ($100 charged AGAIN!).
Result: the customer is charged $200 for a $100 order, and payment-results contains two SUCCESS messages.
The Root Cause
The consume-transform-produce pattern has THREE operations that must be atomic:
- Process the input (charge customer)
- Write the output (payment result)
- Commit the input offset
Without transactions, these can fail independently, causing duplicates.
The Fix
After the fix: Zero double charges.
Mental Model: Delivery Semantics
| Aspect | At-most-once (fire and forget) | At-least-once (standard) | Exactly-once |
|---|---|---|---|
| Order of operations | commit() → process() | process() → commit() | Transaction { process() + commit() } |
| Producer | acks=0 (no acknowledgment) | acks=all, retries=MAX | transactional.id, enable.idempotence=true |
| Consumer | Commit BEFORE processing | Commit AFTER processing | isolation.level=read_committed |
| Messages [A] [B] [C] [D] [E] delivered as | A B - D E (C lost during failure) | A B C C D E (C delivered twice on retry) | A B C D E (each exactly once) |
| Duplicates | Never | Possible | None |
| Message loss | Possible | Never | None |
| Use case | Metrics, logs where some loss is acceptable | Most applications (with idempotent processing) | Financial, critical data pipelines |
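The difference between committing before and after processing can be reproduced without a broker. The following is a minimal in-memory sketch, not Kafka client code: the class name `DeliverySemanticsDemo` and its `run` method are hypothetical, and the "crash" is simulated by jumping back to the last committed offset.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory simulation of commit ordering; no Kafka client is involved. */
final class DeliverySemanticsDemo {

    /**
     * Replays the log through a consumer that crashes once while handling the
     * record at crashAt, then restarts from the last committed offset.
     * Returns the records whose processing completed, in order.
     */
    static List<String> run(List<String> log, int crashAt, boolean commitBeforeProcess) {
        List<String> processed = new ArrayList<>();
        int committed = 0;        // offset the consumer resumes from after a restart
        boolean crashed = false;  // the simulated crash happens only once
        int offset = 0;
        while (offset < log.size()) {
            boolean crashNow = !crashed && offset == crashAt;
            if (commitBeforeProcess) {
                committed = offset + 1;          // at-most-once: commit first
                if (crashNow) {                  // crash before processing
                    crashed = true;
                    offset = committed;          // restart skips the record
                    continue;
                }
                processed.add(log.get(offset));
            } else {
                processed.add(log.get(offset));  // at-least-once: process first
                if (crashNow) {                  // crash before the commit
                    crashed = true;
                    offset = committed;          // restart re-reads the record
                    continue;
                }
                committed = offset + 1;
            }
            offset++;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> log = List.of("A", "B", "C", "D", "E");
        System.out.println("at-most-once:  " + run(log, 2, true));
        System.out.println("at-least-once: " + run(log, 2, false));
    }
}
```

With the crash placed at record C, the first run drops C and the second processes it twice, which matches the two "delivered as" rows in the comparison.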
Transaction Flow
Exactly-once transaction flow between the consumer, Kafka, and the producer:
1. Poll messages: the consumer fetches records from input-topic.
2. Begin transaction: the producer calls beginTransaction().
3. Process and produce output: each message is processed and the result is sent with send() inside the transaction. The records land in output-topic but are still uncommitted.
4. Send offsets to the transaction: the consumer's offsets are handed to the producer through sendOffsetsToTransaction() and written to __consumer_offsets, also uncommitted.
5. Commit transaction (atomic): commit() atomically commits both the output-topic records and the __consumer_offsets entries.
Everything becomes visible to downstream consumers simultaneously.
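The property that matters in steps 2 to 5 is that output records and the input offset become visible together or not at all. The sketch below models only that property in memory. It is not the Kafka client: `AtomicCommitSketch` is a hypothetical class, and its method names are borrowed from `KafkaProducer` (`beginTransaction`, `send`, `sendOffsetsToTransaction`, `commitTransaction`, `abortTransaction`) purely to make the mapping easy to follow.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory model of the transaction flow; it is NOT the Kafka client. */
final class AtomicCommitSketch {
    private final List<String> committedOutput = new ArrayList<>(); // visible output-topic records
    private long committedOffset = 0;                                // visible consumer offset
    private final List<String> pendingOutput = new ArrayList<>();   // written but uncommitted
    private long pendingOffset = 0;
    private boolean inTransaction = false;

    void beginTransaction() {
        if (inTransaction) throw new IllegalStateException("transaction already open");
        inTransaction = true;
        pendingOutput.clear();
        pendingOffset = committedOffset;
    }

    void send(String record) {
        requireOpen();
        pendingOutput.add(record);
    }

    void sendOffsetsToTransaction(long nextOffset) {
        requireOpen();
        pendingOffset = nextOffset;
    }

    /** Output records and the input offset become visible together. */
    void commitTransaction() {
        requireOpen();
        committedOutput.addAll(pendingOutput);
        committedOffset = pendingOffset;
        inTransaction = false;
    }

    /** Neither the output records nor the offset become visible. */
    void abortTransaction() {
        requireOpen();
        pendingOutput.clear();
        inTransaction = false;
    }

    List<String> visibleOutput() { return List.copyOf(committedOutput); }

    long committedOffset() { return committedOffset; }

    private void requireOpen() {
        if (!inTransaction) throw new IllegalStateException("no open transaction");
    }

    public static void main(String[] args) {
        AtomicCommitSketch tx = new AtomicCommitSketch();
        tx.beginTransaction();
        tx.send("Order-123 SUCCESS");
        tx.sendOffsetsToTransaction(1);
        tx.abortTransaction();  // simulated failure: nothing becomes visible
        System.out.println(tx.visibleOutput() + " offset=" + tx.committedOffset());

        tx.beginTransaction();  // retry of the same input record
        tx.send("Order-123 SUCCESS");
        tx.sendOffsetsToTransaction(1);
        tx.commitTransaction();
        System.out.println(tx.visibleOutput() + " offset=" + tx.committedOffset());
    }
}
```

Because the aborted attempt leaves both the output and the offset untouched, the retry produces exactly one visible result record.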
Deep Dive
1. Understanding Isolation Levels
Isolation Level Behavior
Isolation level impact. Partition state:
| Offset | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| State | C | C | T1 | T1 | C | T2 | T2 | C | C | N |
Legend: C = committed (non-transactional or committed tx), T1 = transaction 1 (uncommitted), T2 = transaction 2 (uncommitted), N = non-transactional.
- Last Stable Offset (LSO) = 2 (the first offset of the earliest uncommitted transaction)
- High Water Mark (HWM) = 10
A read_uncommitted consumer sees offsets 0 through 9, that is, everything including uncommitted transactions. Problem: if T1 aborts, the consumer has already processed those messages!
A read_committed consumer sees only offsets 0 and 1. It stops at the LSO and waits for T1 to commit or abort. After T1 commits, it can read offsets 2, 3, and 4, but it is still blocked by T2 at offset 5.
Key point: read_committed may have higher latency, because it waits for transactions to complete.
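The Last Stable Offset rule can be worked through on the same partition state. This is a simplified sketch under stated assumptions: `IsolationLevelDemo` and its methods are hypothetical names, the partition is modelled as an array of state labels, and aborted transactions (whose records a real consumer filters out) are not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of the LSO; aborted transactions are deliberately left out. */
final class IsolationLevelDemo {

    /** Offset of the first record that belongs to a still-open transaction, or the log end. */
    static int lastStableOffset(String[] states, Set<String> openTransactions) {
        for (int offset = 0; offset < states.length; offset++) {
            if (openTransactions.contains(states[offset])) {
                return offset;
            }
        }
        return states.length;
    }

    /** Offsets a read_committed consumer may fetch: everything below the LSO. */
    static List<Integer> readCommittedView(String[] states, Set<String> openTransactions) {
        int lso = lastStableOffset(states, openTransactions);
        List<Integer> visible = new ArrayList<>();
        for (int offset = 0; offset < lso; offset++) {
            visible.add(offset);
        }
        return visible;
    }

    public static void main(String[] args) {
        String[] states = {"C", "C", "T1", "T1", "C", "T2", "T2", "C", "C", "N"};
        System.out.println("T1 and T2 open: " + readCommittedView(states, Set.of("T1", "T2")));
        System.out.println("only T2 open:   " + readCommittedView(states, Set.of("T2")));
    }
}
```

While both transactions are open the view ends before offset 2. Once T1 commits, the view extends through offset 4 and then stops at T2's first record.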
2. Consume-Transform-Produce Pattern
3. External Side Effects
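A Kafka transaction cannot roll back a call to an external system such as a payment gateway, so a common defence is to make that call idempotent. The sketch below assumes a gateway that remembers results by an idempotency key (for example the order ID). `IdempotentCharger` is a hypothetical in-memory stand-in, not a real gateway client.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a payment gateway that deduplicates by idempotency key. */
final class IdempotentCharger {
    private final Map<String, String> resultsByKey = new ConcurrentHashMap<>();
    private final AtomicInteger chargesExecuted = new AtomicInteger();

    /** Charges at most once per key; a repeated call returns the stored result. */
    String charge(String idempotencyKey, long amountCents) {
        return resultsByKey.computeIfAbsent(idempotencyKey, key -> {
            chargesExecuted.incrementAndGet();  // the real side effect would happen here
            return "SUCCESS:" + amountCents;
        });
    }

    int chargesExecuted() {
        return chargesExecuted.get();
    }

    public static void main(String[] args) {
        IdempotentCharger gateway = new IdempotentCharger();
        gateway.charge("Order-123", 10_000);  // first delivery
        gateway.charge("Order-123", 10_000);  // redelivery after a failed commit
        System.out.println("charges executed: " + gateway.chargesExecuted());
    }
}
```

With this in place, a redelivered payment request still reaches the gateway, but the customer's card is only charged on the first call.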
4. Chained Transaction Managers
Chained Transaction Caveats
Chained transaction behavior (transactions start in the configured order and complete in reverse order):
- Start: 1. Begin DB transaction, 2. Begin Kafka transaction
- Commit: 1. Commit Kafka transaction FIRST, 2. Commit DB transaction
- Rollback: 1. Roll back Kafka transaction, 2. Roll back DB transaction
Problem scenario:
1. Begin DB tx + Kafka tx
2. Do DB work
3. Do Kafka work
4. Commit Kafka tx: SUCCESS
5. Commit DB tx: FAILS (constraint violation)
6. Roll back DB tx
7. Kafka has already committed! The two systems are now INCONSISTENT.
This is "pseudo-transactional", not a true two-phase commit. For true consistency, use one of:
- Outbox pattern (reliable)
- Saga pattern (eventual consistency)
- Accept occasional inconsistency with reconciliation
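The outbox pattern named above avoids the split commit by writing the event into the same database transaction as the business row and publishing it afterwards. The following is a minimal in-memory sketch of that idea. `OutboxSketch`, its lists, and the `OrderPlaced:` payload format are hypothetical. A real implementation would use database tables and a Kafka producer in the relay.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory illustration of the outbox pattern; lists stand in for tables and a topic. */
final class OutboxSketch {
    static final class OutboxEntry {
        final String payload;
        boolean published;
        OutboxEntry(String payload) { this.payload = payload; }
    }

    private final List<String> orders = new ArrayList<>();       // business table
    private final List<OutboxEntry> outbox = new ArrayList<>();  // outbox table
    private final List<String> topic = new ArrayList<>();        // stand-in for a Kafka topic

    /** Models one DB transaction: the order row and the outbox row are written together. */
    void placeOrder(String orderId) {
        if (orderId == null || orderId.isBlank()) {
            throw new IllegalArgumentException("orderId required");  // fails before either write
        }
        orders.add(orderId);
        outbox.add(new OutboxEntry("OrderPlaced:" + orderId));
    }

    /** Relay step: publish unpublished entries, then mark them. Returns how many were sent. */
    int relay() {
        int sent = 0;
        for (OutboxEntry entry : outbox) {
            if (!entry.published) {
                topic.add(entry.payload);
                entry.published = true;
                sent++;
            }
        }
        return sent;
    }

    List<String> topic() { return List.copyOf(topic); }

    public static void main(String[] args) {
        OutboxSketch db = new OutboxSketch();
        db.placeOrder("o1");
        System.out.println("first relay sent:  " + db.relay());
        System.out.println("second relay sent: " + db.relay());
    }
}
```

In a real relay, a crash between publishing and marking an entry would cause it to be published again, so the pattern gives at-least-once delivery and downstream consumers still need to tolerate duplicates.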
5. Performance Considerations
6. When NOT to Use Exactly-Once
Decision Framework
Do you need exactly-once?
- Is the data critical (financial, regulatory)?
  - YES: Can processing be made idempotent?
    - YES: At-least-once + idempotent processing
    - NO: Exactly-once
  - NO: Can duplicates be tolerated?
    - YES: At-least-once (simple)
    - NO: Exactly-once
Rules of thumb:
- Most applications: at-least-once + idempotent processing
- Kafka Streams: exactly-once built in
- Financial/critical: consider exactly-once
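The decision framework is small enough to encode directly, which can be handy in design reviews or architecture tests. This is only one possible encoding. The class `SemanticsChooser` and its enum are hypothetical names.

```java
/** One possible encoding of the decision framework; names are illustrative. */
final class SemanticsChooser {
    enum Choice { AT_LEAST_ONCE, AT_LEAST_ONCE_IDEMPOTENT, EXACTLY_ONCE }

    static Choice choose(boolean dataIsCritical, boolean canBeIdempotent, boolean duplicatesTolerated) {
        if (dataIsCritical) {
            // Critical data: prefer idempotent processing when it is achievable.
            return canBeIdempotent ? Choice.AT_LEAST_ONCE_IDEMPOTENT : Choice.EXACTLY_ONCE;
        }
        // Non-critical data: plain at-least-once unless duplicates are unacceptable.
        return duplicatesTolerated ? Choice.AT_LEAST_ONCE : Choice.EXACTLY_ONCE;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true, false));   // critical, idempotent
        System.out.println(choose(true, false, false));  // critical, not idempotent
        System.out.println(choose(false, false, true));  // non-critical, duplicates fine
    }
}
```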
Common Mistakes
Mistake 1: Mixing Isolation Levels
Mistake 2: Non-Unique Transaction IDs in Cluster
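Kafka uses the transactional.id to fence older producers: when a producer initializes transactions with an ID that is already in use, the epoch for that ID is bumped and the earlier producer is rejected on its next transactional operation. The toy model below illustrates why two live instances must not share an ID. `FencingSketch`, the `register`/`mayWrite` methods, and the `app-instance` naming scheme are all hypothetical, and the real broker protocol is considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of producer fencing by transactional.id; not the broker protocol. */
final class FencingSketch {
    private final Map<String, Integer> latestEpoch = new HashMap<>();

    /** Models initTransactions(): registering an id bumps its epoch and returns it. */
    int register(String transactionalId) {
        return latestEpoch.merge(transactionalId, 1, Integer::sum);
    }

    /** A producer may write only while it holds the latest epoch for its id. */
    boolean mayWrite(String transactionalId, int epoch) {
        return latestEpoch.getOrDefault(transactionalId, 0) == epoch;
    }

    /** Hypothetical naming scheme: one id per application instance. */
    static String transactionalId(String application, String instanceId) {
        return application + "-" + instanceId;
    }

    public static void main(String[] args) {
        FencingSketch coordinator = new FencingSketch();

        // Mistake: two instances share one id, so the second fences the first.
        int epochA = coordinator.register("payment-processor");
        int epochB = coordinator.register("payment-processor");
        System.out.println("shared id, instance A may write: " + coordinator.mayWrite("payment-processor", epochA));
        System.out.println("shared id, instance B may write: " + coordinator.mayWrite("payment-processor", epochB));

        // Unique ids: both instances keep working.
        String idA = transactionalId("payment-processor", "0");
        String idB = transactionalId("payment-processor", "1");
        System.out.println("unique ids: " + coordinator.mayWrite(idA, coordinator.register(idA))
                + ", " + coordinator.mayWrite(idB, coordinator.register(idB)));
    }
}
```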
Mistake 3: External Side Effects in Transaction
Mistake 4: Long Transaction Duration
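One way to keep transactions short is to cap how many records go into each one, so that a large poll is committed as several small transactions that each finish well inside transaction.timeout.ms. The helper below is a sketch of that chunking step only. `TransactionBatcher` and `split` are hypothetical names, and choosing the cap is left to measurement.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a polled batch into bounded chunks, one chunk per transaction. */
final class TransactionBatcher {

    static <T> List<List<T>> split(List<T> records, int maxPerTransaction) {
        if (maxPerTransaction <= 0) {
            throw new IllegalArgumentException("maxPerTransaction must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += maxPerTransaction) {
            int end = Math.min(start + maxPerTransaction, records.size());
            batches.add(new ArrayList<>(records.subList(start, end)));  // copy, not a view
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> polled = List.of("a", "b", "c", "d", "e");
        // Each inner list would be processed inside its own begin/commit pair.
        System.out.println(split(polled, 2));
    }
}
```

The trade-off is the one discussed under performance: smaller transactions fail faster and block read_committed consumers for less time, but they amortize the per-transaction overhead over fewer messages.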
Mistake 5: Assuming Exactly-Once Means No Duplicates Ever
Debug This
Scenario: Transactions Timing Out
Symptoms:
- TransactionAbortedException in logs
- Messages not appearing in output topic
- Consumer offset not advancing
Likely causes:
- Processing too slow: operations exceed transaction.timeout.ms
- Producer fenced: another producer is using the same transactional.id
- Broker unavailable: the transaction coordinator can't be reached
- Memory pressure: GC pauses during the transaction
Exercises
Exercise 1: Implement Word Count with Exactly-Once
Create a Kafka Streams-style word count:
- Read sentences from input topic
- Split into words
- Aggregate counts
- Write to output topic
- Use exactly-once semantics
Exercise 2: Payment Processor with Idempotency
Build a payment processor that:
- Consumes payment requests
- Calls external payment API (mock)
- Produces payment results
- Handles retries without double-charging
- Uses exactly-once for Kafka side
Exercise 3: Transaction Monitoring Dashboard
Create monitoring for transactional producers:
- Track transaction durations
- Count commits vs aborts
- Alert on approaching timeout
- Visualize transaction state
Exercise 4: Compare Delivery Semantics
Build three versions of the same processor:
- At-most-once (commit before process)
- At-least-once (commit after process)
- Exactly-once (transactional)
- Inject failures and compare behavior
Exercise 5: Outbox Pattern Implementation
Implement the outbox pattern:
- Write to DB and outbox table atomically
- Separate process reads outbox
- Publishes to Kafka
- Marks outbox entries as published
- Compare with direct transactional approach
Interview Questions
Q1: Explain the three delivery semantics in Kafka.
At-most-once:
- Messages may be lost but never duplicated
- Commit offset before processing
- If processing fails after the commit, the message is lost
- Use case: logs, metrics where some loss is acceptable
At-least-once:
- Messages are never lost but may be duplicated
- Commit offset after processing
- If a crash happens after processing but before the commit, the message is reprocessed
- Use case: most applications (with idempotent handling)
Exactly-once:
- Each message is processed exactly once
- Transactional produce and offset commit are atomic
- Requires: transactional.id, read_committed isolation
- Use case: financial, critical data pipelines
Most applications use at-least-once with idempotent processing because it's simpler and performs better than exactly-once.
Q2: How does Kafka achieve exactly-once semantics?
Idempotent producer:
- Assigns a Producer ID (PID) and sequence numbers
- Broker deduplicates retried messages
- Prevents duplicates from producer retries
Transactions:
- Group multiple operations atomically
- Producer begins a transaction, sends messages, then commits or aborts
- All messages become visible at commit, or none at abort
Offsets committed inside the transaction:
- Consumer offset is sent as part of the producer transaction
- sendOffsetsToTransaction() includes the offset commit
- Atomic: messages + offsets are committed together
Isolation level:
- read_committed consumers only see committed transactions
- Messages from uncommitted or aborted transactions are invisible
- Prevents processing messages that will be rolled back
Q3: What's the performance impact of exactly-once?
Transaction overhead:
- ~5-10ms per transaction for begin + commit
- Involves transaction coordinator communication
- Two-phase commit protocol
Batching amortizes the overhead (assuming ~5ms per transaction):
- 1 message/tx: 5ms overhead/msg
- 100 messages/tx: 0.05ms overhead/msg
- 1000 messages/tx: 0.005ms overhead/msg
Consumer latency with read_committed:
- Consumer is blocked until transactions commit
- Long-running transactions delay consumption
- May see higher end-to-end latency
Throughput:
- Non-transactional: ~1M messages/sec possible
- Transactional: ~100K-500K messages/sec typical
- Depends heavily on batch size and tx duration
Worth the cost for:
- Financial/compliance data
- Low-to-medium throughput
- Strong consistency requirement
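The batching figures in the Q3 answer follow from dividing a fixed per-transaction cost by the number of messages in the transaction. A small sketch of that arithmetic, assuming the 5ms figure quoted above (the class and method names are hypothetical):

```java
/** Amortized transaction overhead: fixed cost divided by messages per transaction. */
final class TransactionOverhead {

    static double perMessageMillis(double fixedOverheadMillis, int messagesPerTransaction) {
        if (messagesPerTransaction <= 0) {
            throw new IllegalArgumentException("messagesPerTransaction must be positive");
        }
        return fixedOverheadMillis / messagesPerTransaction;
    }

    public static void main(String[] args) {
        double fixedOverheadMillis = 5.0;  // assumed begin + commit cost per transaction
        for (int batchSize : new int[] {1, 100, 1000}) {
            System.out.println(batchSize + " messages/tx: "
                    + perMessageMillis(fixedOverheadMillis, batchSize) + " ms overhead/msg");
        }
    }
}
```

The model ignores everything except the fixed cost, so it gives a lower bound on overhead rather than a throughput prediction.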
Q4: How do you handle external side effects with exactly-once?
- Process at-least-once
- Run periodic reconciliation job
- Compare Kafka and external system state
Q5: When should you NOT use exactly-once semantics?
- Processing is naturally idempotent:
  - Cache updates (put(key, value))
  - Status updates (setStatus(COMPLETED))
  - Aggregations with unique keys
- Data is non-critical:
  - Metrics and monitoring
  - Analytics events
  - Log aggregation
- High throughput required:
  - 500K messages/sec or more
  - Sub-millisecond latency requirements
- External reconciliation exists:
  - Daily batch reconciliation
  - Source of truth elsewhere
  - Eventual consistency acceptable
- Duplicates handled downstream:
  - Database unique constraints
  - Deduplication service
  - Consumer-side filtering
Summary
Key Takeaways
- Three semantics: at-most-once (may lose), at-least-once (may duplicate), exactly-once (neither)
- Exactly-once requires: transactional producer + read_committed consumer + offset in transaction
- Consume-transform-produce is the classic exactly-once pattern
- External side effects can't be part of a Kafka transaction; use idempotency or the outbox pattern
- Transaction overhead is significant; batch messages to amortize it
- Isolation level must be read_committed for consumers in exactly-once pipelines
- Most applications should use at-least-once + idempotent processing
- Chained transactions are not true two-phase commit; understand the limitations
Quick Reference
Exactly-Once Configuration
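As a starting point, the settings named throughout this article can be collected into two `java.util.Properties` objects. This is a minimal sketch, not a complete client configuration: connection settings and serializers are omitted, `ExactlyOnceConfig` is a hypothetical helper, and the ID values in `main` are placeholders. The property keys themselves are standard Kafka client settings.

```java
import java.util.Properties;

/** Collects the exactly-once related settings; connection and serializer settings are omitted. */
final class ExactlyOnceConfig {

    static Properties producer(String transactionalId) {
        Properties props = new Properties();
        props.setProperty("transactional.id", transactionalId);  // must be unique per live instance
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        return props;
    }

    static Properties consumer(String groupId) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        props.setProperty("isolation.level", "read_committed");  // hide uncommitted and aborted records
        props.setProperty("enable.auto.commit", "false");        // offsets are committed via the transaction
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        System.out.println(producer("payment-processor-0"));
        System.out.println(consumer("payment-processors"));
    }
}
```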
Spring Kafka Exactly-Once
Delivery Semantics Summary
| Semantic | Producer Config | Consumer Config | Guarantee |
|---|---|---|---|
| At-most-once | acks=0 | Commit before process | May lose |
| At-least-once | acks=all, retries | Commit after process | May duplicate |
| Exactly-once | transactional.id | read_committed | Neither |
Series Navigation
| Previous | Current | Next |
|---|---|---|
| Part 10: Offset Management | Part 11: Exactly-Once | Part 12: Schema Registry |
Series Overview
- Part 0: How to Use This Series
- Parts 1-4: Fundamentals
- Parts 5-7: Producers
- Parts 8-11: Consumers (Internals, Groups, Offset Management, Exactly-Once)
- Parts 12-14: Operations
- Parts 15-17: Kafka Streams
- Parts 18-20: Patterns & Practices
- Part 21: Cheatsheet & Decision Guide