Cluster Coordination (KRaft vs ZooKeeper)
At a Glance
| Aspect | Details |
|---|---|
| Topic | Cluster coordination, metadata management, controller election |
| Complexity | Intermediate |
| Prerequisites | Parts 1-3 (Architecture, Partitions, Fault Tolerance) |
| Time | 90 minutes |
| Kafka Version | 3.6+ (KRaft production-ready since 3.3; ZooKeeper migration production-ready since 3.6) |
What You'll Learn
After completing this article, you will be able to:
- Explain why ZooKeeper is being removed from Kafka's architecture
- Describe KRaft's controller quorum and how it handles metadata
- Configure a KRaft-based Kafka cluster for production
- Plan migration from ZooKeeper to KRaft mode
- Troubleshoot controller election and metadata propagation issues
Production Story: The ZooKeeper Session Timeout Storm
The Incident
It was Black Friday, and our e-commerce platform was handling 5x normal traffic. At 2:47 PM, alerts started firing: "Consumer lag increasing across all topics." Within minutes, the entire Kafka cluster became unresponsive.
The Investigation
The cluster had 15 brokers, 200+ consumers, and 50+ producers - all maintaining ZooKeeper sessions. Under extreme load:
- GC pauses on ZooKeeper nodes exceeded session timeout
- Session expirations triggered mass reconnections
- Reconnection storm overwhelmed ZooKeeper
- Broker disconnections caused controller failover
- Cascading failures across the entire cluster
Timeline of Chaos:
- 14:47:00 - ZK node 1: Long GC pause (8 seconds)
- 14:47:08 - 500+ sessions expire simultaneously
- 14:47:09 - Reconnection storm begins
- 14:47:15 - ZK node 2 overwhelmed, stops responding
- 14:47:20 - Controller broker loses ZK session
- 14:47:21 - Controller election starts
- 14:47:45 - New controller elected, but ZK still struggling
- 14:48:00 - Brokers can't update metadata
- 14:48:30 - Producers start timing out
- 14:49:00 - Full cluster unavailability
The Root Cause
ZooKeeper's architecture wasn't designed for Kafka's scale:
┌──────────────────────────────────────────────────────┐
│                  ZooKeeper Cluster                   │
│     ┌──────────┐    ┌──────────┐    ┌──────────┐     │
│     │   ZK-1   │    │   ZK-2   │    │   ZK-3   │     │
│     │ (Leader) │◄──►│(Follower)│◄──►│(Follower)│     │
│     └────┬─────┘    └────┬─────┘    └────┬─────┘     │
│          │               │               │           │
└──────────┼───────────────┼───────────────┼───────────┘
           │               │               │
           ▼               ▼               ▼
┌─────────────────────────────────────────────┐
│          ALL connections go to ZK           │
│                                             │
│   15 Brokers × 1 connection     = 15        │
│   200 Consumers × 1 connection  = 200       │
│   50 Producers (old clients)    = 50        │
│   Controller                    = 1         │
│   ─────────────────────────────             │
│   Total: 266+ persistent connections        │
│   + All their watches and ephemeral nodes   │
└─────────────────────────────────────────────┘
The Fix (Short-term)
The Real Solution: KRaft Migration
We migrated to KRaft mode, eliminating ZooKeeper entirely. Result:
- No more session storms - clients don't connect to controllers
- Faster failover - controller election in milliseconds, not seconds
- Simplified operations - one system instead of two
- Better scalability - tested to millions of partitions
Mental Model: ZooKeeper vs KRaft Architecture
ZooKeeper Mode (Legacy)
┌─────────────────────────────────────────────────────────────┐
│                       ZOOKEEPER MODE                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────────────┐      ┌──────────────────────┐     │
│  │  ZooKeeper Cluster   │      │    Kafka Cluster     │     │
│  │ ┌────┐ ┌────┐ ┌────┐ │      │ ┌────┐ ┌────┐ ┌────┐ │     │
│  │ │ZK-1│ │ZK-2│ │ZK-3│ │      │ │ B1 │ │ B2 │ │ B3 │ │     │
│  │ └──┬─┘ └──┬─┘ └──┬─┘ │      │ │    │ │CTRL│ │    │ │     │
│  │    │      │      │   │      │ └──┬─┘ └──┬─┘ └──┬─┘ │     │
│  │    └──────┼──────┘   │      │    │      │      │   │     │
│  │           │          │      │    └──────┼──────┘   │     │
│  │           │          │      │           │          │     │
│  └───────────┼──────────┘      └───────────┼──────────┘     │
│              │                             │                │
│              └──────────────┬──────────────┘                │
│                             │                               │
│                       ZK Connection                         │
│                (All brokers connect to ZK)                  │
│                                                             │
│  Metadata stored in:    ZooKeeper znodes                     │
│  Controller election:   Via ZK ephemeral node                │
│  Broker registration:   ZK ephemeral nodes                   │
│  Config changes:        Written to ZK, brokers watch         │
│                                                             │
└─────────────────────────────────────────────────────────────┘
KRaft Mode (Modern)
┌─────────────────────────────────────────────────────────────┐
│                         KRAFT MODE                          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │             Kafka Cluster (Self-Managed)              │  │
│  │                                                       │  │
│  │     Controllers (Quorum)              Brokers         │  │
│  │   ┌──────────────────────┐       ┌───────────────┐    │  │
│  │   │ ┌────┐ ┌────┐ ┌────┐ │       │ ┌────┐ ┌────┐ │    │  │
│  │   │ │ C1 │ │ C2 │ │ C3 │ │       │ │ B1 │ │ B2 │ │    │  │
│  │   │ │ACT │ │FLWR│ │FLWR│ │       │ │    │ │    │ │    │  │
│  │   │ └──┬─┘ └──┬─┘ └──┬─┘ │       │ └──┬─┘ └──┬─┘ │    │  │
│  │   │    │      │      │   │       │    │      │   │    │  │
│  │   │    └──────┼──────┘   │       │    └───┬──┘   │    │  │
│  │   │           │          │       │        │      │    │  │
│  │   └───────────┼──────────┘       └────────┼──────┘    │  │
│  │               │                           │           │  │
│  │               └───────────────────────────┘           │  │
│  │                      Metadata Fetch                   │  │
│  │        (Brokers fetch from the active controller)     │  │
│  │                                                       │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  Metadata stored in:    __cluster_metadata topic (Raft log)  │
│  Controller election:   Raft consensus                       │
│  Broker registration:   Metadata records                     │
│  Config changes:        Replicated via Raft                  │
│                                                             │
│                    NO ZOOKEEPER NEEDED!                     │
└─────────────────────────────────────────────────────────────┘
Key Architectural Differences
| Aspect | ZooKeeper | KRaft |
|---|---|---|
| Metadata Storage | ZK znodes (external system) | __cluster_metadata (internal topic) |
| Controller Architecture | One active (others standby) | Quorum (3-5 nodes, Raft consensus) |
| Failover Time | Seconds to minutes (ZK session timeout) | Milliseconds (Raft heartbeat) |
| Scalability | ~200K partitions (ZK is bottleneck) | Millions of partitions |
| Client Connections | Clients → ZK (for old clients) | Clients → Brokers (no ZK contact) |
| Operational Complexity | Two systems (ZK + Kafka) | One system (Kafka only) |
Deep Dive
1. What ZooKeeper Did for Kafka
Before understanding KRaft, let's appreciate what ZooKeeper handled:
ZooKeeper's Responsibilities in Kafka:

1. CONTROLLER ELECTION
   /controller → {"brokerid": 2, "timestamp": ...}
   (Ephemeral node - disappears when broker dies)

2. BROKER REGISTRATION
   /brokers/ids/1 → {"host": "broker1", "port": 9092, ...}
   /brokers/ids/2 → {"host": "broker2", "port": 9092, ...}
   (Ephemeral nodes for liveness detection)

3. TOPIC CONFIGURATION
   /brokers/topics/orders → {"partitions": {"0": [1,2,3], ...}}
   /config/topics/orders → {"retention.ms": "604800000"}

4. PARTITION LEADERSHIP
   /brokers/topics/orders/partitions/0/state →
     {"leader": 1, "isr": [1,2,3], "controller_epoch": 5}

5. ACLs AND QUOTAS
   /kafka-acl/Topic/orders → [acl entries]
   /config/users/alice → {"producer_byte_rate": "1000000"}

6. CONSUMER GROUP OFFSETS (Legacy)
   /consumers/my-group/offsets/orders/0 → "12345"
   (Modern Kafka uses __consumer_offsets topic instead)
Problems with ZooKeeper Dependency
2. KRaft Architecture Deep Dive
KRaft (Kafka Raft) replaces ZooKeeper with a built-in consensus protocol:
┌───────────────────────────────────────────────────────────────┐
│                    KRAFT CONTROLLER QUORUM                    │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│   ┌─────────────┐      ┌─────────────┐      ┌─────────────┐   │
│   │ Controller 1│      │ Controller 2│      │ Controller 3│   │
│   │  (ACTIVE)   │      │  (FOLLOWER) │      │  (FOLLOWER) │   │
│   │             │      │             │      │             │   │
│   │ Raft Log:   │      │ Raft Log:   │      │ Raft Log:   │   │
│   │ ┌────────┐  │      │ ┌────────┐  │      │ ┌────────┐  │   │
│   │ │Record 1│  │      │ │Record 1│  │      │ │Record 1│  │   │
│   │ │Record 2│  │      │ │Record 2│  │      │ │Record 2│  │   │
│   │ │Record 3│  │      │ │Record 3│  │      │ │Record 3│  │   │
│   │ │  ...   │  │      │ │  ...   │  │      │ │  ...   │  │   │
│   │ └────────┘  │      │ └────────┘  │      │ └────────┘  │   │
│   │             │      │             │      │             │   │
│   │ In-Memory   │      │ In-Memory   │      │ In-Memory   │   │
│   │ Metadata    │      │ Metadata    │      │ Metadata    │   │
│   │ Cache       │      │ Cache       │      │ Cache       │   │
│   └──────┬──────┘      └──────┬──────┘      └──────┬──────┘   │
│          │                    │                    │          │
│          │            Raft Replication             │          │
│          └────────────────────┼────────────────────┘          │
│                               │                               │
│                               ▼                               │
│                   __cluster_metadata topic                    │
│               (The Raft log, single partition)                │
│                                                               │
└───────────────────────────────────────────────────────────────┘
Metadata Records in KRaft
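As a rough illustration of how the Raft log becomes the in-memory metadata cache, the sketch below replays a list of records into a topic-to-leader map. The class `MetadataReplaySketch` and the record types `TOPIC`, `PARTITION_LEADER`, and `REMOVE_TOPIC` are simplified stand-ins invented for this example; Kafka's real metadata records carry many more fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-ins for KRaft metadata records.
class MetadataReplaySketch {

    /** One log entry: a record type plus a few illustrative fields. */
    static final class LogRecord {
        final String type;   // "TOPIC", "PARTITION_LEADER" or "REMOVE_TOPIC" (made-up names)
        final String topic;
        final int partition; // unused for topic-level records
        final int leader;    // used only by PARTITION_LEADER

        LogRecord(String type, String topic, int partition, int leader) {
            this.type = type;
            this.topic = topic;
            this.partition = partition;
            this.leader = leader;
        }
    }

    /** Replays the log in order and returns topic -> (partition -> leader id). */
    static Map<String, Map<Integer, Integer>> replay(List<LogRecord> log) {
        Map<String, Map<Integer, Integer>> state = new LinkedHashMap<>();
        for (LogRecord r : log) {
            switch (r.type) {
                case "TOPIC":
                    state.putIfAbsent(r.topic, new LinkedHashMap<>());
                    break;
                case "PARTITION_LEADER":
                    // A later record for the same partition overrides the earlier one.
                    state.computeIfAbsent(r.topic, t -> new LinkedHashMap<>())
                         .put(r.partition, r.leader);
                    break;
                case "REMOVE_TOPIC":
                    state.remove(r.topic);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown record type: " + r.type);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        List<LogRecord> log = new ArrayList<>();
        log.add(new LogRecord("TOPIC", "orders", -1, -1));
        log.add(new LogRecord("PARTITION_LEADER", "orders", 0, 1));
        log.add(new LogRecord("PARTITION_LEADER", "orders", 0, 2)); // leader moves to broker 2
        System.out.println(replay(log)); // prints {orders={0=2}}
    }
}
```

Because state is derived only from the ordered log, any node that replays the same records ends up with the same view, which is the property the controller quorum relies on.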
3. Controller Quorum Mechanics
RAFT CONSENSUS IN KRAFT:

┌─────────────────────────────────────────────────────────────┐
│                       LEADER ELECTION                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Initial state: No leader                                │
│     ┌────┐     ┌────┐     ┌────┐                            │
│     │ C1 │     │ C2 │     │ C3 │    No candidate yet        │
│     └────┘     └────┘     └────┘                            │
│                                                             │
│  2. Election timeout triggers (randomized)                  │
│     ┌────┐     ┌────┐                                       │
│     │ C1 │────►│ C2 │               C1 times out first      │
│     │CAND│     └────┘               Requests votes          │
│     └──┬─┘     ┌────┐                                       │
│        └──────►│ C3 │                                       │
│                └────┘                                       │
│                                                             │
│  3. Votes granted (majority needed)                         │
│     ┌────┐          ┌────┐   ┌────┐                         │
│     │ C1 │◄──VOTE───│ C2 │   │ C3 │ C1 gets 2 votes         │
│     └────┘          └────┘   └────┘ (self + C2)             │
│                                                             │
│  4. Leader established                                      │
│     ┌────┐     ┌────┐     ┌────┐                            │
│     │ C1 │     │ C2 │     │ C3 │    C1 is leader            │
│     │LEAD│────►│FLWR│     │FLWR│    Sends heartbeats        │
│     └────┘     └────┘     └────┘                            │
│                                                             │
└─────────────────────────────────────────────────────────────┘

LOG REPLICATION:

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  Leader (C1)                     Followers (C2, C3)         │
│  ┌──────────────┐                ┌──────────────┐           │
│  │ Log:         │                │ Log:         │           │
│  │ [1] TopicA   │     ──────►    │ [1] TopicA   │           │
│  │ [2] Partition│     Append     │ [2] Partition│           │
│  │ [3] Config   │     Entries    │ [3] Config   │           │
│  │ [4] Leader   │     ──────►    │ [4] Leader   │           │
│  └──────────────┘                └──────────────┘           │
│                                                             │
│  Commit: Entry committed when majority acknowledges         │
│  [1] ✓ (3/3)   [2] ✓ (3/3)   [3] ✓ (2/3)   [4] ○ (1/3)      │
│                                                             │
└─────────────────────────────────────────────────────────────┘
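The commit rule in the log replication diagram can be sketched in a few lines. This is a minimal illustration, not Kafka's implementation: `RaftCommitSketch` and its method names are hypothetical, and it ignores details such as leader epochs.

```java
import java.util.Arrays;

class RaftCommitSketch {

    /** Smallest number of voters that forms a majority. */
    static int majority(int voters) {
        return voters / 2 + 1;
    }

    /**
     * Highest log offset that is present on a majority of voters.
     * lastOffsets holds each voter's last replicated offset, leader included.
     */
    static long committedOffset(long[] lastOffsets) {
        long[] sorted = lastOffsets.clone();
        Arrays.sort(sorted); // ascending
        // Counting down from the top, the majority-th value is held by at least a majority.
        return sorted[sorted.length - majority(sorted.length)];
    }

    public static void main(String[] args) {
        // As in the diagram: the leader holds entries 1..4, one follower 1..3, the other 1..2.
        long[] offsets = {4, 3, 2};
        System.out.println(committedOffset(offsets)); // prints 3
    }
}
```

Entry 3 is committed because two of three voters hold it, while entry 4 exists only on the leader and stays uncommitted.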
4. KRaft Configuration
Controller-Only Nodes
Broker-Only Nodes
Combined Mode (Development)
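The three layouts in this section differ mainly in process.roles and listeners. The sketch below builds the core settings for each role as `java.util.Properties`. The property keys are standard Kafka settings, but the class `KRaftRoleConfigSketch`, the host names, and the ports 9092/9093 are assumptions for illustration, and the plaintext listeners are a simplification that a production cluster would replace with TLS or SASL.

```java
import java.util.Properties;

class KRaftRoleConfigSketch {

    /** Settings shared by every role. Host names and ports passed in are hypothetical. */
    static Properties base(int nodeId, String roles, String voters) {
        Properties p = new Properties();
        p.setProperty("process.roles", roles);
        p.setProperty("node.id", Integer.toString(nodeId));
        p.setProperty("controller.quorum.voters", voters);
        p.setProperty("controller.listener.names", "CONTROLLER");
        // Plaintext only to keep the sketch short.
        p.setProperty("listener.security.protocol.map", "CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT");
        return p;
    }

    static Properties controllerOnly(int nodeId, String host, String voters) {
        Properties p = base(nodeId, "controller", voters);
        p.setProperty("listeners", "CONTROLLER://" + host + ":9093");
        return p;
    }

    static Properties brokerOnly(int nodeId, String host, String voters) {
        Properties p = base(nodeId, "broker", voters);
        p.setProperty("listeners", "PLAINTEXT://" + host + ":9092");
        p.setProperty("advertised.listeners", "PLAINTEXT://" + host + ":9092");
        p.setProperty("inter.broker.listener.name", "PLAINTEXT");
        return p;
    }

    static Properties combined(int nodeId, String host) {
        // Single-node development layout: the node is its own one-member quorum.
        Properties p = base(nodeId, "broker,controller", nodeId + "@" + host + ":9093");
        p.setProperty("listeners", "PLAINTEXT://" + host + ":9092,CONTROLLER://" + host + ":9093");
        p.setProperty("advertised.listeners", "PLAINTEXT://" + host + ":9092");
        p.setProperty("inter.broker.listener.name", "PLAINTEXT");
        return p;
    }

    public static void main(String[] args) {
        String voters = "1@controller1:9093,2@controller2:9093,3@controller3:9093";
        controllerOnly(1, "controller1", voters)
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Note that broker-only nodes still need controller.quorum.voters and controller.listener.names so they can locate the quorum, even though they do not open a controller listener themselves.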
5. Spring Kafka with KRaft
6. Admin Operations in KRaft Mode
7. Migration Path: ZooKeeper to KRaft
MIGRATION PHASES:

┌─────────────────────────────────────────────────────────────┐
│  Phase 1: PREPARATION                                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  • Upgrade to Kafka 3.6+ (migration production-ready)       │
│  • Ensure inter.broker.protocol.version = 3.6+              │
│  • Audit custom tooling for ZK dependencies                 │
│  • Plan controller node placement                           │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  Phase 2: DEPLOY CONTROLLERS                                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────┐  ┌─────────┐  ┌─────────┐                     │
│   │   ZK1   │  │   ZK2   │  │   ZK3   │  (Still active)     │
│   └─────────┘  └─────────┘  └─────────┘                     │
│                                                             │
│   ┌─────────┐  ┌─────────┐  ┌─────────┐                     │
│   │   C1    │  │   C2    │  │   C3    │  (New KRaft         │
│   │(standby)│  │(standby)│  │(standby)│  controllers)       │
│   └─────────┘  └─────────┘  └─────────┘                     │
│                                                             │
│   ┌─────────┐  ┌─────────┐  ┌─────────┐                     │
│   │ Broker1 │  │ Broker2 │  │ Broker3 │  (Using ZK)         │
│   └─────────┘  └─────────┘  └─────────┘                     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  Phase 3: MIGRATION MODE                                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Set zookeeper.metadata.migration.enable=true on brokers    │
│  and restart them; the active controller copies ZK metadata │
│                                                             │
│   ┌─────────┐           ┌─────────────────────┐             │
│   │   ZK    │  ──────►  │ __cluster_metadata  │             │
│   │ znodes  │   Copy    │      (KRaft)        │             │
│   └─────────┘           └─────────────────────┘             │
│                                                             │
│  Metadata migrated, both systems active temporarily         │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  Phase 4: DUAL-WRITE                                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Active KRaft controller writes metadata to KRaft and ZK    │
│                                                             │
│                    ┌────────────┐                           │
│                    │ Controller │                           │
│                    └──────┬─────┘                           │
│                           │                                 │
│                   ┌───────┴───────┐                         │
│                   ▼               ▼                         │
│              ┌─────────┐     ┌─────────┐                    │
│              │   ZK    │     │  KRaft  │                    │
│              └─────────┘     └─────────┘                    │
│                                                             │
│  Validate: Both have consistent state                       │
│  Then restart brokers one by one as KRaft brokers           │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  Phase 5: KRAFT ONLY                                        │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Remove migration and ZooKeeper settings from controllers,  │
│  then restart them to finalize                              │
│                                                             │
│   ┌─────────┐  ┌─────────┐  ┌─────────┐                     │
│   │   ZK1   │  │   ZK2   │  │   ZK3   │  (Shutdown)         │
│   │  STOP   │  │  STOP   │  │  STOP   │                     │
│   └─────────┘  └─────────┘  └─────────┘                     │
│                                                             │
│   ┌─────────┐  ┌─────────┐  ┌─────────┐                     │
│   │   C1    │  │   C2    │  │   C3    │  (Active)           │
│   │ ACTIVE  │  │ FOLLWR  │  │ FOLLWR  │                     │
│   └─────────┘  └─────────┘  └─────────┘                     │
│                                                             │
│  ZooKeeper decommissioned!                                  │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Migration Commands
8. Monitoring KRaft Controllers
Common Mistakes
Mistake 1: Running Insufficient Controllers
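The cost of running too few controllers follows from majority arithmetic. A minimal sketch, with the hypothetical class `QuorumSizingSketch`, that tabulates how many controller failures each quorum size survives:

```java
class QuorumSizingSketch {

    /** Votes needed to elect a leader or commit a record. */
    static int majority(int voters) {
        return voters / 2 + 1;
    }

    /** Controllers that can fail while a majority is still reachable. */
    static int toleratedFailures(int voters) {
        return voters - majority(voters);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " controllers: majority=" + majority(n)
                    + ", tolerated failures=" + toleratedFailures(n));
        }
    }
}
```

One or two controllers tolerate no failures at all, and four tolerate no more than three do, which is why three or five controllers are the usual choices.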
Mistake 2: Same node.id Across Nodes
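Every node in a KRaft cluster, controller or broker, needs its own node.id. A pre-deployment check along these lines can catch clashes early; the class `NodeIdCheckSketch` and the host names are hypothetical, and the input map is assumed to have been collected from each node's config file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class NodeIdCheckSketch {

    /** Returns each node.id claimed by more than one host, with the hosts that claim it. */
    static Map<Integer, List<String>> duplicates(Map<String, Integer> hostToNodeId) {
        Map<Integer, List<String>> byId = new TreeMap<>();
        for (Map.Entry<String, Integer> e : hostToNodeId.entrySet()) {
            byId.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // Keep only the ids that appear on two or more hosts.
        byId.values().removeIf(hosts -> hosts.size() < 2);
        return byId;
    }

    public static void main(String[] args) {
        Map<String, Integer> hosts = new LinkedHashMap<>();
        hosts.put("controller1", 1);
        hosts.put("controller2", 2);
        hosts.put("broker1", 2); // clashes with controller2
        System.out.println(duplicates(hosts)); // prints {2=[controller2, broker1]}
    }
}
```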
Mistake 3: Mixing ZK and KRaft Configurations
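Leftover ZooKeeper settings in a KRaft config are easy to miss in review. The sketch below flags a few suspicious combinations. `MixedModeLintSketch` and its warning texts are made up for this example, and the checks are heuristics rather than a reproduction of Kafka's own startup validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class MixedModeLintSketch {

    /** Returns human-readable warnings for settings that mix ZK-mode and KRaft-mode config. */
    static List<String> lint(Properties p) {
        List<String> warnings = new ArrayList<>();
        boolean kraft = p.getProperty("process.roles") != null;
        boolean migrating = Boolean.parseBoolean(
                p.getProperty("zookeeper.metadata.migration.enable", "false"));

        if (kraft && p.getProperty("zookeeper.connect") != null && !migrating) {
            warnings.add("zookeeper.connect is set alongside process.roles outside a migration");
        }
        String brokerId = p.getProperty("broker.id");
        String nodeId = p.getProperty("node.id");
        if (kraft && nodeId == null) {
            warnings.add("process.roles is set but node.id is missing");
        }
        if (kraft && brokerId != null && nodeId != null && !brokerId.equals(nodeId)) {
            warnings.add("broker.id and node.id disagree");
        }
        return warnings;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("process.roles", "broker");
        p.setProperty("node.id", "1");
        p.setProperty("zookeeper.connect", "zk1:2181"); // left over from ZK mode
        System.out.println(lint(p));
    }
}
```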
Mistake 4: Not Formatting Storage Before First Start
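In KRaft mode, each node's log directories must be formatted with kafka-storage.sh format (passing the cluster ID with -t and the node's config file with -c) before the node is started for the first time; a node with unformatted storage will not start.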
Mistake 5: Different Cluster IDs Across Nodes
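Formatting writes a meta.properties file into each log directory, and its cluster.id must be the same on every node. A minimal sketch of comparing those values, assuming the file contents have already been collected as text; `ClusterIdCheckSketch`, the node names, and the sample IDs are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

class ClusterIdCheckSketch {

    /** Extracts cluster.id from the text of a meta.properties file, or null if absent. */
    static String clusterId(String metaPropertiesText) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(metaPropertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p.getProperty("cluster.id");
    }

    /** Groups node names by cluster.id; more than one group means the nodes disagree. */
    static Map<String, List<String>> groupByClusterId(Map<String, String> nodeToMetaText) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, String> e : nodeToMetaText.entrySet()) {
            String id = clusterId(e.getValue());
            String key = (id == null) ? "<unformatted>" : id;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(e.getKey());
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, String> metas = new LinkedHashMap<>();
        metas.put("controller1", "version=1\ncluster.id=abc\nnode.id=1\n");
        metas.put("controller2", "version=1\ncluster.id=abc\nnode.id=2\n");
        metas.put("controller3", "version=1\ncluster.id=xyz\nnode.id=3\n"); // formatted separately
        System.out.println(groupByClusterId(metas)); // prints {abc=[controller1, controller2], xyz=[controller3]}
    }
}
```

The usual way to avoid the mismatch is to generate the ID once with kafka-storage.sh random-uuid and pass that same value when formatting every node.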
Debug This
Scenario: Controller Not Becoming Active
- All controllers show "FOLLOWER" state
- No active controller in cluster
- Brokers cannot register
- Admin operations timeout
Things to check:
- Verify network connectivity between all controllers
- Ensure all controllers have the same cluster.id
- Check that controller.quorum.voters is identical on all nodes
- Verify node.id matches the ID in controller.quorum.voters
- Check for port conflicts on controller listener port
- Review controller logs for specific error messages
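Two of these checks, identical controller.quorum.voters and a node.id that appears in it, are mechanical enough to script. A minimal sketch, assuming the `id@host:port` voter format; `QuorumVotersCheckSketch`, its method names, and the host names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class QuorumVotersCheckSketch {

    /** Parses "1@host1:9093,2@host2:9093" into node.id -> "host:port". */
    static Map<Integer, String> parseVoters(String voters) {
        Map<Integer, String> result = new LinkedHashMap<>();
        for (String entry : voters.split(",")) {
            String[] parts = entry.trim().split("@", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected id@host:port but got: " + entry);
            }
            int id = Integer.parseInt(parts[0]);
            if (result.put(id, parts[1]) != null) {
                throw new IllegalArgumentException("Duplicate voter id: " + id);
            }
        }
        return result;
    }

    /** Compares one controller's settings with the reference voter string; returns problems found. */
    static List<String> check(int nodeId, String voters, String referenceVoters) {
        List<String> problems = new ArrayList<>();
        if (!voters.equals(referenceVoters)) {
            problems.add("controller.quorum.voters differs from the reference value");
        }
        if (!parseVoters(voters).containsKey(nodeId)) {
            problems.add("node.id " + nodeId + " is not listed in controller.quorum.voters");
        }
        return problems;
    }

    public static void main(String[] args) {
        String reference = "1@c1:9093,2@c2:9093,3@c3:9093";
        System.out.println(check(4, reference, reference));
        // prints [node.id 4 is not listed in controller.quorum.voters]
    }
}
```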
Exercises
Exercise 1: Local KRaft Cluster
Set up a 3-controller, 3-broker KRaft cluster using Docker Compose.
Exercise 2: Controller Failover Test
With the cluster from Exercise 1:
- Identify the active controller
- Stop the active controller container
- Observe failover in logs
- Verify new leader is elected
- Restart the stopped controller
- Verify it rejoins as follower
Exercise 3: Quorum Monitoring
Write a Spring Boot application that:
- Connects to the KRaft cluster
- Periodically checks quorum status
- Alerts when:
- No active controller
- A voter is lagging
- Less than 3 voters available
Exercise 4: Metadata Inspection
Using the kafka-metadata-shell.sh tool (or kafka-dump-log.sh with --cluster-metadata-decoder):
- Dump the current metadata log
- Identify different record types
- Find the record for a specific topic
- Analyze metadata for partition assignments
Exercise 5: Migration Planning
Given a ZooKeeper-based cluster with:
- 5 brokers
- 3 ZooKeeper nodes
- 500 topics, 10,000 partitions
Create a detailed migration plan including:
- Hardware requirements for KRaft controllers
- Migration timeline with rollback points
- Validation steps at each phase
- Monitoring during migration
Interview Questions
Q1: Why is Kafka moving from ZooKeeper to KRaft?
Operational simplicity:
- One distributed system instead of two
- Single security model, monitoring stack, deployment process
- Fewer moving parts = fewer failure modes
Scalability:
- ZooKeeper becomes a bottleneck around 200K partitions (all metadata in memory)
- KRaft can handle millions of partitions
- Metadata changes propagate faster: brokers fetch an ordered log of changes from the active controller instead of waiting for per-broker update requests
Faster failover and recovery:
- ZK-based controller failover takes seconds (session timeout)
- KRaft failover takes milliseconds (Raft heartbeat)
- Brokers recover faster because they keep a local copy of the metadata log and fetch only the records they missed
Consistency:
- ZK mode had inconsistency windows during metadata propagation
- KRaft provides stronger consistency guarantees
- Single source of truth in the __cluster_metadata topic
Purpose-built design:
- Built-in consensus protocol designed for Kafka's needs
- Event-sourced metadata (can replay log to recover)
- Better support for metadata snapshots and compaction
Q2: How does controller election work in KRaft?
Election triggers:
- Leader heartbeat timeout (followers don't hear from leader)
- Initial cluster startup (no leader exists)
Election process:
- Follower increments its term and transitions to candidate
- Candidate votes for itself and requests votes from other voters
- Each voter grants at most one vote per term, to the first candidate whose log is at least as up to date as its own (first-come-first-served)
- Candidate becomes leader when it receives majority of votes
- New leader starts sending heartbeats to maintain leadership
Safety mechanisms:
- Randomized election timeout: Prevents split votes (candidates start elections at different times)
- Term numbers: Prevent stale leaders from causing confusion
- Majority requirement: Ensures only one leader per term
- Persistent vote: Voters remember who they voted for (survives restarts)
Timing and quorum:
- Typical election time: 100-500ms
- Requires majority of voters (2 of 3, 3 of 5, etc.)
- No split-brain because only one candidate can get majority
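The vote-granting rules above can be sketched as a small state machine. This follows the general Raft rules rather than Kafka's actual code: `VoterSketch` and its fields are hypothetical, and a real voter would persist its term and vote to disk before replying.

```java
class VoterSketch {
    private int currentTerm;
    private Integer votedFor;         // candidate voted for in currentTerm, or null
    private final int lastLogTerm;    // term of this voter's last log entry
    private final long lastLogOffset; // offset of this voter's last log entry

    VoterSketch(int currentTerm, int lastLogTerm, long lastLogOffset) {
        this.currentTerm = currentTerm;
        this.lastLogTerm = lastLogTerm;
        this.lastLogOffset = lastLogOffset;
    }

    /** Returns true if this voter grants its vote to the candidate. */
    boolean handleVoteRequest(int candidateId, int candidateTerm,
                              int candidateLastLogTerm, long candidateLastLogOffset) {
        if (candidateTerm < currentTerm) {
            return false; // stale candidate from an older term
        }
        if (candidateTerm > currentTerm) {
            currentTerm = candidateTerm; // newer term: forget the previous vote
            votedFor = null;
        }
        // The candidate's log must be at least as up to date as ours.
        boolean logUpToDate = candidateLastLogTerm > lastLogTerm
                || (candidateLastLogTerm == lastLogTerm && candidateLastLogOffset >= lastLogOffset);
        // One vote per term: first come, first served, repeats from the same candidate allowed.
        boolean canVote = votedFor == null || votedFor == candidateId;
        if (canVote && logUpToDate) {
            votedFor = candidateId;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        VoterSketch voter = new VoterSketch(5, 5, 10);
        System.out.println(voter.handleVoteRequest(1, 6, 5, 10)); // prints true
        System.out.println(voter.handleVoteRequest(2, 6, 5, 12)); // prints false (already voted in term 6)
    }
}
```

Since each voter hands out a single vote per term and a candidate needs a majority, two candidates cannot both win the same term.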
Q3: What happens to clients during a controller failover in KRaft?
Producers:
- Continue producing normally (producers talk to brokers, not controllers)
- May see brief retry if producing to partition that needs leader update
- Typically transparent (retries happen automatically)
Consumers:
- Continue consuming normally (consumers talk to brokers, not controllers)
- May see brief pause if fetching from partition needing leader update
- Offset commits unaffected (they go to __consumer_offsets on brokers)
Admin operations:
- Topic creation/deletion temporarily blocked during failover
- Config changes temporarily blocked
- Resume automatically once new controller is active
Why the impact is small:
- Clients only interact with brokers, never directly with controllers
- Brokers cache metadata locally (serve clients from cache)
- Controller failover is fast (milliseconds)
- Brokers automatically refresh metadata from new controller
Q4: What's the __cluster_metadata topic and how is it different from regular topics?
__cluster_metadata is a special internal topic that stores all cluster metadata in KRaft mode.
Key characteristics:
- Single partition (partition 0)
- Replicated across all controller nodes through Raft; brokers fetch a read-only copy as observers
- Uses Raft consensus for replication (not standard Kafka replication)
- Not accessible via normal producer/consumer APIs
What it stores:
- Broker registrations and fencing
- Topic and partition metadata
- Configuration changes
- ACLs and quotas
- Producer ID allocations
- Feature flags
| Aspect | Regular Topics | __cluster_metadata |
|---|---|---|
| Replication | ISR-based | Raft consensus |
| Producers | Any client | Only active controller |
| Consumers | Any client | Controllers and brokers (internal fetch only) |
| Storage | Broker data dirs | Metadata log dir on controllers and brokers |
| Compaction | Optional | Not key-compacted; periodic snapshots trim the log |
| Access | Public API | Internal only |
Event-sourced design:
- All changes are appended as records
- State can be reconstructed by replaying log
- Periodic snapshots for faster recovery
- Similar to event sourcing pattern in applications
Q5: How do you choose between combined mode and separate controller/broker roles?
Combined mode (process.roles=broker,controller) is best for:
- Development and testing environments
- Small clusters (3-5 nodes)
- Resource-constrained deployments
- Simpler operations
Drawbacks of combined mode:
- Controller and broker compete for resources
- GC pauses on broker affect controller
- Harder to scale controllers independently
Separate controller and broker roles are best for:
- Production environments
- Large clusters (10+ brokers)
- High-throughput workloads
- When controller stability is critical
Benefits of separate roles:
- Dedicated resources for controllers
- Controllers isolated from broker load
- Can scale brokers without touching controllers
- Predictable controller performance
| Cluster Size | Recommendation |
|---|---|
| 1-3 nodes | Combined mode (dev only) |
| 3-5 nodes | Combined or separate |
| 5-10 nodes | Separate recommended |
| 10+ nodes | Separate required |
Typical controller node sizing:
- CPU: Low (metadata operations are lightweight)
- Memory: 4-8GB (metadata in memory)
- Disk: SSD recommended (Raft log performance)
- Network: Low bandwidth, but low latency important
Summary
Key Takeaways
- ZooKeeper was Kafka's original coordination service but became a bottleneck and operational burden at scale
- KRaft replaces ZooKeeper with a built-in Raft-based consensus protocol, eliminating external dependencies
- Controller quorum uses Raft consensus with 3-5 controller nodes for fault tolerance (odd numbers recommended, because an even count adds no extra failure tolerance)
- The __cluster_metadata topic stores all cluster state as an event-sourced log, enabling fast recovery
- Migration from ZooKeeper is production-ready in Kafka 3.6+ and follows a well-defined path
- Failover is faster in KRaft (milliseconds vs seconds) because it uses Raft heartbeats instead of ZK sessions
- Clients are unaffected by controller failover because they only interact with brokers
- Spring Kafka requires no changes for KRaft - just point to broker bootstrap servers
Quick Reference
Essential KRaft Configuration
Key Commands
ZK vs KRaft Quick Comparison
| | ZooKeeper | KRaft |
|---|---|---|
| External dependency | Yes (3-5 ZK nodes) | No |
| Max partitions | ~200K | Millions |
| Controller failover | 5-30 seconds | <1 second |
| Metadata consistency | Eventual | Strong |
| Operational complexity | High | Low |
| Production ready | Yes (removed in Kafka 4.0) | Yes (new clusters since 3.3; migration since 3.6) |
Series Navigation
| Previous | Current | Next |
|---|---|---|
| Part 3: Fault Tolerance | Part 4: Cluster Coordination | Part 5: Producer Internals |
Series Overview
- Part 0: How to Use This Series
- Parts 1-4: Fundamentals (Architecture, Partitions, Fault Tolerance, Cluster Coordination)
- Parts 5-7: Producers
- Parts 8-11: Consumers
- Parts 12-14: Operations
- Parts 15-17: Kafka Streams
- Parts 18-20: Patterns & Practices
- Part 21: Cheatsheet & Decision Guide