Consistent Hashing
📋 Quick Reference
| Property | Value |
|---|---|
| Type | Distributed hashing technique |
| Key Lookup | O(log n) with binary search on ring |
| Add Server | O(K/n) keys remapped (K=total keys, n=servers) |
| Remove Server | O(K/n) keys remapped |
| Best For | Distributed caches, load balancers, sharding |
🎮 Interactive Visualizer
Watch how consistent hashing distributes keys and handles server changes:
Loading visualizer...
- Add keys and see which server they map to
- Add a new server - observe only nearby keys move
- Remove a server - see keys redistribute to next server
- Compare with modulo hashing (all keys would move!)
🔧 How It Works
The Problem with Modulo Hashing
Traditional: server = hash(key) % num_servers With 3 servers: key "user:1" → hash=7 → 7 % 3 = 1 → Server 1 key "user:2" → hash=12 → 12 % 3 = 0 → Server 0 Add 4th server (now % 4): key "user:1" → 7 % 4 = 3 → Server 3 ← MOVED! key "user:2" → 12 % 4 = 0 → Server 0 ← stayed Problem: ~75% of keys move when adding 1 server!
Consistent Hashing Solution
Hash Ring (0 to 2^32): 0° | S1(30°) \ --- S2(120°) / S3(240°) Keys map to next server clockwise: hash("user:1") = 50° → Server S2 (next after 50°) hash("user:2") = 200° → Server S3 (next after 200°) Add server S4 at 180°: Only keys between 120° and 180° move to S4 Other keys stay where they are!
Virtual Nodes
Problem: Uneven distribution with few servers Solution: Each server gets multiple positions (virtual nodes) Server A: positions at 30°, 150°, 270° Server B: positions at 90°, 210°, 330° Benefits: - More even key distribution - Smoother rebalancing when servers change - Can weight servers (more vnodes = more keys)
📊 Key Redistribution Comparison
| Scenario | Modulo Hash | Consistent Hash |
|---|---|---|
| Add 1 server (3→4) | ~75% keys move | ~25% keys move |
| Add 1 server (10→11) | ~91% keys move | ~9% keys move |
| Remove 1 server | Similar high % | Only affected keys |
Formula
Keys that move when adding server to n existing: - Modulo hashing: ~(n/(n+1)) × 100% of keys - Consistent hashing: ~(1/(n+1)) × 100% of keys
✅ When to Use Consistent Hashing
Good Use Cases
- Distributed caches - Memcached, Redis Cluster
- Load balancers - sticky sessions
- Database sharding - distribute data across nodes
- CDN routing - route requests to nearest server
- P2P networks - DHT (Distributed Hash Tables)
Avoid When
- Single server - no distribution needed
- Perfect balance required - may need other algorithms
- Data locality matters - geographic sharding may be better
- Simple round-robin suffices - adds unnecessary complexity
🔄 Consistent Hashing Variants
| Variant | Feature | Used By |
|---|---|---|
| Basic Ring | Simple clockwise lookup | Early Dynamo |
| Virtual Nodes | Better distribution | Cassandra, Riak |
| Jump Hash | No ring, O(1) memory | |
| Rendezvous Hash | Highest random weight | Many systems |
| Maglev | Minimal disruption, fast | Google LB |
🧩 Implementation Patterns
Java Implementation
JAVA(42 lines)CodeLoading syntax highlighter...
Usage Example
JAVA(16 lines)CodeLoading syntax highlighter...
Redis Cluster
Redis Cluster uses hash slots (not consistent hashing ring): - 16384 hash slots - Each key maps to a slot: CRC16(key) % 16384 - Each node owns a range of slots Resharding moves slots between nodes Similar benefit: only affected slots' keys move
Memcached Client
JAVA(8 lines)CodeLoading syntax highlighter...
⚠️ Common Pitfalls
1. Too Few Virtual Nodes
JAVA(5 lines)CodeLoading syntax highlighter...
2. Poor Hash Function
JAVA(7 lines)CodeLoading syntax highlighter...
3. Not Handling Server Failure
JAVA(13 lines)CodeLoading syntax highlighter...
4. Ignoring Hot Spots
JAVA(9 lines)CodeLoading syntax highlighter...
🎤 Interview Tips
"When servers are added/removed in a distributed system, modulo hashing remaps most keys. Consistent hashing minimizes this: only K/n keys move (K=total keys, n=servers) instead of nearly all keys.
"Both servers and keys are hashed to positions on a ring (0 to 2^32). A key is assigned to the first server found clockwise from its position. Adding a server only affects keys between it and the previous server.
"Virtual nodes are multiple positions on the ring for each physical server. They improve load distribution (a server with more vnodes handles more keys) and smoother rebalancing. Typically 100-200 vnodes per server.
"Amazon DynamoDB, Apache Cassandra, Memcached (Ketama), Discord, Akamai CDN. Redis Cluster uses a similar concept with hash slots instead of a ring.
"Replicate keys to multiple servers (next N servers on the ring). When one fails, requests go to the next replica. This is how DynamoDB and Cassandra achieve high availability.
📚 Series Navigation
ConsistentHashingVisualizer from @tomaszjarosz/react-visualizers