Consistent Hashing
See how distributed systems spread data across servers. This example shows a cache, but the same technique works for databases, load balancers, CDNs, etc.
[Interactive visualization: a "Key Distribution" hash ring with controls to add keys, adjust Nodes and Virtual Nodes, and live Keys / Nodes / Avg Keys/Node readouts.]
The Problem with Simple Hashing
The straightforward way to distribute data is modulo hashing: server = hash(key) % num_servers. This works until you add or remove a server.
Example: 3 servers → 4 servers
With 3 servers:
hash("user:1") % 3 = 2 → Server Chash("user:2") % 3 = 0 → Server Ahash("user:3") % 3 = 1 → Server BWith 4 servers:
hash("user:1") % 4 = 2 → Server C ✓hash("user:2") % 4 = 3 → Server D ✗hash("user:3") % 4 = 0 → Server A ✗Adding one server moved 2 out of 3 keys. At scale, ~75% of all cached data becomes invalid.
Consistent Hashing Solves This
| Scenario | Modulo Hash | Consistent Hash |
|---|---|---|
| Add 1 node to 10 | ~90% keys move | ~9% keys move |
| Remove 1 node from 10 | ~90% keys move | ~10% keys move |
| Double nodes (5 → 10) | ~50% keys move | ~50% keys move |

(Doubling is the one resize where modulo behaves: a key stays put exactly when hash % 10 < 5, so about half the keys move either way. For any other size change, modulo moves nearly everything.)
How the Hash Ring Works
Instead of using modulo, consistent hashing maps both servers and keys onto a circular space (the ring). Each gets a position based on its hash value.
- Hash servers to positions: Each server gets a position on the ring based on its identifier (e.g., IP address or name).
- Hash keys to positions: Each data key is also hashed to a position on the ring.
- Find the responsible server: Walk clockwise from the key's position until you hit a server. That server owns this key.
Try it above: add keys and watch them automatically find their server by moving clockwise.
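Those three steps translate almost directly into code. Here is a minimal ring sketch in TypeScript; the class shape and names are illustrative, and the hash function is injected so any string-to-position mapping works:

```ts
// A minimal hash ring: servers and keys share one hash space,
// and lookup walks clockwise to the next server position.
class HashRing {
  private ring: Array<[number, string]> = []; // [position, server], sorted by position

  constructor(private hash: (s: string) => number, servers: string[] = []) {
    for (const s of servers) this.addServer(s);
  }

  addServer(name: string): void {
    this.ring.push([this.hash(name), name]);
    this.ring.sort((a, b) => a[0] - b[0]); // keep sorted so lookup can scan clockwise
  }

  removeServer(name: string): void {
    this.ring = this.ring.filter(([, n]) => n !== name);
  }

  // First server at or past the key's position; wrap to the start if none.
  lookup(key: string): string {
    if (this.ring.length === 0) throw new Error("no servers on the ring");
    const pos = this.hash(key);
    const hit = this.ring.find(([p]) => p >= pos);
    return (hit ?? this.ring[0])[1];
  }
}
```

A linear scan is fine at this scale; real implementations keep the positions sorted and binary-search them, making each lookup O(log n) in the number of positions.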
Hash Functions in Practice
The hash function converts keys and server identifiers into positions on the ring. Different systems use different hash functions based on their needs.
This visualization
Uses a simple hash (a djb2 variant) that maps strings to ring positions 0–359. Fast and good enough for demonstration, but not cryptographically secure.
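For reference, a djb2-style hash mapped onto the 360-degree ring might look like this (the visualization's exact variant and constants are an assumption):

```ts
// A djb2 "xor" variant mapped onto the ring's 360 degrees.
function hashPosition(s: string): number {
  let h = 5381;
  for (let i = 0; i < s.length; i++) {
    h = ((h * 33) ^ s.charCodeAt(i)) >>> 0; // stay in unsigned 32-bit range
  }
  return h % 360; // position in degrees on the ring
}
```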
Production systems
- Ketama – Memcached's consistent hashing implementation, using MD5 with virtual nodes.
- MurmurHash – a fast non-cryptographic hash used by Cassandra for partitioning.
What Happens When Nodes Change
Node Added
The new node takes ownership of keys between itself and the previous node (counter-clockwise). Only those keys move. All other keys stay put.
Node Removed
Keys that belonged to the removed node move to the next node clockwise. All other keys stay with their current server.
Try it above: click a node to remove it. Watch which keys change color (move to a new server) and which stay the same.
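The same experiment in code, reusing the HashRing and hashPosition sketches from above (the server and key names are arbitrary):

```ts
// Remove a node and observe that only its keys remap.
const ring = new HashRing(hashPosition, ["server-A", "server-B", "server-C"]);
const sampleKeys = ["user:1", "user:2", "user:3", "user:4", "user:5", "user:6"];

const before = new Map(sampleKeys.map((k) => [k, ring.lookup(k)]));
ring.removeServer("server-B");

for (const k of sampleKeys) {
  const after = ring.lookup(k);
  if (after !== before.get(k)) {
    console.log(`${k} moved: ${before.get(k)} → ${after}`); // only B's keys print
  }
}
```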
Virtual Nodes
With few physical servers, distribution can be uneven. One server might end up owning a large arc of the ring while another owns a tiny slice. Virtual nodes solve this by giving each server multiple positions on the ring.
Without virtual nodes: Server A has one position and owns all keys in its arc, which might be large or small depending on where other servers landed.
With 3 virtual nodes: Server A gets three positions spread around the ring. Now it owns three smaller arcs that together are closer to a fair share.
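A common way to implement this is to hash the server's name with a replica index, so each server lands on the ring several times; Ketama does something similar with MD5. A sketch extending the ring idea above (the "#i" suffix scheme is an illustrative convention, not a standard):

```ts
// Virtual nodes: place each server on the ring `vnodes` times by hashing
// "name#0", "name#1", ... Each position still maps back to the physical server.
function addServerWithVnodes(
  ring: Array<[number, string]>,
  name: string,
  vnodes: number,
  hash: (s: string) => number
): void {
  for (let i = 0; i < vnodes; i++) {
    ring.push([hash(`${name}#${i}`), name]);
  }
  ring.sort((a, b) => a[0] - b[0]);
}
```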
Benefits
- More even key distribution
- Smoother rebalancing when nodes join/leave
- Can give powerful servers more virtual nodes (see the weighting sketch after these lists)
Trade-offs
- More positions to store in memory
- Slightly more lookup overhead
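Weighting falls out naturally: give bigger servers more virtual nodes. A sketch, with an assumed baseline replica count:

```ts
// Vnode count proportional to server capacity (baseline is illustrative).
const BASE_VNODES = 100; // replicas for a weight-1.0 server

function vnodeCount(weight: number): number {
  return Math.max(1, Math.round(BASE_VNODES * weight));
}

// e.g. a server with twice the capacity gets twice the ring coverage:
// addServerWithVnodes(ring, "big-server", vnodeCount(2.0), hashPosition);
```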
Toggle "Virtual Nodes" above to see how they spread each server across multiple positions.