Hot Key / Big Key - Detection & Mitigation
A hot key is a key that receives a very high read or write rate; a big key is a key whose value (or aggregate size in Hash/List/Set/ZSet) is very large. Both can cause high latency, uneven load, and even OOM or node failure. This article explains how to detect and mitigate them.
Overview
- Hot key: One (or few) keys get most of the traffic. A single Redis node or single shard becomes the bottleneck; latency spikes. Causes: popular item cache, counter, or lock.
- Big key: One key holds a huge value (e.g. 10 MB string) or many elements (e.g. list with 1M elements). Single operations are slow (network, serialization), and DEL or expiry can block.
- Detection: Use Redis commands (e.g. redis-cli --bigkeys), monitor latency and key access (client-side or via a proxy), or use Redis's latency-monitor and slowlog features. An APM tool or proxy can show which keys are hit most.
- Mitigation: For hot keys, replicate reads (local cache, multiple replicas) or split the key (e.g. shard by client or time window). For big keys, split the value into multiple keys, compress it, or avoid storing huge values in one key in the first place.
Example
Example 1: Hot key — single product cache
```text
Key product:123 gets 100k QPS → one shard or node is overloaded.
```
- Mitigation:
  - Local cache in the app (e.g. Caffeine) so only a fraction of requests hit Redis.
  - Or split the key into product:123:replica_0 … product:123:replica_N-1 and pick one per read (randomly, or by hash of the client ID); each key then carries only about 1/N of the QPS.
  - Or use a read replica and send reads to replicas (if the topology allows).
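The replica-split idea above can be sketched in a few lines. This is a minimal Python sketch, not a library API: `replica_key` is a hypothetical helper name, and it assumes every replica key holds the same cached value (writes must update all of them, or short TTLs let each refresh independently).

```python
import random

def replica_key(base_key: str, n_replicas: int) -> str:
    """Pick one of N replica keys for a hot key so reads spread out.

    Each replica key stores the same cached value; a random pick means
    each key sees only about 1/N of the read QPS.
    """
    suffix = random.randrange(n_replicas)
    return f"{base_key}:replica_{suffix}"

# Reads are spread across product:123:replica_0 … product:123:replica_3
key = replica_key("product:123", 4)
```

Picking by hash of the client ID instead of at random gives each client a stable replica, which can improve local-cache hit rates at the cost of less even spreading.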
Example 2: Big key — huge list
```redis
LRANGE big:list 0 -1   -- 1M elements: blocks and transfers a huge reply
DEL big:list           -- can block for a long time
```
- Mitigation:
  - Don't store 1M elements in one list. Shard by range or ID (e.g. big:list:0, big:list:1).
  - Delete in chunks (e.g. repeated LTRIM, or a pipelined DEL of sub-keys) to avoid long blocking; UNLINK (Redis 4.0+) frees the memory in a background thread.
  - Or use a different store (e.g. a database or object store) for large collections.
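The chunked-delete pattern above is just a loop of small trims. The Python sketch below simulates the list with a plain Python list so the pattern is testable without a server; with a real client (e.g. redis-py) each round would instead issue LTRIM big:list CHUNK -1 until LLEN returns 0, then DEL the empty key.

```python
def chunked_trim(items: list, chunk: int = 1000) -> int:
    """Remove elements from the front in fixed-size chunks.

    Mirrors the Redis pattern: LTRIM key <chunk> -1 drops the first
    <chunk> elements, keeping the event loop blocked only briefly
    per round. Returns the number of trim rounds performed.
    """
    rounds = 0
    while items:
        del items[:chunk]   # stand-in for: LTRIM key chunk -1
        rounds += 1
    return rounds
```

Sleeping a few milliseconds between rounds (or pipelining a bounded batch) keeps each iteration cheap, so other clients are never blocked for long.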
Example 3: Finding big keys
```bash
redis-cli --bigkeys
# Or sample with DEBUG OBJECT (avoid running it on every key in production)
redis-cli --scan --pattern '*' | while read k; do
  redis-cli DEBUG OBJECT "$k"
done
```
--bigkeys samples keys and reports the largest per type. Run it during low traffic; it can be slow on huge datasets. For a precise per-key size, MEMORY USAGE key (Redis 4.0+) is more accurate than DEBUG OBJECT.
Core Mechanism / Behavior
- Hot key: Traffic is concentrated; one key (or slot) gets most requests. Single-threaded Redis makes that key a bottleneck. Replication helps only if reads go to replicas; otherwise you need key-level splitting or local cache.
- Big key: Serialization, network transfer, and allocation scale with value size. Big DEL/EXPIRE blocks the event loop; replicas can lag during full sync if they load a big key.
- Cluster: In cluster mode, each key belongs to one slot; a hot key means one node is hot. Splitting the logical key into multiple keys that hash to different slots spreads load (but complicates application logic).
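The slot mapping described above is deterministic: Redis Cluster hashes the key with CRC16 (XMODEM variant) and takes the result mod 16384. This Python sketch reimplements it to show why suffixed keys like product:123:replica_0 usually land on different nodes, while hash tags like {product:123}:replica_0 pin them to one slot. It is an illustration, not the reference implementation.

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0), as used by Redis Cluster."""
    crc = 0
    for b in data:
        crc ^= b << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of 16384 cluster slots.

    Hash tags: if the key contains a non-empty {...} section, only
    that part is hashed, so {product:123}:replica_0 and
    {product:123}:replica_1 share a slot.
    """
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1 : end]
    return crc16(key.encode("utf-8")) % 16384
```

Note the trade-off: plain suffixes spread a hot key across nodes; hash tags do the opposite and are for multi-key operations that must stay on one node.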
| Problem | Detection (examples) | Mitigation (examples) |
|---|---|---|
| Hot key | Per-key latency/QPS metrics, proxy/APM stats, redis-cli --hotkeys (needs an LFU maxmemory policy), MONITOR (brief bursts only) | Local cache, key split (replicas/shard), read replicas |
| Big key | --bigkeys, DEBUG OBJECT, memory by key | Shard key, compress, chunked delete, move to DB/blob store |
Key Rules
- Monitor which keys get the most traffic and which are largest; use --bigkeys and access metrics to find hot and big keys before they cause incidents.
- For hot keys: add a short-TTL local cache or split the key (e.g. by replica id or time) so traffic is spread across multiple keys/nodes.
- For big keys: avoid storing very large values or huge collections in one key; split or move to a store suited for large data; delete in chunks to avoid long blocks.
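The short-TTL local cache from the hot-key rule can be sketched minimally. This is a toy illustration, not production code: real apps would use a library like Caffeine (Java) or cachetools (Python), which add size bounds and eviction; the loader callback stands in for the Redis GET on a miss.

```python
import time

class TTLCache:
    """Minimal in-process cache with a short TTL per entry (sketch).

    Even a 1-2 second TTL absorbs most traffic to a hot key, since
    only one request per TTL window per process reaches Redis.
    """
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]                     # fresh local hit
        value = loader(key)                   # e.g. Redis GET on miss
        self._store[key] = (value, now + self.ttl)
        return value
```

The trade-off is staleness up to one TTL, which is usually acceptable for read-heavy hot keys like popular product pages.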
What's Next
See Cache-Aside and Caching Pitfalls for cache design. See Data Types for choosing structures that avoid big keys. See Rate Limiting for high-QPS counter patterns.