Hot Key / Big Key - Detection & Mitigation
A hot key is a key that receives a very high read or write rate; a big key is a key whose value (or aggregate size in Hash/List/Set/ZSet) is very large. Both can cause high latency, uneven load, and even OOM or node failure. This article explains how to detect and mitigate them.
Overview
- Hot key: One (or few) keys get most of the traffic. A single Redis node or single shard becomes the bottleneck; latency spikes. Causes: popular item cache, counter, or lock.
- Big key: One key holds a huge value (e.g. 10 MB string) or many elements (e.g. list with 1M elements). Single operations are slow (network, serialization), and DEL or expiry can block.
- Detection: Use Redis commands (e.g. redis-cli --bigkeys), monitor latency and key access (client-side or via a proxy), or use Redis's latency-monitor and slowlog features. An APM tool or proxy can show which keys are hit most.
- Mitigation: For hot keys, replicate reads (local cache, multiple replicas) or split the key (e.g. shard by client or time window). For big keys, split the value into multiple keys, compress it, or avoid storing huge values in one key in the first place.
Example
Example 1: Hot key — single product cache
```text
Key product:123 gets 100k QPS → one shard or node is overloaded.
```
- Mitigation:
  - Local cache in the app (e.g. Caffeine) so only a fraction of requests hit Redis.
  - Or split the key into product:123:replica_0 … product:123:replica_N-1 and pick one per read (randomly, or by hash of the client ID); each key then carries only about 1/N of the QPS.
  - Or use a read replica and send reads to replicas (if the topology allows).
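The replica-split idea above can be sketched in a few lines. This is a minimal Python sketch, not a library API: `replica_key` is a hypothetical helper name, and it assumes every replica key holds the same cached value (writes must update all of them, or short TTLs let each refresh independently).

```python
import random

def replica_key(base_key: str, n_replicas: int) -> str:
    """Pick one of N replica keys for a hot key so reads spread out.

    Each replica key stores the same cached value; a random pick means
    each key sees only about 1/N of the read QPS.
    """
    suffix = random.randrange(n_replicas)
    return f"{base_key}:replica_{suffix}"

# Reads are spread across product:123:replica_0 … product:123:replica_3
key = replica_key("product:123", 4)
```

Picking by hash of the client ID instead of at random gives each client a stable replica, which can improve local-cache hit rates at the cost of less even spreading.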
Example 2: Big key — huge list
```redis
LRANGE big:list 0 -1   -- 1M elements: blocks and transfers a huge reply
DEL big:list           -- can block for a long time
```
- Mitigation:
  - Don't store 1M elements in one list. Shard by range or ID (e.g. big:list:0, big:list:1).
  - Delete in chunks (e.g. repeated LTRIM, or a pipelined DEL of sub-keys) to avoid long blocking; UNLINK (Redis 4.0+) frees the memory in a background thread.
  - Or use a different store (e.g. a database or object store) for large collections.
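The chunked-delete pattern above is just a loop of small trims. The Python sketch below simulates the list with a plain Python list so the pattern is testable without a server; with a real client (e.g. redis-py) each round would instead issue LTRIM big:list CHUNK -1 until LLEN returns 0, then DEL the empty key.

```python
def chunked_trim(items: list, chunk: int = 1000) -> int:
    """Remove elements from the front in fixed-size chunks.

    Mirrors the Redis pattern: LTRIM key <chunk> -1 drops the first
    <chunk> elements, keeping the event loop blocked only briefly
    per round. Returns the number of trim rounds performed.
    """
    rounds = 0
    while items:
        del items[:chunk]   # stand-in for: LTRIM key chunk -1
        rounds += 1
    return rounds
```

Sleeping a few milliseconds between rounds (or pipelining a bounded batch) keeps each iteration cheap, so other clients are never blocked for long.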
Example 3: Finding big keys
```bash
redis-cli --bigkeys
# Or sample with DEBUG OBJECT (avoid running it on every key in production)
redis-cli --scan --pattern '*' | while read k; do
  redis-cli DEBUG OBJECT "$k"
done
```
--bigkeys samples keys and reports the largest per type. Run it during low traffic; it can be slow on huge datasets. For a precise per-key size, MEMORY USAGE key (Redis 4.0+) is more accurate than DEBUG OBJECT.
Core Mechanism / Behavior
- Hot key: Traffic is concentrated; one key (or slot) gets most requests. Single-threaded Redis makes that key a bottleneck. Replication helps only if reads go to replicas; otherwise you need key-level splitting or local cache.
- Big key: Serialization, network transfer, and allocation scale with value size. Big DEL/EXPIRE blocks the event loop; replicas can lag during full sync if they load a big key.
- Cluster: In cluster mode, each key belongs to one slot; a hot key means one node is hot. Splitting the logical key into multiple keys that hash to different slots spreads load (but complicates application logic).
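The slot mapping described above is deterministic: Redis Cluster hashes the key with CRC16 (XMODEM variant) and takes the result mod 16384. This Python sketch reimplements it to show why suffixed keys like product:123:replica_0 usually land on different nodes, while hash tags like {product:123}:replica_0 pin them to one slot. It is an illustration, not the reference implementation.

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0), as used by Redis Cluster."""
    crc = 0
    for b in data:
        crc ^= b << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of 16384 cluster slots.

    Hash tags: if the key contains a non-empty {...} section, only
    that part is hashed, so {product:123}:replica_0 and
    {product:123}:replica_1 share a slot.
    """
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1 : end]
    return crc16(key.encode("utf-8")) % 16384
```

Note the trade-off: plain suffixes spread a hot key across nodes; hash tags do the opposite and are for multi-key operations that must stay on one node.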
| Problem | Detection (examples) | Mitigation (examples) |
|---|---|---|
| Hot key | Per-key latency/QPS metrics, proxy/APM stats, redis-cli --hotkeys (needs an LFU maxmemory policy), MONITOR (brief bursts only) | Local cache, key split (replicas/shard), read replicas |
| Big key | --bigkeys, DEBUG OBJECT, memory by key | Shard key, compress, chunked delete, move to DB/blob store |
Key Rules
- Monitor which keys get the most traffic and which are largest; use --bigkeys and access metrics to find hot and big keys before they cause incidents.
- For hot keys: add a short-TTL local cache or split the key (e.g. by replica id or time) so traffic is spread across multiple keys/nodes.
- For big keys: avoid storing very large values or huge collections in one key; split or move to a store suited for large data; delete in chunks to avoid long blocks.
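The short-TTL local cache from the hot-key rule can be sketched minimally. This is a toy illustration, not production code: real apps would use a library like Caffeine (Java) or cachetools (Python), which add size bounds and eviction; the loader callback stands in for the Redis GET on a miss.

```python
import time

class TTLCache:
    """Minimal in-process cache with a short TTL per entry (sketch).

    Even a 1-2 second TTL absorbs most traffic to a hot key, since
    only one request per TTL window per process reaches Redis.
    """
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]                     # fresh local hit
        value = loader(key)                   # e.g. Redis GET on miss
        self._store[key] = (value, now + self.ttl)
        return value
```

The trade-off is staleness up to one TTL, which is usually acceptable for read-heavy hot keys like popular product pages.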
What's Next
See Cache-Aside and Caching Pitfalls for cache design. See Data Types for choosing structures that avoid big keys. See Rate Limiting for high-QPS counter patterns.