Rate Limiting with Redis (Token Bucket)
Rate limiting restricts how many requests a client can make in a time window. The token bucket model: a bucket holds tokens, refilled at a fixed rate; each request consumes one (or N) tokens; if no tokens are left, the request is rejected. This article shows how to implement a token-bucket (or a close variant) in Redis with Lua for atomicity and gives a simple table of alternatives.
Overview
- Token bucket: Bucket has a capacity C and refill rate R (tokens per second). Each request takes 1 token; if tokens ≥ 1, allow and decrement; otherwise deny. Refill: current_tokens = min(C, last_tokens + (now - last_time) * R).
- Redis role: Store per-client state (tokens left, last update time) in a key (e.g.
rate:user:1001). Use Lua to compute new token count and allow/deny in one atomic step. - Sliding window: Alternative is a sliding window (e.g. count of requests in last N seconds using ZSet or sorted list). Simpler but different semantics (fixed window can allow 2× at boundary; sliding is smoother but more state).
Example
Example 1: Token bucket in Lua (simplified)
- Stored value:
tokens,last_ts(e.g. "5,1707123456.123"). - Logic in Lua:
- Parse tokens and last_ts.
- Refill:
now - last_tsseconds passed → add(now - last_ts) * ratetokens, cap at capacity. - If tokens ≥ 1: allow, set tokens = tokens - 1, last_ts = now, return 1. Else return 0.
- Set key TTL to e.g.
capacity/rateseconds so unused buckets expire.
Example 2: Fixed window counter (simpler, not token bucket)
RedisINCR rate:user:1001 EXPIRE rate:user:1001 60 -- If INCR result > 100, reject. Else allow.
- Key per (client, window). Window = 60 s; limit 100. Problem: at window boundary two windows can each allow 100 (burst of 200). Token bucket or sliding window avoids that.
Example 3: Sliding window with ZSet
RedisZADD rate:user:1001 now request_id ZREMRANGEBYSCORE rate:user:1001 0 now-60 ZCARD rate:user:1001 EXPIRE rate:user:1001 61 -- If ZCARD <= 100, allow; else reject. request_id = unique per request (e.g. UUID).
- Score = timestamp; remove entries older than 60 s; count = requests in last 60 s. Gives a true sliding window; more keys and operations than a simple counter.
Core Mechanism / Behavior
- Atomicity: Use a single Lua script so refill, check, and consume (and optional TTL) happen in one call. Otherwise two requests can both see the same token count and both be allowed.
- Key design: One key per (client, limit scope), e.g.
rate:ip:1.2.3.4orrate:user:1001. TTL the key so inactive clients don’t keep state forever. - Refill: Token bucket refill is continuous (time-based). So burst is bounded by capacity; sustained rate is bounded by refill rate. Good for APIs that want both burst and steady limit.
| Method | Pros | Cons |
|---|---|---|
| Token bucket (Lua) | Smooth rate, burst control | More state and math |
| Fixed window counter | Simple, few ops | Burst at window boundary |
| Sliding window (ZSet) | Accurate sliding window | More memory and ZREMRANGEBYSCORE |
Key Rules
- Implement allow/deny and state update in one Lua script so there is no race between concurrent requests.
- Set a TTL on the rate-limit key so keys for inactive clients expire; choose TTL ≥ window length (e.g. 2× refill period for token bucket).
- Choose token bucket when you want burst + sustained limit; use sliding window when you need strict “last N seconds” semantics. Fixed window is acceptable when double burst at boundary is acceptable.
What's Next
See Distributed Lock with Redis for Lua and key design. See Data Types for ZSet and String. See Rate Limiter Design (system design) for high-level design and multi-dc considerations.