High Concurrency Toolkit
Common high-concurrency techniques include caching, async processing, peak shaving, rate limiting, degradation, sharding, read/write splitting, and lock-free/optimistic locking. This article surveys these techniques, explains when to use each, and includes a quick-reference table.
Overview
- Cache: Reduce DB and remote calls; local cache, distributed cache (Redis). Watch for cache penetration (repeated misses for nonexistent keys), breakdown (a hot key expiring under load), and avalanche (mass expiry at once).
- Async: Offload non-critical path; MQ, CompletableFuture. Improves throughput and reduces latency.
- Peak shaving: Queue burst traffic; MQ, local queue. Downstream consumes at its own pace.
- Rate limit: Control QPS/concurrency; token bucket, sliding window. Protects downstream and self.
- Degradation: Turn off non-core features when overloaded; return defaults, use cache, circuit breaker. Keep core available.
- Sharding: Horizontal split; shard by user_id, order_id, etc. Spread load.
- Read/write split: Reads from replicas, writes to primary. Higher read throughput.
- Lock-free/optimistic lock: Reduce lock contention; CAS, version numbers. Higher concurrency.
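The rate-limit bullet above names the token bucket; a minimal single-JVM sketch of the idea (class and field names are illustrative, not from any library):

```java
// Token-bucket rate limiter sketch: tokens refill at a fixed rate up to a
// burst capacity; a request passes only if it can take a token.
public class TokenBucket {
    private final long capacity;        // max tokens (burst size)
    private final double refillPerNano; // tokens added per nanosecond
    private double tokens;
    private long lastRefill;

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity;
        this.lastRefill = System.nanoTime();
    }

    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        // Refill based on elapsed time, capped at capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1) { tokens -= 1; return true; }
        return false; // no token: reject (or queue) the request
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(5, 10); // burst 5, refill 10/s
        int allowed = 0;
        for (int i = 0; i < 100; i++) if (bucket.tryAcquire()) allowed++;
        System.out.println("allowed=" + allowed); // only the initial burst passes
    }
}
```

A sliding-window limiter differs only in bookkeeping (counting timestamps per window instead of refilling tokens); the admit/reject decision point is the same.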
Examples
Example 1: Techniques and typical use
| Technique | Typical use |
|---|---|
| Cache | Hot reads, config, session |
| Async | Notify, log, non-core logic |
| Peak shaving | Flash sale, events, batch jobs |
| Rate limit | API protection, anti-abuse, quota |
| Degradation | Overload, downstream failure |
| Sharding | Single table/DB bottleneck |
| Read/write split | Read-heavy, write-light |
| Lock-free | Counters, state updates |
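The "Lock-free | Counters" row can be illustrated with a CAS retry loop; a minimal sketch using `java.util.concurrent.atomic` (the explicit loop is shown for clarity; `incrementAndGet` does the same internally):

```java
import java.util.concurrent.atomic.AtomicLong;

// Lock-free counter: instead of blocking on a lock, each thread retries a
// compareAndSet (CAS) until its update wins the race.
public class CasCounter {
    private final AtomicLong value = new AtomicLong(0);

    public long increment() {
        while (true) {
            long current = value.get();
            long next = current + 1;
            if (value.compareAndSet(current, next)) return next; // CAS won
            // CAS lost: another thread updated first; retry with fresh value
        }
    }

    public long get() { return value.get(); }

    public static void main(String[] args) throws InterruptedException {
        CasCounter counter = new CasCounter();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < 4; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 10_000; i++) counter.increment();
            });
            threads[t].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.get()); // prints 40000: no increments lost
    }
}
```

An optimistic lock in a DB follows the same pattern with a version column: `UPDATE ... SET v = v + 1 WHERE id = ? AND v = ?`, retrying if zero rows matched.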
Example 2: Combined example (flash sale)
- Rate limit (entry) → queue to absorb the burst (orders) → cache (inventory/product) → async (deduct stock, publish MQ) → degradation (return "sold out" once inventory is exhausted).
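The pipeline above can be sketched in a single process. In a real deployment the pieces would be a gateway limiter, MQ, and Redis; here a `Semaphore`, `BlockingQueue`, and `AtomicInteger` stand in, and all names and numbers are illustrative:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Flash-sale sketch: entry rate limit -> queue absorbs the burst ->
// atomic inventory deduct -> degrade to "sold out" when stock is gone.
public class FlashSaleSketch {
    static final Semaphore entryLimit = new Semaphore(50);        // admit at most 50
    static final BlockingQueue<Integer> orderQueue = new LinkedBlockingQueue<>();
    static final AtomicInteger inventory = new AtomicInteger(10); // cached stock
    static final AtomicInteger sold = new AtomicInteger();
    static final AtomicInteger soldOut = new AtomicInteger();     // degraded responses

    public static void main(String[] args) throws InterruptedException {
        // Burst of 100 requests hits the entry limiter; the rest fail fast.
        for (int userId = 0; userId < 100; userId++) {
            if (entryLimit.tryAcquire()) orderQueue.put(userId);
        }
        // Downstream consumer drains the queue at its own pace (async deduct).
        Thread consumer = new Thread(() -> {
            Integer userId;
            while ((userId = orderQueue.poll()) != null) {
                if (inventory.decrementAndGet() >= 0) {
                    sold.incrementAndGet();           // deduct succeeded
                } else {
                    inventory.incrementAndGet();      // roll back the over-deduct
                    soldOut.incrementAndGet();        // degrade: "sold out"
                }
            }
        });
        consumer.start();
        consumer.join();
        // 100 requests -> 50 admitted -> 10 sold -> 40 degraded
        System.out.println("sold=" + sold + " soldOut=" + soldOut);
    }
}
```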
Example 3: Adoption order
- Optimize single instance first (index, connection pool, async, cache).
- Scale horizontally (multi-instance, load balance).
- Add sharding and read/write split.
- Add rate limit, degradation, circuit breaker for protection.
Core Mechanism / Behavior
- Cache: Reduces load; must handle invalidation, consistency, and failure modes.
- Async: Decouples producer and consumer; MQ provides durability and backpressure.
- Sharding: Adds complexity in routing, cross-shard queries, and distributed transactions.
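The cache bullet above implies a read path and an invalidation rule. A minimal cache-aside sketch, with a `ConcurrentHashMap` standing in for Redis and another map standing in for the DB (all names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache-aside: read checks the cache first and fills it on a miss;
// write updates the store then invalidates (not updates) the cache entry,
// which narrows the stale-read window under concurrent writes.
public class CacheAside {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Map<String, String> db = new ConcurrentHashMap<>();    // stand-in for the DB

    public String read(String key) {
        String value = cache.get(key);
        if (value != null) return value;          // cache hit
        value = db.get(key);                      // miss: load from the store
        if (value != null) cache.put(key, value); // populate for later readers
        return value;
    }

    public void write(String key, String value) {
        db.put(key, value); // 1. update the source of truth
        cache.remove(key);  // 2. invalidate so the next read refills
    }

    public static void main(String[] args) {
        CacheAside store = new CacheAside();
        store.write("user:1", "Alice");
        System.out.println(store.read("user:1")); // miss -> DB -> cached; prints Alice
        store.write("user:1", "Bob");             // write invalidates the entry
        System.out.println(store.read("user:1")); // prints Bob, not stale Alice
    }
}
```

A production version also needs TTLs and a guard for the penetration/breakdown/avalanche failure modes listed in the overview.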
Key Rules
- Choose technique by bottleneck; measure first, then optimize; avoid premature optimization.
- Cache, async, rate limit, and degradation are often used together; design explicitly for the consistency trade-offs and degraded modes that combination creates.
- Sharding adds ops and distributed transaction complexity; use when a single DB/table is clearly the bottleneck.
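Sharding presumes a routing function from the shard key to a shard. A minimal hash/modulo router by user_id (shard count and names are illustrative):

```java
// Modulo shard routing: map a user_id to one of N shards.
// Math.floorMod keeps the result non-negative even for negative keys.
public class ShardRouter {
    private final int shardCount;

    public ShardRouter(int shardCount) { this.shardCount = shardCount; }

    public int shardFor(long userId) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(4);
        for (long userId : new long[]{1, 2, 5, 1002}) {
            System.out.println("user " + userId + " -> shard " + router.shardFor(userId));
        }
    }
}
```

A fixed modulus forces large-scale data movement whenever the shard count changes; consistent hashing or an indirection table (key range → shard) is the usual mitigation.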
What's Next
See Cache-Aside, Rate Limiting, Circuit Breaker. See Index, Pagination for DB optimization. See ConcurrentHashMap, CompletableFuture for concurrency.