Retry, Backoff & Deduplication
In distributed systems, retries help recover from transient failures, backoff reduces retry storms, and deduplication prevents duplicate processing when the same request arrives more than once. This article explains how the three relate and covers common strategies, with examples and a summary table.
Overview
- Retry: Try again after failure. Suited for network jitter, short timeouts, temporary downstream unavailability. Avoid or limit retry for non-idempotent operations.
- Backoff: Increase the delay between retries (exponential backoff) and add random jitter to avoid synchronized retries ("thundering herd"). E.g. base delays of 1s, 2s, 4s, 8s, each plus random(0, 1s).
- Deduplication: Process each request/message only once. Track processed requests by a unique id, a unique business key, or a state machine. Works together with idempotent design.
Example
Example 1: Exponential backoff + jitter
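```java
import java.util.concurrent.ThreadLocalRandom;

// attempt is 0-based; long arithmetic avoids int overflow and matches Thread.sleep(long)
long delay = baseDelay * (1L << attempt) + ThreadLocalRandom.current().nextLong(0, jitter + 1);
Thread.sleep(delay);
```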
- With attempt starting at 0, the 1st retry waits baseDelay, the 2nd 2×baseDelay, the 3rd 4×baseDelay, each plus jitter. Jitter avoids synchronized retries.
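The delay formula can be wrapped in a small helper that also caps the result, since uncapped doubling grows quickly. The following is a minimal sketch, not a definitive implementation: the class name `Backoff`, the method `delayMillis`, and the `maxDelayMs` cap are assumptions introduced for illustration and do not appear in the formula above.

```java
import java.util.concurrent.ThreadLocalRandom;

final class Backoff {
    private Backoff() {}

    /**
     * Delay in milliseconds before retry number `attempt` (0-based).
     * maxDelayMs is a hypothetical cap on the exponential part; jitter is added on top.
     * Assumes baseDelayMs is a realistic value (far below 2^33 ms).
     */
    static long delayMillis(long baseDelayMs, int attempt, long maxDelayMs, long jitterMs) {
        // Clamp the shift to 0..30 so the power of two stays small.
        int shift = Math.min(Math.max(attempt, 0), 30);
        long exponential = baseDelayMs * (1L << shift);
        long capped = Math.min(exponential, maxDelayMs);
        // Uniform jitter in [0, jitterMs]; skipped when jitterMs is 0 or negative.
        long jitter = jitterMs > 0 ? ThreadLocalRandom.current().nextLong(jitterMs + 1) : 0L;
        return capped + jitter;
    }
}
```

With a base delay of 100 ms, a 10 s cap and no jitter, attempts 0, 1, 2 give 100, 200 and 400 ms, and later attempts level off at the cap.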
Example 2: Retry conditions
| Retry | Do not retry |
|---|---|
| Timeout, connection failure, 5xx | 4xx, business error, invalid params |
| Transient network failure | Auth failure, resource not found |
- 4xx usually means a client error, so retrying the same request rarely helps (408 Request Timeout and 429 Too Many Requests are common exceptions). 5xx may be temporary; retry is reasonable.
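The table above can be turned into a small classifier that the retry logic consults before trying again. This is a sketch that follows the table literally; the class name `RetryClassifier` and its method names are hypothetical, and a real client would likely tune the lists to its own downstream services.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;

final class RetryClassifier {
    private RetryClassifier() {}

    /** 5xx responses may be temporary, so they are treated as retryable; everything else is not. */
    static boolean isRetryableStatus(int status) {
        return status >= 500 && status <= 599;
    }

    /** Timeouts and connection failures are treated as transient; other exceptions are not. */
    static boolean isRetryableException(Throwable error) {
        return error instanceof SocketTimeoutException
                || error instanceof ConnectException;
    }
}
```

Keeping the decision in one place makes it easier to audit which failures can trigger a retry.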
Example 3: Deduplication and idempotency
- Retries can cause the same request to arrive multiple times, so the server must deduplicate or be idempotent. Use an idempotency key, a unique business key, or a deduplication table. For a duplicate, return the existing result and do not re-execute the operation.
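The "return existing result, do not re-execute" behavior can be sketched with an in-memory map keyed by idempotency key. The class `IdempotentExecutor` and its `execute` method are hypothetical names; the in-memory `ConcurrentHashMap` stands in for whatever durable deduplication table a real service would use, and the sketch assumes the operation returns a non-null result.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

final class IdempotentExecutor<R> {
    // Stored results by idempotency key (in-memory stand-in for a deduplication table).
    private final ConcurrentMap<String, R> results = new ConcurrentHashMap<>();

    /**
     * Runs the operation the first time a key is seen and stores its result.
     * Later calls with the same key return the stored result without re-executing.
     * If the operation throws, nothing is stored and a retry will run it again.
     */
    R execute(String idempotencyKey, Supplier<R> operation) {
        return results.computeIfAbsent(idempotencyKey, key -> operation.get());
    }
}
```

A client that retries with the same key therefore sees the original result, and the side effect happens once.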
Example 4: Strategy summary
| Strategy | Description |
|---|---|
| Max retries | Limit retries to avoid infinite loops |
| Backoff | Exponential + jitter to reduce thundering herd |
| Retryable errors | Retry only timeout, 5xx, etc. |
| Deduplication / idempotency | Server must support; avoid duplicate side effects |
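The first three rows of the summary can be combined into a single retry loop. The sketch below is one possible shape under stated assumptions: `Retrier`, `callWithRetry`, and all parameter names are hypothetical, and the caller supplies the retryable-error predicate. The fourth row (deduplication) remains the server's responsibility and is not covered here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;

final class Retrier {
    private Retrier() {}

    /**
     * Calls the operation, retrying up to maxRetries times on retryable errors.
     * maxRetries = 0 means a single attempt with no retry.
     * Assumes baseDelayMs is positive and jitterMs is non-negative.
     */
    static <T> T callWithRetry(Callable<T> operation, int maxRetries, long baseDelayMs,
                               long jitterMs, Predicate<Exception> retryable) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Give up on non-retryable errors or when the retry budget is spent.
                if (attempt >= maxRetries || !retryable.test(e)) {
                    throw e;
                }
                // Exponential backoff plus uniform jitter in [0, jitterMs].
                long delay = baseDelayMs * (1L << Math.min(attempt, 30))
                        + ThreadLocalRandom.current().nextLong(jitterMs + 1);
                Thread.sleep(delay);
            }
        }
    }
}
```

An operation that fails twice with a retryable error and then succeeds is called three times in total; a non-retryable error is rethrown after the first call.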
Core Mechanism / Behavior
- Exponential backoff: The delay doubles with each attempt (baseDelay × 2^n for the n-th retry, counting from 0). Smooths load during recovery.
- Jitter: Random component spreads retries across clients and time.
- Deduplication: Store processed ids; reject or return cached result for repeats.
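For the "reject repeats" variant of deduplication, typical of message consumers, a set of processed ids is enough. The following is a minimal sketch: `DedupingConsumer` and `handleOnce` are hypothetical names, and the in-memory set is an assumption standing in for a persistent store.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class DedupingConsumer {
    // Ids of messages that were handled (or are currently being handled).
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();

    /** Returns true if the handler ran, false if messageId was seen before and skipped. */
    boolean handleOnce(String messageId, Runnable handler) {
        // add() is atomic on this set and returns false when the id is already present.
        if (!processedIds.add(messageId)) {
            return false;
        }
        try {
            handler.run();
            return true;
        } catch (RuntimeException e) {
            // Forget the id on failure so a redelivery can be processed.
            processedIds.remove(messageId);
            throw e;
        }
    }
}
```

Removing the id on failure is a design choice: it favors eventual processing over strict at-most-once handling.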
Key Rules
- Use backoff: At least add jitter; exponential backoff is smoother.
- Retry only retryable errors; for non-idempotent operations use `retries=0` or explicit no-retry.
- Deduplication / idempotency is a precondition for retry; otherwise retry increases duplicate processing risk.
What's Next
See Idempotency Design, Timeout/Retry/Fallback, Circuit Breaker. See Message Idempotency for message consumption.