Retry, Backoff & Deduplication

In distributed systems, retries recover from transient failures, backoff prevents retry storms, and deduplication keeps a repeated request from being processed twice. This article explains how the three relate and walks through common strategies, with examples and a summary table.

Overview

  • Retry: Try the operation again after a failure. Suited to network jitter, short timeouts, and temporary downstream unavailability. Avoid or strictly limit retries for non-idempotent operations.
  • Backoff: Increase the delay between retries (exponential backoff), add random jitter, or both, so that clients do not retry in lockstep (the "thundering herd" problem). E.g. delays of 1s, 2s, 4s, 8s, each plus random(0, 1s).
  • Deduplication: Process each request or message only once. Track processed requests by a unique id, rely on unique business keys, or use a state machine. Works together with idempotent design.

Examples

Example 1: Exponential backoff + jitter

Java
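import java.util.concurrent.ThreadLocalRandom;
// attempt is 0-based (0 for the first retry); baseDelay and jitter are in milliseconds.
long delay = baseDelay * (1L << attempt) + ThreadLocalRandom.current().nextLong(jitter + 1);
Thread.sleep(delay);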
  • With attempt starting at 0, the 1st retry waits baseDelay, the 2nd 2×baseDelay, the 3rd 4×baseDelay, each plus a random jitter. The jitter keeps clients from retrying in lockstep.
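The formula in Example 1 grows without bound and overflows for large attempt values. The following is a minimal sketch of one way to guard against that, assuming an upper limit on the delay. The class name Backoff, its method names, and the maxMillis parameter are hypothetical and not part of the original example.

```java
import java.util.concurrent.ThreadLocalRandom;

final class Backoff {
    private Backoff() {}

    /**
     * Deterministic part of the delay: baseMillis * 2^attempt, capped at maxMillis.
     * attempt is 0-based (0 for the first retry). maxMillis is an assumed cap.
     */
    static long exponentialMillis(long baseMillis, int attempt, long maxMillis) {
        if (baseMillis <= 0 || maxMillis <= 0 || attempt < 0) {
            throw new IllegalArgumentException(
                    "baseMillis and maxMillis must be positive, attempt non-negative");
        }
        // Return the cap whenever the shifted value would exceed it or overflow.
        if (attempt >= 63 || baseMillis > (maxMillis >> attempt)) {
            return maxMillis;
        }
        return baseMillis << attempt;
    }

    /** Capped exponential delay plus a uniform random jitter in [0, jitterMillis]. */
    static long withJitterMillis(long baseMillis, int attempt, long maxMillis, long jitterMillis) {
        long jitter = jitterMillis > 0
                ? ThreadLocalRandom.current().nextLong(jitterMillis + 1)
                : 0L;
        return exponentialMillis(baseMillis, attempt, maxMillis) + jitter;
    }
}
```

With baseMillis = 1000 and maxMillis = 30000, the deterministic part is 1000, 2000, 4000, 8000, 16000 for attempts 0 to 4 and stays at 30000 from attempt 5 onward.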

Example 2: Retry conditions

Retry                               | Do not retry
Timeout, connection failure, 5xx    | 4xx, business error, invalid params
Transient network failure           | Auth failure, resource not found
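  • 4xx usually means a client error, so repeating the same request rarely helps. Common exceptions are 408 (Request Timeout) and 429 (Too Many Requests), which are often worth retrying after a delay. 5xx may be temporary, so retrying is reasonable.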
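The retry conditions above can be expressed as a small classifier. This is a sketch only: the class name RetryClassifier and the choice of exception types are assumptions made for illustration. It follows the table literally and treats every 4xx status as non-retryable, which you may want to relax for statuses your API documents as retryable.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.util.concurrent.TimeoutException;

final class RetryClassifier {
    private RetryClassifier() {}

    /** 5xx responses may be temporary; everything else, including 4xx, is not retried. */
    static boolean isRetryableStatus(int status) {
        return status >= 500 && status <= 599;
    }

    /**
     * Timeouts and connection failures are treated as transient.
     * Other exceptions (invalid parameters, business errors) are not retried.
     */
    static boolean isRetryableException(Throwable t) {
        return t instanceof SocketTimeoutException
                || t instanceof ConnectException
                || t instanceof TimeoutException;
    }
}
```

Keeping the decision in one place makes it easy to audit which failures trigger a retry.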

Example 3: Deduplication and idempotency

  • A retry can deliver the same request to the server more than once, so the server must deduplicate requests or make the operation idempotent. Use an idempotency key, a unique business key, or a deduplication table. For a duplicate, return the existing result instead of executing the operation again.
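A minimal in-memory sketch of the idempotency-key approach follows. The class IdempotentExecutor is hypothetical, and a production system would more likely keep the keys in a database table with a unique constraint so that they survive restarts and are shared across instances.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

final class IdempotentExecutor<R> {
    // Maps each idempotency key to the result of its first successful execution.
    private final ConcurrentMap<String, R> results = new ConcurrentHashMap<>();

    /**
     * Runs the operation the first time a key is seen and returns the stored
     * result for every later call with the same key. If the operation throws
     * or returns null, nothing is stored and a later retry executes it again.
     */
    R execute(String idempotencyKey, Supplier<R> operation) {
        return results.computeIfAbsent(idempotencyKey, key -> operation.get());
    }
}
```

ConcurrentHashMap.computeIfAbsent applies the function atomically, so two concurrent requests with the same key do not both execute the operation. The operation itself must not modify the same map.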

Example 4: Strategy summary

Strategy                      | Description
Max retries                   | Limit retries to avoid infinite loops
Backoff                       | Exponential + jitter to reduce thundering herd
Retryable errors              | Retry only timeout, 5xx, etc.
Deduplication / idempotency   | Server must support; avoid duplicate side effects
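The client-side strategies in this table can be combined in a single retry loop. The sketch below is illustrative rather than a reference implementation: Retrier and its parameters are hypothetical names, the retryable check is supplied by the caller, and the shift is limited to 20 doublings as an assumed guard against overflow.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;

final class Retrier {
    private final int maxRetries;
    private final long baseDelayMillis;
    private final long jitterMillis;
    private final Predicate<Exception> retryable;

    Retrier(int maxRetries, long baseDelayMillis, long jitterMillis,
            Predicate<Exception> retryable) {
        if (maxRetries < 0 || baseDelayMillis < 0 || jitterMillis < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        this.maxRetries = maxRetries;
        this.baseDelayMillis = baseDelayMillis;
        this.jitterMillis = jitterMillis;
        this.retryable = retryable;
    }

    /** Calls the operation once, then retries up to maxRetries times on retryable errors. */
    <T> T call(Callable<T> operation) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Give up on non-retryable errors or once the retry budget is spent.
                if (attempt >= maxRetries || !retryable.test(e)) {
                    throw e;
                }
                // Exponential backoff plus uniform jitter in [0, jitterMillis].
                long delay = baseDelayMillis * (1L << Math.min(attempt, 20))
                        + ThreadLocalRandom.current().nextLong(jitterMillis + 1);
                Thread.sleep(delay);
            }
        }
    }
}
```

Constructing the Retrier with maxRetries = 0 gives a single attempt with no retry, which is the setting to use for non-idempotent operations. Deduplication still has to happen on the server; this loop only controls how often the client resends.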

Core Mechanism / Behavior

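  • Exponential backoff: Each retry waits longer than the last, with the delay growing as base × 2^n for retry n (counting from 0). This smooths load on a service that is recovering.
  • Jitter: A random component spreads retries across clients and over time.
  • Deduplication: Store the ids of processed requests; for a repeat, reject it or return the cached result.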

Key Rules

  • Use backoff: At a minimum add jitter; exponential backoff spreads load more smoothly.
  • Retry only retryable errors. For non-idempotent operations, set retries=0 or otherwise disable retry explicitly.
  • Deduplication or idempotency is a precondition for retry; without it, every retry risks duplicate processing.

What's Next

See Idempotency Design, Timeout/Retry/Fallback, Circuit Breaker. See Message Idempotency for message consumption.