URL Shortener Design

A URL shortener maps long URLs to short codes (e.g. bit.ly/abc123). Visiting the short URL redirects to the original. Core pieces: short code generation, storage, redirect. This article explains design points with a reference table.

Overview

  • Short code: Generate from an auto-increment ID encoded in base-62 (0-9, a-z, A-Z), a random string, or a hash with collision handling. Codes should be short (6–8 characters), unique, and cheap to generate at scale.
  • Storage: Mapping short code → long URL; K-V store (Redis, DB). Can add TTL, click stats, creator, etc.
  • Redirect: GET short URL → look up mapping → redirect to long URL. Use 301 (permanent, cached by browsers) when the target never changes, or 302 (temporary) when you need click stats or may change the target later.
  • High concurrency: Read-heavy; cache hot links (Redis); DB for durability and backup.

Example

Example 1: Flow

Create: long URL → generate short code → store short→long → return short URL
Visit: short URL → lookup short→long → 302 redirect
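
The two paths above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the dict stands in for the K-V store, BASE_URL is a made-up domain, and the counter-based code scheme is a placeholder for a real generator.

```python
import itertools

BASE_URL = "https://sho.rt/"   # hypothetical short domain

_store = {}                    # short code -> long URL (stand-in for a K-V store)
_counter = itertools.count(1)  # placeholder for a real code generator


def create(long_url):
    """Create: generate a code, store code -> long URL, return the short URL."""
    code = f"c{next(_counter)}"  # illustrative scheme only
    _store[code] = long_url
    return BASE_URL + code


def visit(short_url):
    """Visit: look up the code; answer 302 with a Location header, or 404."""
    code = short_url.rsplit("/", 1)[-1]
    long_url = _store.get(code)
    if long_url is None:
        return 404, {}
    return 302, {"Location": long_url}
```

Returning a (status, headers) pair keeps the sketch independent of any web framework; a real handler would translate it into an HTTP response.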

Example 2: Short code generation

Method                     Pros                    Cons
Auto-increment + base-62   No collision, ordered   Needs distributed ID
Random                     Simple                  Collision needs retry
Hash                       Same URL → same code    Collision, predictable
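
As a rough sketch of the random and hash rows, the functions below show one way each method could handle collisions. The names random_code and hash_code, the 7-character length, the retry limit, and the salted re-hash are all assumptions made for illustration; the set and dict stand in for a uniqueness check against the real store.

```python
import hashlib
import secrets
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 chars


def random_code(existing, length=7, max_tries=5):
    """Random method: draw a code and retry if it is already taken."""
    for _ in range(max_tries):
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in existing:
            existing.add(code)
            return code
    raise RuntimeError("no free code found; consider a longer length")


def hash_code(long_url, table, length=7):
    """Hash method: same URL -> same code; re-hash with a salt on a clash.

    `table` maps code -> long URL. `length` should stay at 10 or below so
    that the 64 bits taken from the digest cover every character.
    """
    salt = 0
    while True:
        digest = hashlib.sha256(f"{salt}:{long_url}".encode()).digest()
        n = int.from_bytes(digest[:8], "big")
        chars = []
        for _ in range(length):
            n, r = divmod(n, 62)
            chars.append(ALPHABET[r])
        code = "".join(chars)
        owner = table.setdefault(code, long_url)  # claim the code if it is free
        if owner == long_url:
            return code
        salt += 1  # code belongs to a different URL: try the next salt
```

The auto-increment row needs no collision handling at all, which is why its cost moves to ID allocation instead.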

Example 3: Storage and cache

  • DB: short code (PK), long URL, created_at, TTL, stats.
  • Redis: short code → long URL, with TTL aligned with the DB; on a cache miss, load from the DB and populate the cache.
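
The read path described above is the usual cache-aside pattern. The sketch below uses plain dicts in place of Redis and the database so that it runs on its own; resolve, CACHE_TTL_SECONDS, and the sample row are hypothetical names and values.

```python
import time

db = {"abc123": "https://example.com/long"}  # stand-in for the durable table
cache = {}                                    # stand-in for Redis: code -> (url, expires_at)
CACHE_TTL_SECONDS = 3600                      # assumed value


def resolve(code, now=None):
    """Return the long URL for a code, or None. Reads the cache first, then the DB."""
    now = time.time() if now is None else now
    hit = cache.get(code)
    if hit is not None and hit[1] > now:
        return hit[0]                          # fresh cache hit
    url = db.get(code)                         # miss or expired entry: go to the DB
    if url is not None:
        cache[code] = (url, now + CACHE_TTL_SECONDS)
    else:
        cache.pop(code, None)                  # drop a stale entry for a deleted code
    return url
```

Passing now explicitly is only there to make expiry easy to exercise; with a real Redis the TTL would be set on the key and expiry handled by the server.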

Example 4: Scale considerations

  • Use a distributed ID scheme (Snowflake, segment allocation) so that codes stay unique across nodes; for random or hash codes, retry on collision or switch to a different algorithm.
  • Put a CDN in front of the short URL endpoints if traffic is high.
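
To make the segment approach concrete, here is a small sketch in which each node reserves a block of IDs from a shared counter and hands them out locally. SegmentAllocator and SharedCounter are hypothetical names, and the in-process counter stands in for a central store such as a database row updated atomically.

```python
import threading


class SharedCounter:
    """Stand-in for the central store that hands out ID ranges."""

    def __init__(self, start=1):
        self._value = start
        self._lock = threading.Lock()

    def reserve(self, size):
        """Reserve `size` consecutive IDs and return the first one."""
        with self._lock:
            first = self._value
            self._value += size
            return first


class SegmentAllocator:
    """Serves IDs from a locally held range; refills when the range runs out."""

    def __init__(self, fetch_segment, size=1000):
        self._fetch = fetch_segment  # callable(size) -> first ID of a fresh range
        self._size = size
        self._next = 0
        self._end = 0                # exclusive; empty range forces a first fetch
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            if self._next >= self._end:
                self._next = self._fetch(self._size)
                self._end = self._next + self._size
            value = self._next
            self._next += 1
            return value
```

Each node touches the shared counter once per segment rather than once per ID, at the cost of IDs that are no longer globally ordered and gaps when a node restarts with an unused range.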

Core Mechanism / Behavior

  • Base-62: Encode numeric ID with 0-9, a-z, A-Z for compact representation.
  • Redirect: 301 cached by browser; 302 allows per-request handling and stats.
  • Duplicate URLs: Same long URL can map to same short code (hash) or different (random/increment); choose by product needs.
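
A minimal base-62 encoder and decoder might look like the following. The alphabet order (digits, then lowercase, then uppercase) follows the order listed above; other orders work equally well as long as encoding and decoding agree.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode(n):
    """Encode a non-negative integer ID as a base-62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))  # most significant digit first


def decode(code):
    """Decode a base-62 string back to the integer ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)  # raises ValueError on an invalid character
    return n
```

For sizing: 6 characters cover 62^6 = 56,800,235,584 IDs (about 56.8 billion), and 7 characters cover about 3.5 trillion.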

Key Rules

  • Short codes must be unique; use Snowflake, segments, etc. for distributed generation; retry or change the algorithm on collision.
  • 302 allows stats and target changes; 301 is cached by browser and harder to track.
  • Abuse control: Rate-limit by IP or user; filter malicious or sensitive URLs.
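
As one possible shape for the rate-limit rule, the sketch below counts requests per key (an IP or a user ID) in fixed time windows. FixedWindowLimiter and its parameters are hypothetical; a real deployment would likely keep the counters in a shared store and might prefer a sliding window or token bucket.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each `window_seconds` window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        # (key, window index) -> requests seen; old windows are never evicted here
        self._counts = defaultdict(int)

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        bucket = (key, int(now // self.window))
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True
```

For example, FixedWindowLimiter(limit=10, window_seconds=60) would let each IP create up to ten short links per minute before allow returns False.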

What's Next

See Pagination (for lists), Cache Strategy, Rate Limiting. See Distributed ID for code generation.