API Gateway Basics
The API gateway is the single entry point for external traffic. It handles routing, authentication, rate limiting, protocol translation, monitoring, and more. This article covers gateway responsibilities, common patterns, and deployment considerations, and includes a reference table of responsibilities.
Overview
- Routing: Forward requests to backend services by path, header, or parameter. Supports multi-version, canary, and A/B routing.
- Authentication: Validate token, API key, or certificate; inject user identity on success so backends do not need to re-validate.
- Rate limiting: By IP, userId, or API; protect backends from overload and abuse.
- Protocol translation: HTTP/HTTPS at the edge; convert to gRPC, Dubbo, or other internal protocols.
- Monitoring: Unified logs, metrics, traces; collect volume, latency, and error rate at a single point.
Example
Example 1: Responsibilities
| Responsibility | Description |
|---|---|
| Routing | Forward to correct service or instance |
| Auth | Validate identity (JWT, OAuth, API Key) |
| Authorization | Validate permissions (RBAC, ACL) |
| Rate limit | Control QPS and concurrency |
| Circuit breaker | Fail fast when downstream is unhealthy |
| Monitoring | Logs, metrics, traces |
| Protocol translation | HTTP → gRPC, REST → internal RPC |
Example 2: Common products
- Kong: Plugin-based; Lua; good ecosystem.
- APISIX: Dynamic routing; etcd; high performance.
- Spring Cloud Gateway: Java; reactive; integrates with Spring ecosystem.
- Nginx + Lua: Flexible; requires more custom code.
- AWS API Gateway: Managed; serverless; Lambda integration.
- Choose by performance, ecosystem, cloud integration, and operational overhead.
Example 3: Relation to BFF
- Gateway: Cross-cutting concerns (routing, auth, rate limit, protocol). Stateless; scales horizontally.
- BFF (Backend for Frontend): Aggregates and adapts APIs per client (web, mobile). Can sit behind the gateway or as a separate layer.
- Architecture: Gateway → BFF(s) → backend services, or Gateway → services directly when BFF is not needed.
Example 4: Routing by version
```yaml
routes:
  - path: /api/users
    backend: user-service-v1
    match:
      headers:
        X-Version: "1"
  - path: /api/users
    backend: user-service-v2
    match:
      headers:
        X-Version: "2"
```
- Enables canary and gradual rollout by routing a subset of traffic to new versions.
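The matching logic behind such a config can be sketched in a few lines. This is a minimal illustration, not how any particular gateway implements it; the `Route` class and `select_backend` function are hypothetical names, and the route list simply mirrors the two routes in the config above.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    path: str
    backend: str
    # Header name -> value the request must carry for this route to match.
    headers: dict = field(default_factory=dict)


# Hypothetical in-memory copy of the two routes from the config above.
ROUTES = [
    Route("/api/users", "user-service-v1", {"X-Version": "1"}),
    Route("/api/users", "user-service-v2", {"X-Version": "2"}),
]


def select_backend(path, headers, routes=ROUTES):
    """Return the backend of the first matching route, or None if none match."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value for name, value in headers.items()}
    for route in routes:
        if route.path != path:
            continue
        if all(normalized.get(name.lower()) == value
               for name, value in route.headers.items()):
            return route.backend
    return None
```

A request without an `X-Version` header matches neither route here; a real config would usually add a default route for that case.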
Example 5: Rate limiting at the gateway
- Limit by API key, IP, or user. Return 429 when exceeded; include `Retry-After` header.
- Protects backends; ensures fair usage; prevents abuse and DoS.
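As a rough sketch of the behavior described here, the following uses an in-process token bucket per caller key and returns 429 with a `Retry-After` header once the bucket is empty. The names `TokenBucket` and `check_rate_limit`, and the default rate and capacity, are assumptions for illustration. Because the state lives in one process, the limit applies per gateway instance; enforcing it across instances needs a shared store.

```python
import math
import time


class TokenBucket:
    """Refills at `rate` tokens per second, holding at most `capacity` tokens."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic() if now is None else now

    def try_acquire(self, now=None):
        """Return (allowed, retry_after_seconds)."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True, 0
        # Seconds until one full token is available, rounded up.
        return False, math.ceil((1 - self.tokens) / self.rate)


buckets = {}  # one bucket per caller key (API key, IP, or user id)


def check_rate_limit(key, rate=5.0, capacity=10, now=None):
    """Return (status, headers): 200, or 429 with a Retry-After header."""
    if key not in buckets:
        buckets[key] = TokenBucket(rate, capacity, now)
    allowed, retry_after = buckets[key].try_acquire(now)
    if allowed:
        return 200, {}
    return 429, {"Retry-After": str(retry_after)}
```

The `now` parameter exists only so the clock can be controlled in tests; production callers would omit it.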
Example 6: Auth flow
- Client sends token (header or cookie).
- Gateway validates token (signature, expiry, claims).
- Gateway injects user id / roles into headers (e.g. `X-User-Id`, `X-Roles`).
- Backend trusts gateway; no re-validation for non-sensitive paths.
- For sensitive operations, backend may perform additional checks.
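The flow above can be sketched with the standard library alone, assuming an HS256-signed JWT. Several details are assumptions made for illustration: the `authenticate` and `sign_token` names, the use of the `sub` claim as the user id, and a `roles` claim holding a list of strings. A production gateway would normally use a maintained JWT library and asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment):
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_token(claims, secret):
    """Helper that builds an HS256 token so the sketch is self-contained."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"),
                         hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def authenticate(token, secret, now=None):
    """Return identity headers to inject, or None if the token is rejected."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        # Pin the algorithm instead of trusting whatever the token declares.
        if header.get("alg") != "HS256":
            return None
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        # Constant-time comparison of the signature.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token
    if not isinstance(claims, dict) or "sub" not in claims:
        return None
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return None  # missing or expired `exp`
    roles = claims.get("roles", [])
    if not isinstance(roles, list):
        return None
    return {"X-User-Id": str(claims["sub"]),
            "X-Roles": ",".join(str(role) for role in roles)}
```

A `None` result would map to a 401 response; a dictionary result is merged into the headers of the request forwarded to the backend.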
Core Mechanism / Behavior
- Routing: Match path, header, or parameter; forward to backend via service discovery or static config. Can use weighted routing for canary.
- Auth: Validate JWT/OAuth; extract claims; propagate identity via headers. Backends trust gateway-injected headers.
- Rate limit: Token bucket or sliding window; shared state (e.g. Redis) for distributed limit; reject or queue on excess.
- Stateless: Gateway should not store session state; scale horizontally; use external store (Redis) for rate limit and cache.
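Weighted routing for canary releases can be done without any shared state by hashing a stable request key into a weighted slot. The sketch below is one possible approach, not a description of a specific product; `pick_backend` and the 90/10 split in the usage note are hypothetical.

```python
import hashlib


def pick_backend(request_key, weights):
    """Deterministically map a key (e.g. a user id) to a backend by weight.

    `weights` is a list of (backend, weight) pairs with positive integer weights.
    """
    total = sum(weight for _, weight in weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Hashing the key keeps each caller on the same backend across requests
    # and across gateway instances, with no shared state required.
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % total
    for backend, weight in weights:
        if slot < weight:
            return backend
        slot -= weight
    raise ValueError("weights must be positive")
```

For example, `pick_backend(user_id, [("user-service-v1", 90), ("user-service-v2", 10)])` would send roughly 10% of users to the new version. Hashing on the user id rather than picking randomly per request keeps a given user's experience consistent during the rollout.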
Deployment and Scaling
- High availability: Multiple gateway instances behind a load balancer; no single point of failure.
- Horizontal scaling: Stateless design; add instances to handle more traffic.
- Performance: Gateway can be a bottleneck; choose a performant implementation; offload TLS termination, caching, compression.
- Failover: Health checks; remove unhealthy instances from load balancer; circuit breaker for downstream.
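The circuit breaker mentioned for downstream failover can be illustrated with a small state machine. This is a simplified sketch under stated assumptions: the `CircuitBreaker` class, its thresholds, and its method names are hypothetical, and the half-open state lets every request through after the cool-down rather than a single trial request.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; retries after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self, now=None):
        """Return True if a request may be sent downstream."""
        now = time.monotonic() if now is None else now
        if self.opened_at is None:
            return True
        # After the cool-down, allow trial requests (half-open).
        return now - self.opened_at >= self.reset_after

    def record_success(self):
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now=None):
        now = time.monotonic() if now is None else now
        self.failures += 1
        if self.failures >= self.max_failures:
            # Open (or re-open) the circuit and restart the cool-down.
            self.opened_at = now
```

The gateway would call `allow()` before forwarding, return an error immediately (for example 503) when it is false, and report the outcome of each forwarded call with `record_success()` or `record_failure()`.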
Common Pitfalls
- Gateway as bottleneck: If the gateway is slow or overloaded, all traffic suffers. Profile and optimize; consider caching and connection pooling to backends.
- Auth bypass: Misconfigured routes (e.g. `permitAll` on admin paths) can expose sensitive endpoints. Audit all routes and default to deny.
- Rate limit too strict: Can block legitimate users. Tune by API and user tier; use different limits for different endpoints.
- Trusting headers blindly: Backends must not trust client-sent `X-User-Id`; only trust headers injected by the gateway after auth. Validate that the gateway strips or overwrites client headers.
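Two of these pitfalls, auth bypass and spoofed identity headers, lend themselves to a short sketch. The function names and the contents of `PUBLIC_PATHS` are hypothetical; the point is the shape of the checks: strip identity headers from the incoming request before adding the gateway's own, and require auth for every path that is not explicitly listed as public.

```python
# Identity headers that only the gateway is allowed to set (lowercase).
IDENTITY_HEADERS = {"x-user-id", "x-roles"}

# Hypothetical allow-list of paths that do not require authentication.
PUBLIC_PATHS = {"/health", "/api/login"}


def sanitize_headers(client_headers, verified_identity):
    """Drop client-supplied identity headers, then add the gateway's own."""
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in IDENTITY_HEADERS
    }
    # `verified_identity` comes from token validation, never from the client.
    forwarded.update(verified_identity)
    return forwarded


def requires_auth(path, public_paths=PUBLIC_PATHS):
    """Default deny: every path needs auth unless explicitly listed as public."""
    return path not in public_paths
```

With this shape, forgetting to register a new admin route leaves it protected rather than exposed, and a client that sends its own `X-User-Id` has it removed even on unauthenticated paths.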
Key Rules
- Gateway must be highly available and scalable; multi-instance, stateless, horizontal scaling.
- Centralize auth at the gateway; backends trust injected identity; add extra checks only for highly sensitive paths.
- Rate limit and circuit breaker by API, user, source; align with business SLA.
- Monitor the gateway itself; it is critical path; latency and errors here affect all clients.
What's Next
See Rate Limiter Design and Service Discovery. For gateway resilience and observability, see Circuit Breaker and Distributed Tracing.