Nginx & Load Balancing — Advanced Interview Handbook
A deep, easy-to-understand guide to Nginx, reverse proxying, and load balancing for senior interviews: how Nginx’s event model makes it fast, reverse proxy vs forward proxy, every load-balancing algorithm, health checks, TLS termination, caching, rate limiting, L4 vs L7, the tricky failure modes (sticky sessions, thundering herd, keepalive), plus a deep Q&A bank.
1. What Nginx Is & Why It’s Fast
Nginx is a high-performance web server, reverse proxy, load balancer, and HTTP cache. It’s famous for handling tens of thousands of concurrent connections with low memory.
Why it’s fast: an event-driven, asynchronous, non-blocking architecture — unlike the old thread/process-per-connection model (e.g., classic Apache prefork) that runs out of memory under high concurrency.
Senior framing: “Nginx solved the C10k problem with an event loop: a few worker processes each handle thousands of connections asynchronously instead of one thread per connection. That’s why it stays fast and memory-light under massive concurrency.”
Common roles: serve static files, reverse-proxy to app servers, balance load across backends, terminate TLS, cache responses, rate-limit, and act as an API gateway.
2. The Event-Driven Architecture
    ┌─────────── Master process (reads config, manages workers) ───────────┐
    │                                                                      │
    Worker process 1         Worker process 2               Worker process N
    (event loop)             (event loop)                   (event loop)
    ├─ conn A (idle)         ├─ conn D (reading)            ...
    ├─ conn B (writing)      ├─ conn E (waiting backend)
    └─ conn C (waiting)      └─ ...

    Each worker handles THOUSANDS of connections via non-blocking I/O (epoll/kqueue).

- Master process: reads config, binds ports, spawns/manages worker processes (doesn’t handle requests). Enables zero-downtime reloads.
- Worker processes: usually one per CPU core; each runs an event loop handling many connections with non-blocking I/O (`epoll` on Linux).
- No thread-per-connection → tiny memory per connection, no context-switch storms.
“Nginx vs Apache?” Apache (prefork) uses a thread/process per connection: simple but memory-heavy and limited under high concurrency. Nginx uses an event loop, which is far better at many concurrent/idle/slow connections (keepalive, slow clients). Apache is more flexible per-directory (`.htaccess`, embedded modules).
Trap: because a worker is a single event loop, one blocking operation stalls every connection that worker owns. Nginx can offload blocking disk reads to a thread pool (`aio threads`), but everything else on the request path has to stay non-blocking.
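The trap above can be reproduced in miniature with Python's `asyncio`, which is also a single-threaded event loop. This is only an analogy, not Nginx's implementation: `heartbeat` stands in for the other connections a worker is serving, and `blocking_read` / `offloaded_read` are hypothetical names for a slow disk read done on the loop versus in a thread pool.

```python
import asyncio
import time


async def heartbeat(ticks: list, interval: float = 0.01) -> None:
    """Stand-in for the other connections the same worker is serving."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_read(duration: float) -> None:
    time.sleep(duration)  # never yields: the whole event loop is frozen


async def offloaded_read(duration: float) -> None:
    loop = asyncio.get_running_loop()
    # Runs in the default thread pool, so the loop keeps serving other work.
    await loop.run_in_executor(None, time.sleep, duration)


async def ticks_during(slow_call, duration: float = 0.2) -> int:
    """Count how many heartbeats fire while slow_call is in progress."""
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat run its first step
    before = len(ticks)
    await slow_call(duration)
    progressed = len(ticks) - before
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return progressed


def compare() -> tuple:
    async def main():
        return await ticks_during(blocking_read), await ticks_during(offloaded_read)

    return asyncio.run(main())
```

Calling `compare()` returns the heartbeat counts for the two cases. The blocking case makes no progress at all while the read runs, which is what happens to every other connection on a blocked worker.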
3. Reverse Proxy vs Forward Proxy
| | Forward proxy | Reverse proxy (Nginx’s main role) |
|---|---|---|
| Sits in front of | Clients | Servers |
| Hides | The client from the server | The servers from the client |
| Example | Corporate web filter, VPN | Nginx in front of your app servers |
    Forward: Client → [Forward Proxy] → Internet → Server              (proxy acts for the client)
    Reverse: Client → Internet → [Reverse Proxy/Nginx] → App Servers   (proxy acts for the servers)

Why a reverse proxy: load balancing, TLS termination, caching, compression, security (hide backend topology, WAF), single entry point, request routing, rate limiting.
“A reverse proxy is the single front door to your backend — it centralizes TLS, load balancing, caching, and security so app servers stay simple and hidden.”
4. Load Balancing Fundamentals
A load balancer distributes incoming requests across multiple backend servers so no single one is overwhelmed, giving scalability, high availability, and redundancy.
                        ┌──→ App Server 1
    Client → [Nginx LB] ┼──→ App Server 2
                        └──→ App Server 3
    (one dies → traffic reroutes to the others)

Benefits: horizontal scaling, fault tolerance (route around dead servers), zero-downtime deploys (drain a server), better utilization.
Prerequisite — stateless servers: load balancing works cleanly only if app servers are stateless (no session stored locally), so any server can handle any request. Otherwise you need sticky sessions (§8) or shared session state (Redis).
    upstream backend {
        server app1:8080;
        server app2:8080;
        server app3:8080 backup;   # only used if others are down
    }

    server {
        location / {
            proxy_pass http://backend;
        }
    }

5. Load Balancing Algorithms
| Algorithm | How it picks a server | Best for |
|---|---|---|
| Round Robin (default) | Next server in rotation | Uniform servers/requests |
| Weighted Round Robin | Rotation weighted by capacity (weight=3) | Mixed server sizes |
| Least Connections | Server with fewest active connections | Variable request durations |
| Weighted Least Connections | Least conns, capacity-adjusted | Mixed sizes + variable load |
| IP Hash | Hash of client IP → same server | Sticky by IP (session affinity) |
| Hash (key) | Hash of a key (URL, header) | Cache locality, consistent routing |
| Least Time (Nginx Plus) | Lowest latency + fewest conns | Latency-sensitive |
| Random (two choices) | Pick 2 at random, choose better | Large fleets (great in practice) |
    upstream backend {
        least_conn;                  # algorithm
        server app1:8080 weight=3;
        server app2:8080;
    }

“Round robin is fine for uniform workloads; least connections is better when request durations vary (slow requests pile up on one server otherwise). Power-of-two-choices random scales beautifully across large fleets. Consistent hashing when I need cache locality or sticky routing with minimal disruption when servers change.”
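To make the selection rules in the table concrete, here is a minimal sketch of three of them. It assumes the balancer already tracks a weight and an active-connection count per server; the class and function names are hypothetical, and the "smooth" weighted rotation shown is one common way to interleave weighted picks, not a claim about Nginx's exact internals.

```python
import random


class SmoothWeightedRoundRobin:
    """Weighted rotation that interleaves picks instead of bunching them."""

    def __init__(self, weights: dict):
        self.weights = dict(weights)
        self.current = {name: 0 for name in weights}

    def pick(self) -> str:
        total = sum(self.weights.values())
        # Every server earns credit proportional to its weight...
        for name, weight in self.weights.items():
            self.current[name] += weight
        # ...the richest server is chosen and pays back the total.
        best = max(self.current, key=self.current.get)
        self.current[best] -= total
        return best


def least_connections(active: dict) -> str:
    """Pick the server with the fewest in-flight requests."""
    return min(active, key=active.get)


def power_of_two_choices(active: dict, rng: random.Random) -> str:
    """Sample two distinct servers at random and keep the less loaded one."""
    first, second = rng.sample(list(active), 2)
    return first if active[first] <= active[second] else second
```

With weights `{"app1": 3, "app2": 1}`, four consecutive `pick()` calls send three requests to `app1` and one to `app2`. Power-of-two-choices needs only two load lookups per request, which is why it is attractive when the fleet is too large to scan for a global minimum.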
Consistent hashing: keys and servers are hashed onto a ring, so adding or removing a server only remaps a small fraction of keys (versus `hash mod N`, which remaps most keys whenever N changes). Critical for distributed caches and sharding.
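A minimal hash-ring sketch shows why the remapping stays small. The class name `HashRing`, the choice of MD5 as a stable hash, and the 100 virtual nodes per server are assumptions made for illustration; production rings differ in hash function and tuning.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes: int = 100):
        self.vnodes = vnodes
        self._points = []  # sorted list of (hash, node) positions on the ring
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Several virtual points per server even out the share each one owns.
        for i in range(self.vnodes):
            bisect.insort(self._points, (stable_hash(f"{node}#{i}"), node))

    def lookup(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping around the ring.
        idx = bisect.bisect_left(self._points, (stable_hash(key), ""))
        return self._points[idx % len(self._points)][1]
```

When a fourth server is added to a three-server ring, the only keys that change owner are the ones the new server takes over; every other key keeps its server. With `stable_hash(key) % N`, going from 3 to 4 servers moves roughly three quarters of the keys.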
6. Layer 4 vs Layer 7 Load Balancing
| | L4 (Transport) | L7 (Application) |
|---|---|---|
| Operates on | TCP/UDP (IP + port) | HTTP (URL, headers, cookies) |
| Sees content? | No (just packets) | Yes (full request) |
| Routing | By connection | By path/host/header/cookie |
| Speed | Faster (less work) | Slightly slower (parses HTTP) |
| Features | Raw throughput | TLS termination, caching, path routing, rewrites, WAF |
| Example | AWS NLB, stream module | Nginx HTTP proxy, AWS ALB |
“L4 vs L7: when each?” L4 when you need raw speed and protocol-agnostic balancing (or to keep TLS end-to-end). L7 when you need content-based routing (`/api` → service A, `/img` → service B), TLS termination, caching, or header manipulation. Nginx does both (`http {}` = L7, `stream {}` = L4).
“L4 is a fast packet router; L7 understands HTTP, so it can route by path/host, terminate TLS, cache, and rewrite — at a small CPU cost.”
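The difference in what each layer can see is easy to show in a few lines. This is a toy sketch: `BACKENDS`, `ROUTES`, and both function names are hypothetical, and a real L4 balancer works on packets and connections rather than Python strings.

```python
import hashlib

BACKENDS = ["app1:8080", "app2:8080"]                # hypothetical L4 pool
ROUTES = {"/api": "service-a", "/img": "service-b"}  # hypothetical L7 routing table


def route_l4(src_ip: str, src_port: int) -> str:
    """L4: choose from connection metadata only; the payload stays opaque."""
    digest = hashlib.sha256(f"{src_ip}:{src_port}".encode()).digest()
    return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]


def route_l7(raw_request: bytes) -> str:
    """L7: parse the HTTP request line and route on the path."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii")
    _method, path, _version = request_line.split(" ")
    for prefix, service in ROUTES.items():
        if path.startswith(prefix):
            return service
    return "default"
```

`route_l4` never looks at the bytes on the connection, so it works for any TCP protocol and for encrypted traffic. `route_l7` has to read and parse the request first, which is the extra CPU cost the table refers to.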
7. Health Checks & Failover
A load balancer must stop sending traffic to dead/unhealthy servers.
- Passive health checks (open-source Nginx): mark a server down after N failed real requests (`max_fails`, `fail_timeout`), retry later.
- Active health checks (Nginx Plus / many LBs): periodically probe a `/health` endpoint proactively, before real users hit a broken server.
    upstream backend {
        server app1:8080 max_fails=3 fail_timeout=30s;
        server app2:8080 max_fails=3 fail_timeout=30s;
    }

Health endpoint design: a good `/health` checks the server and critical dependencies (DB/cache), but if it checks a shared DB, one DB blip can mark every server unhealthy at once (cascading outage). Separate liveness (am I up?) from readiness (can I serve?).
“Active checks catch failures before users do; passive checks need real failures first. I design health endpoints carefully so a shared-dependency hiccup doesn’t take the whole fleet out of rotation.”
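The passive model can be sketched as a small state machine per upstream server. This follows the general meaning of `max_fails` and `fail_timeout` described above (enough failures inside the window take the server out for the same duration); the class name and the exact bookkeeping are assumptions, not a copy of Nginx's accounting. Time is passed in explicitly so the behaviour is easy to reason about.

```python
class PassiveHealth:
    """Tracks one upstream server using only the outcome of real requests."""

    def __init__(self, max_fails: int = 3, fail_timeout: float = 30.0):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.failures = []       # timestamps of recent failures
        self.down_until = 0.0    # server is skipped until this time

    def available(self, now: float) -> bool:
        return now >= self.down_until

    def record(self, ok: bool, now: float) -> None:
        if ok:
            self.failures.clear()
            return
        # Keep only failures that fall inside the fail_timeout window.
        self.failures = [t for t in self.failures if now - t < self.fail_timeout]
        self.failures.append(now)
        if len(self.failures) >= self.max_fails:
            self.down_until = now + self.fail_timeout
            self.failures.clear()
```

Note what the sketch makes obvious: the server is only marked down after real user requests have already failed `max_fails` times, which is exactly the gap active checks are meant to close.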
8. Sticky Sessions (and why to avoid them)
Sticky session (session affinity) = always route a given user to the same backend (via IP hash or a cookie). Needed when a server stores session state locally.
Why avoid them: they break clean load balancing (uneven load), defeat easy scaling/failover (if that server dies, the user’s session is lost), and complicate deploys. The better fix is stateless servers with shared session state (Redis) or JWT tokens the client carries.
“How do you handle sessions behind a load balancer?” Preferred: make servers stateless — store sessions in Redis or use stateless JWTs, so any server serves any request. Sticky sessions are a fallback for legacy stateful apps.
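The "client carries the state" option can be illustrated with a signed token that any backend can verify without a lookup. This is a JWT-like sketch built on the standard library, not a real JWT implementation: `SECRET`, `issue_token`, and `read_token` are hypothetical names, and expiry, key rotation, and revocation are deliberately left out.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret; every app server behind the LB needs the same one.
SECRET = b"demo-secret-shared-by-all-app-servers"


def issue_token(session: dict) -> str:
    """Serialize the session and append an HMAC so it cannot be altered."""
    payload = base64.urlsafe_b64encode(json.dumps(session, sort_keys=True).encode())
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def read_token(token: str):
    """Return the session dict, or None if the signature does not match."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

Because verification needs only the shared secret, the load balancer is free to send each request to any server, and losing a server loses no sessions.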
9. TLS/SSL Termination
TLS termination = the reverse proxy decrypts HTTPS, then talks plain HTTP to backends (inside a trusted network). Centralizes certificates and offloads crypto from app servers.
- TLS termination — decrypt at the proxy (simple, fast backends, but internal traffic is plaintext).
- TLS passthrough (L4) — forward encrypted traffic straight to backends (end-to-end encryption, no L7 features).
- TLS re-encryption / end-to-end — terminate at the proxy, then re-encrypt to backends (best of both for zero-trust).
“I terminate TLS at Nginx to centralize certs (and use OCSP stapling, HTTP/2, modern ciphers), then re-encrypt to backends in zero-trust environments. For pure L4 or compliance, TLS passthrough keeps it end-to-end.”
10. Caching & the Thundering Herd
Nginx can cache responses from backends, serving repeat requests itself (huge load reduction).
    proxy_cache_path /var/cache/nginx keys_zone=mycache:10m max_size=1g inactive=60m;

    location / {
        proxy_cache mycache;
        proxy_cache_valid 200 10m;
        proxy_cache_use_stale error timeout updating;   # serve stale on backend trouble
        add_header X-Cache-Status $upstream_cache_status;
    }

- Cache key (default: scheme+host+URI); honors `Cache-Control`/`Expires`.
- `proxy_cache_use_stale`: serve stale content when the backend is down/slow (graceful degradation).
Thundering herd / cache stampede: when a popular cached item expires, many requests hit the backend simultaneously to regenerate it. Nginx fixes this with `proxy_cache_lock` (only one request fetches from the backend; others wait) and `proxy_cache_use_stale updating` (serve stale while one updates). Also stagger TTLs.
“I prevent cache stampedes with `proxy_cache_lock` + `use_stale updating` so a single request repopulates the cache while everyone else gets slightly-stale content instead of stampeding the origin.”
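The idea behind the cache lock (one request fetches, the rest wait) is sometimes called "single flight". The sketch below shows that idea with threads; `SingleFlightCache` is a hypothetical name, and TTLs and stale serving are left out to keep the locking visible. It illustrates the technique only and does not describe how Nginx implements it.

```python
import threading


class SingleFlightCache:
    def __init__(self, fetch):
        self._fetch = fetch          # function that calls the origin
        self._values = {}
        self._inflight = {}          # key -> Event set when the fetch finishes
        self._mutex = threading.Lock()

    def get(self, key):
        with self._mutex:
            if key in self._values:
                return self._values[key]
            done = self._inflight.get(key)
            leader = done is None
            if leader:
                done = self._inflight[key] = threading.Event()
        if not leader:
            done.wait()              # follower: wait for the leader's fetch
            return self.get(key)     # cached now, or retry if the fetch failed
        try:
            value = self._fetch(key)  # leader: the only call to the origin
            with self._mutex:
                self._values[key] = value
            return value
        finally:
            with self._mutex:
                del self._inflight[key]
            done.set()
```

If twenty threads ask for the same missing key at once, the origin function runs a single time and all twenty callers receive its result.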
11. Rate Limiting & Connection Limiting
Protect backends from abuse/overload.
- Request rate limiting: `limit_req` uses a leaky bucket, a steady rate (`rate=10r/s`) with an optional burst buffer.
- Connection limiting: `limit_conn` caps concurrent connections per key (IP).
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

    location /api/ {
        limit_req zone=api burst=20 nodelay;   # serve short bursts immediately, reject the excess (503 by default)
    }

Leaky bucket vs token bucket: leaky bucket enforces a smooth constant output rate (Nginx’s model); token bucket allows bursts up to a bucket size then refills. `burst` + `nodelay` approximates bursty token-bucket behavior.
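A small model helps when reasoning about how `rate` and `burst` interact. The sketch below tracks an "excess" level that drains at the configured rate and rises by one per accepted request; it models the `nodelay` case only (requests are either served at once or rejected). The class name and the exact arithmetic are assumptions for illustration rather than Nginx's source, and time is injected so the numbers can be checked by hand.

```python
class LeakyBucketLimiter:
    def __init__(self, rate: float, burst: int):
        self.rate = rate      # requests drained per second
        self.burst = burst    # extra requests tolerated above the steady rate
        self.excess = 0.0     # current level of the bucket, in requests
        self.last = None      # time of the last accepted request

    def allow(self, now: float) -> bool:
        if self.last is None:  # first request seen for this key
            self.last = now
            return True
        drained = self.rate * (now - self.last)
        excess = max(0.0, self.excess - drained + 1.0)
        if excess > self.burst:
            return False       # over the limit: reject and leave state unchanged
        self.excess, self.last = excess, now
        return True
```

With `rate=10` and `burst=20`, a spike of 25 requests at the same instant lets 21 through (the first request plus the burst of 20) and rejects 4. One second later the bucket has drained by 10, so 10 more requests fit before rejections start again.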
Rate-limit key trap: limiting by client IP punishes users behind a shared NAT/proxy and is fooled by spoofing/rotating IPs. Behind a proxy, use the real client IP from `X-Forwarded-For` (configured via the realip module: `set_real_ip_from`, `real_ip_header`), or limit by API key/user.
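Picking the "real" client IP safely means trusting `X-Forwarded-For` only as far as your own proxies wrote it. A minimal sketch, assuming a hypothetical trusted range of `10.0.0.0/8` and well-formed addresses in the header: walk the chain from the right and stop at the first address that is not one of your proxies.

```python
import ipaddress

# Hypothetical: the address ranges of your own load balancers / CDN.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def is_trusted(ip: str) -> bool:
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # unparsable entries are never treated as trusted
    return any(addr in net for net in TRUSTED_PROXIES)


def real_client_ip(peer_ip: str, x_forwarded_for: str) -> str:
    """Return the address to rate-limit on for this request."""
    if not is_trusted(peer_ip):
        return peer_ip  # header came from an unknown sender: ignore it entirely
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # The rightmost entries were appended by our own proxies; skip those.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return hops[0] if hops else peer_ip
```

A client that sends its own `X-Forwarded-For: 1.2.3.4` cannot choose its rate-limit key this way: the forged entry sits to the left of the address the first trusted proxy recorded, so it is never reached.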
12. Compression, Buffering & Keepalive
- Gzip/Brotli compression: shrink responses (`gzip on`); beware BREACH with compressed secret-bearing responses.
- Proxy buffering: Nginx buffers the backend response so slow clients don’t tie up backend workers (`proxy_buffering on`). Disable for streaming/SSE.
- Upstream keepalive: reuse connections to backends (`keepalive 32`) to avoid a TCP/TLS handshake per request. Must also set `proxy_http_version 1.1` and clear the `Connection` header, or keepalive silently won’t work.
- Slowloris protection: timeouts (`client_body_timeout`, `client_header_timeout`) defend against slow-drip connection-exhaustion attacks (Nginx’s event model already helps).
13. Nginx Config Essentials
    http {
        upstream app {
            least_conn;
            server a:8080;
            server b:8080;
            keepalive 32;
        }

        server {
            listen 443 ssl http2;
            server_name example.com;
            ssl_certificate     /etc/ssl/cert.pem;
            ssl_certificate_key /etc/ssl/key.pem;

            location /api/ {
                proxy_pass http://app;
                proxy_http_version 1.1;            # needed for upstream keepalive
                proxy_set_header Connection "";    # needed for upstream keepalive
                proxy_set_header Host $host;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Proto $scheme;
            }

            location /static/ {
                root /var/www;
                expires 30d;
            }
        }
    }

- `location` matching order: exact `=` > longest prefix if it is marked `^~` > regex `~`/`~*` (first match in config order wins) > longest prefix. A frequent source of “wrong block matched” bugs.
- Zero-downtime reload: `nginx -s reload`. The master starts new workers with the new config and drains the old ones. No dropped connections.
- Always forward `X-Forwarded-For`/`X-Forwarded-Proto` so backends see the real client IP/scheme.
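The `location` matching order is easier to remember once it is written out as a procedure. The function below is a simplified model of that order: it takes hypothetical `(modifier, pattern)` pairs in config order and ignores nested and named locations, so treat it as a study aid rather than a reimplementation.

```python
import re


def match_location(uri: str, locations: list):
    """Return the (modifier, pattern) that wins, or None.

    modifier is "=", "^~", "~", "~*", or "" for a plain prefix.
    """
    # 1. An exact match wins immediately.
    for mod, pat in locations:
        if mod == "=" and uri == pat:
            return (mod, pat)
    # 2. Remember the longest matching prefix (plain or ^~).
    best = None
    for mod, pat in locations:
        if mod in ("", "^~") and uri.startswith(pat):
            if best is None or len(pat) > len(best[1]):
                best = (mod, pat)
    # 3. If that longest prefix is marked ^~, regexes are skipped.
    if best and best[0] == "^~":
        return best
    # 4. Regexes are tried in config order; the first match wins.
    for mod, pat in locations:
        if mod == "~" and re.search(pat, uri):
            return (mod, pat)
        if mod == "~*" and re.search(pat, uri, re.IGNORECASE):
            return (mod, pat)
    # 5. Otherwise fall back to the longest prefix.
    return best
```

The classic surprise is step 4 beating step 5: with a plain `/static/` prefix and a `~* \.(png|jpg)$` regex, a request for `/static/logo.PNG` lands in the regex block, not the `/static/` block, unless the prefix is marked `^~`.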
14. High Availability (no single point of failure)
The load balancer itself must not be a single point of failure.
- Active-passive: two Nginx nodes share a Virtual IP (VIP) via keepalived (VRRP); if the active dies, the passive takes the VIP over.
- Active-active: DNS round robin or an upstream L4 LB distributes across multiple Nginx nodes.
- Cloud: a managed LB (ALB/NLB) is itself redundant across AZs.
“How do you make the load balancer highly available?” Run at least two with a floating VIP (keepalived/VRRP) for active-passive failover, or multiple behind DNS/anycast for active-active. Never a single LB.
15. Advanced Gotchas (senior-level)
- Upstream keepalive needs `proxy_http_version 1.1` + cleared `Connection` header; otherwise it silently doesn’t reuse connections.
- `location` precedence (=, ^~, regex, prefix) trips people up.
- Sticky sessions undermine scaling/failover; prefer stateless + shared session store.
- Health checks on a shared DB can mark the whole fleet unhealthy at once.
- Cache stampede: use `proxy_cache_lock` + `use_stale updating`.
- Rate limiting by IP breaks behind NAT/CDN; use real client IP / API keys.
- `X-Forwarded-For` spoofing: only trust it from known proxies (`set_real_ip_from`).
- L4 can’t do path routing or TLS termination; L7 can but costs CPU.
- Blocking a worker blocks thousands of connections; keep workers non-blocking.
- Buffering vs streaming: disable proxy buffering for SSE/WebSocket/large streams.
- WebSocket proxying needs `Upgrade`/`Connection` headers explicitly set.
- gzip + secrets = BREACH risk; don’t compress sensitive responses with reflected input.
“Senior signals: event-loop model (blocking is fatal), L4-vs-L7 trade-offs, stateless servers over sticky sessions, stampede protection, careful health-check & X-Forwarded-For handling, and the LB itself being HA.”
16. Interview Q&A Bank
Q: Why is Nginx faster than thread-per-connection servers?
Event-driven, non-blocking workers (one per core) each handle thousands of connections via epoll, instead of a thread/process per connection — far less memory and no context-switch storms (solves C10k).
Q: Reverse vs forward proxy?
Forward proxy fronts clients (hides them from servers); reverse proxy fronts servers (hides them from clients) and does LB, TLS termination, caching, and security. Nginx is mainly a reverse proxy.
Q: Name load-balancing algorithms and when to use them.
Round robin (uniform), weighted (mixed capacity), least connections (variable durations), IP/key hash (affinity/cache locality), power-of-two-choices random (large fleets), consistent hashing (minimal remap on changes).
Q: L4 vs L7 load balancing?
L4 routes by TCP/IP+port (fast, content-blind, keeps TLS end-to-end); L7 understands HTTP (path/host/header routing, TLS termination, caching) at a CPU cost.
Q: What is consistent hashing and why?
Hash keys onto a ring so adding/removing a server remaps only a small fraction of keys (vs mod-N remapping everything). Essential for distributed caches/sharding and sticky routing.
Q: How do health checks work?
Passive (mark down after real failures: max_fails/fail_timeout) and active (proactive /health probes). Design endpoints so a shared-dependency blip doesn’t fail the whole fleet; separate liveness from readiness.
Q: How do you handle sessions behind an LB?
Make servers stateless with shared sessions (Redis) or JWTs. Sticky sessions (IP hash/cookie) are a legacy fallback that hurts scaling and failover.
Q: What is TLS termination?
The proxy decrypts HTTPS and talks HTTP (or re-encrypts) to backends — centralizing certs and offloading crypto. Passthrough (L4) keeps it end-to-end but loses L7 features.
Q: What’s a thundering herd / cache stampede and how do you prevent it in Nginx?
Many requests regenerate an expired hot cache entry at once, hammering the origin. Use proxy_cache_lock (one fetch, others wait) + proxy_cache_use_stale updating + TTL jitter.
Q: How does Nginx rate limit?
limit_req (leaky bucket, rate + burst) and limit_conn (concurrent connections per key). Beware limiting by IP behind NAT/CDN — use real client IP or API keys.
Q: How do you make the load balancer itself HA?
Active-passive with a floating VIP via keepalived/VRRP, or active-active behind DNS/anycast or a redundant cloud LB. Never a single LB.
Q: Common upstream keepalive gotcha?
keepalive to backends requires proxy_http_version 1.1 and clearing the Connection header; otherwise Nginx opens a new connection per request and you lose the benefit.
Q: How do you proxy WebSockets?
Set proxy_http_version 1.1 and forward Upgrade/Connection headers; disable buffering for the stream.
17. Cheat Sheet
- Nginx = event-driven reverse proxy / LB / web server / cache; workers (one/core) handle thousands of connections (solved C10k). Blocking a worker is fatal.
- Reverse proxy = single front door: LB, TLS, caching, security; hides backends.
- Algorithms: round robin, weighted, least connections, IP/key hash, power-of-two random, consistent hashing (minimal remap).
- L4 (fast, TCP, content-blind) vs L7 (HTTP-aware: path routing, TLS term, caching).
- Health checks: passive (max_fails) + active (/health); don’t let a shared DB fail the fleet.
- Sessions: prefer stateless + Redis/JWT over sticky sessions.
- TLS termination (or passthrough/re-encrypt) centralizes certs.
- Caching: `proxy_cache` + `proxy_cache_lock` + `use_stale updating` to beat stampedes.
- Rate limit: `limit_req` (leaky bucket, rate+burst), `limit_conn`; key by real client IP/API key.
- Keepalive upstream needs HTTP/1.1 + cleared Connection header.
- `location` order: `=` > `^~` > regex > prefix. Forward `X-Forwarded-For`/`Proto`.
- HA LB: keepalived/VRRP VIP (active-passive) or DNS/cloud LB (active-active).
- Zero-downtime reload: `nginx -s reload`.
End of handbook. The senior signal: Nginx is an event loop (blocking kills it), and good load balancing means stateless backends, the right algorithm, careful health checks, stampede protection, and an HA load balancer.
© 2026 Saeid Darvishghazvini · Open source for the community · CC BY-NC 4.0