
Load Balancing

📚 Overview

A load balancer distributes incoming traffic across multiple servers to:

  • Avoid a single point of failure
  • Improve responsiveness
  • Enable horizontal scaling
  • Handle failover

🏗️ Architecture

🎯 Load Balancing Types​

Layer 4 (Transport Layer)​

Pros:

  • Fast (routes on IP address and port without inspecting the payload)
  • Protocol-agnostic
  • Low latency

Cons:

  • No content-based routing
  • Limited visibility

Technologies:

  • HAProxy
  • NGINX (stream module)
  • AWS NLB
  • Azure Load Balancer

Layer 7 (Application Layer)​

Pros:

  • Content-aware routing
  • SSL termination
  • Request/response manipulation

Cons:

  • Higher latency
  • More compute intensive

Technologies:

  • NGINX Plus
  • AWS ALB
  • Azure Application Gateway
  • Google Cloud Load Balancing
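
Content-aware routing is the main thing Layer 7 adds over Layer 4, so a small sketch may help. The pool names and path prefixes below are hypothetical and chosen only for illustration; a real balancer would also match on host, headers, or method.

```typescript
// Hypothetical pools and prefixes, used only to illustrate path-based routing.
type Route = { prefix: string; pool: string[] };

const routes: Route[] = [
  { prefix: "/api/", pool: ["api-1", "api-2"] },
  { prefix: "/static/", pool: ["static-1"] },
];
const defaultPool: string[] = ["web-1", "web-2"];

// Pick the pool whose prefix matches the request path (longest prefix wins).
function selectPool(path: string): string[] {
  let best: Route | undefined;
  for (const route of routes) {
    const matches = path.startsWith(route.prefix);
    if (matches && (best === undefined || route.prefix.length > best.prefix.length)) {
      best = route;
    }
  }
  return best ? best.pool : defaultPool;
}
```

A Layer 4 balancer cannot make this decision because it never parses the HTTP request line. Once a pool is selected, any of the algorithms in the next section can pick a server within it.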

⚙️ Load Balancing Algorithms

1. Round Robin​

// Sequential distribution
const servers = ['s1', 's2', 's3', 's4'];
let currentIndex = 0;

function getNextServer() {
  const server = servers[currentIndex];
  currentIndex = (currentIndex + 1) % servers.length; // wrap around after the last server
  return server;
}

// Distribution: s1, s2, s3, s4, s1, s2, ...

Pros: Simple, fair distribution
Cons: Doesn't consider server load

2. Weighted Round Robin​

const servers = [
  { name: 's1', weight: 3 },
  { name: 's2', weight: 2 },
  { name: 's3', weight: 1 }
];

// Distribution: s1, s1, s1, s2, s2, s3, s1, s1, ...

Use Case: Different server capacities
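
The snippet above shows only the data, so here is a minimal sketch of one way to produce that distribution: expand each server into as many slots as its weight, then cycle through the slots. The name `createWeightedRoundRobin` is an assumption made for this example.

```typescript
type WeightedServer = { name: string; weight: number };

// Expand each server into `weight` consecutive slots, then cycle through the slots.
function createWeightedRoundRobin(servers: WeightedServer[]): () => string {
  const slots: string[] = [];
  for (const s of servers) {
    for (let i = 0; i < s.weight; i++) slots.push(s.name);
  }
  if (slots.length === 0) {
    throw new Error("at least one server with a positive weight is required");
  }
  let index = 0;
  return () => {
    const name = slots[index];
    index = (index + 1) % slots.length;
    return name;
  };
}

const next = createWeightedRoundRobin([
  { name: "s1", weight: 3 },
  { name: "s2", weight: 2 },
  { name: "s3", weight: 1 },
]);
const firstSeven = Array.from({ length: 7 }, () => next());
// firstSeven: s1, s1, s1, s2, s2, s3, s1
```

This simple version sends bursts of consecutive requests to the heaviest server. Some balancers interleave the picks instead (often called smooth weighted round robin), which keeps the same ratios while spreading the load more evenly over time.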

3. Least Connections​

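const servers = [
  { name: 's1', connections: 5 },
  { name: 's2', connections: 2 },
  { name: 's3', connections: 8 }
];

// Pick the server with the fewest active connections.
// reduce() is used instead of sort() so the shared list is not reordered.
function getLeastConnected() {
  return servers.reduce((min, s) => (s.connections < min.connections ? s : min));
}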

Pros: Accounts for current load
Use Case: Varying request durations
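
Least connections only works if the counters are kept up to date. The sketch below, under the assumption that the balancer sees both the start and the end of every request, increments on selection and decrements on completion. `LeastConnectionsBalancer`, `acquire`, and `release` are names invented for this example.

```typescript
type Backend = { name: string; connections: number };

class LeastConnectionsBalancer {
  constructor(private backends: Backend[]) {
    if (backends.length === 0) throw new Error("no backends configured");
  }

  // Pick the backend with the fewest in-flight requests and count the new one.
  acquire(): Backend {
    const chosen = this.backends.reduce((min, b) =>
      b.connections < min.connections ? b : min
    );
    chosen.connections++;
    return chosen;
  }

  // Call when the request finishes (success or failure) so the count stays accurate.
  release(backend: Backend): void {
    backend.connections = Math.max(0, backend.connections - 1);
  }
}
```

Forgetting to call `release` on error paths is the usual failure mode here: the counter drifts upward and the backend slowly stops receiving traffic.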

4. IP Hash​

// hashFunction is assumed to map a string to a non-negative integer
function getServerByIP(clientIP) {
  const hash = hashFunction(clientIP);
  const index = hash % servers.length;
  return servers[index];
}

Pros: Session persistence (sticky sessions)
Cons: Uneven distribution with few clients; most clients are remapped to a different server when the server list changes
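
`hashFunction` is left undefined in the snippet above. As one possible stand-in, the sketch below uses 32-bit FNV-1a, a simple non-cryptographic hash. The choice of FNV-1a and the names `fnv1a` and `pickServerByIp` are assumptions for illustration; any stable hash with a reasonably uniform output would do.

```typescript
// 32-bit FNV-1a over the string's character codes (equivalent to bytes for ASCII input
// such as IPv4/IPv6 text).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, keeping 32 bits
  }
  return hash >>> 0; // convert to unsigned so the modulo below is never negative
}

// The same client IP always maps to the same server while the list is unchanged.
function pickServerByIp(clientIp: string, servers: string[]): string {
  if (servers.length === 0) throw new Error("no servers configured");
  return servers[fnv1a(clientIp) % servers.length];
}
```

The `>>> 0` matters: JavaScript bitwise operators produce signed 32-bit values, and a negative hash would yield a negative index.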

5. Least Response Time​

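const servers = [
  { name: 's1', avgResponseTime: 50 },
  { name: 's2', avgResponseTime: 200 },
  { name: 's3', avgResponseTime: 100 }
];

// Pick the server with the lowest average response time (in ms).
// reduce() is used instead of sort() so the shared list is not reordered.
function getFastestServer() {
  return servers.reduce((min, s) => (s.avgResponseTime < min.avgResponseTime ? s : min));
}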

Pros: Routes to fastest server
Cons: Requires active monitoring
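
The `avgResponseTime` values have to come from somewhere. One common approach, sketched here as an assumption rather than as the method any particular balancer uses, is an exponentially weighted moving average (EWMA) updated after every response. `ResponseTimeTracker` and its `alpha` parameter are hypothetical names.

```typescript
class ResponseTimeTracker {
  private averages = new Map<string, number>();

  // alpha near 1 reacts quickly to new samples; alpha near 0 smooths more heavily.
  constructor(private alpha: number = 0.2) {}

  // Fold one measured response time (in ms) into the server's running average.
  record(server: string, durationMs: number): void {
    const previous = this.averages.get(server);
    const next =
      previous === undefined
        ? durationMs
        : this.alpha * durationMs + (1 - this.alpha) * previous;
    this.averages.set(server, next);
  }

  // Lowest average wins; servers with no samples yet count as 0 so they get tried first.
  fastest(servers: string[]): string {
    if (servers.length === 0) throw new Error("no servers configured");
    let best = servers[0];
    let bestAvg = this.averages.get(best) ?? 0;
    for (const s of servers.slice(1)) {
      const avg = this.averages.get(s) ?? 0;
      if (avg < bestAvg) {
        best = s;
        bestAvg = avg;
      }
    }
    return best;
  }
}
```

An EWMA needs only one number per server and lets old measurements fade out, so a server that was slow a minute ago can win traffic back once it recovers.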

🔧 Health Checks

Passive Health Checks​

// Detect failures from the responses to real client requests
if (response.status >= 500 || connectionError) {
  markServerUnhealthy(server);
  retryCount++;
}
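
Marking a server unhealthy on a single error is usually too aggressive. The sketch below adds a failure threshold and a cool-down period, loosely modeled on the `max_fails` and `fail_timeout` idea. The class name, the consecutive-failure rule, and the injectable `now` parameter are assumptions made for this example.

```typescript
type BackendState = { failures: number; unhealthyUntil: number };

class PassiveHealthTracker {
  private states = new Map<string, BackendState>();

  constructor(private maxFails: number = 3, private failTimeoutMs: number = 30000) {}

  // Record the outcome of a real request; `now` is injectable to keep the logic testable.
  report(server: string, ok: boolean, now: number = Date.now()): void {
    const state = this.states.get(server) ?? { failures: 0, unhealthyUntil: 0 };
    if (ok) {
      state.failures = 0; // any success resets the consecutive-failure count
    } else {
      state.failures++;
      if (state.failures >= this.maxFails) {
        state.unhealthyUntil = now + this.failTimeoutMs; // take out of rotation
        state.failures = 0;
      }
    }
    this.states.set(server, state);
  }

  // A server is eligible again once its cool-down period has elapsed.
  isHealthy(server: string, now: number = Date.now()): boolean {
    const state = this.states.get(server);
    return state === undefined || now >= state.unhealthyUntil;
  }
}
```

After the cool-down, the server receives real traffic again, so the first few requests act as the probe. That is the main trade-off of passive checks: clients see the failures that reveal the problem.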

Active Health Checks​

// Periodically probe servers
async function healthCheck(server) {
  try {
    // fetch() has no `timeout` option; abort the request after 5 seconds instead
    const response = await fetch(`${server}/health`, {
      signal: AbortSignal.timeout(5000)
    });
    return response.ok; // true for any 2xx status
  } catch {
    return false; // network error or timeout
  }
}

// Check every 5 seconds
setInterval(async () => {
  for (const server of servers) {
    const healthy = await healthCheck(server);
    updateServerStatus(server, healthy);
  }
}, 5000);

Health Check Configuration​

# Passive checks: after 3 failed attempts within 30s, skip the server for 30s
upstream backend {
    server 10.0.0.1:80 max_fails=3 fail_timeout=30s;
    server 10.0.0.2:80 max_fails=3 fail_timeout=30s;
    server 10.0.0.3:80 max_fails=3 fail_timeout=30s;
}

# /health endpoint
server {
    location /health {
        access_log off;
        default_type text/plain;  # sets Content-Type without adding a duplicate header
        return 200 "healthy\n";
    }
}

🌐 Session Persistence (Sticky Sessions)

Why Needed?​

  • Stateful applications
  • WebSocket connections
  • In-memory sessions

Methods​

1. IP-based:

const server = servers[hash(clientIP) % servers.length];

2. Cookie-based:

// Set cookie on first request
response.cookies.set('server_id', 's1');

// Route based on cookie
const serverId = request.cookies.get('server_id');
const server = servers.find(s => s.id === serverId);

3. Consistent Hashing:

function consistentHash(key, servers) {
  const hash = hashFunction(key);
  // Place every server on the ring at the position given by its own hash
  const sortedServers = servers.map(s => ({
    server: s,
    hash: hashFunction(s.id)
  })).sort((a, b) => a.hash - b.hash);

  // Find first server with hash > key hash
  for (const s of sortedServers) {
    if (s.hash > hash) return s.server;
  }
  return sortedServers[0].server; // Wrap around
}
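
With only one ring position per server, the function above can split keys very unevenly. A common refinement is to give each server many positions ("virtual nodes"). The sketch below is one way to do that, assuming FNV-1a as the hash and 100 virtual nodes per server; `HashRing`, `hash32`, and the `server#i` labeling scheme are illustrative choices, not part of the original snippet.

```typescript
// 32-bit FNV-1a, used here only as a convenient stable hash.
function hash32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

class HashRing {
  private ring: { point: number; server: string }[] = [];

  constructor(servers: string[], private replicas: number = 100) {
    for (const s of servers) this.add(s);
  }

  // Place `replicas` virtual nodes for the server on the ring.
  add(server: string): void {
    for (let i = 0; i < this.replicas; i++) {
      this.ring.push({ point: hash32(`${server}#${i}`), server });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  remove(server: string): void {
    this.ring = this.ring.filter((entry) => entry.server !== server);
  }

  // Binary search for the first ring point at or after the key's hash, wrapping around.
  lookup(key: string): string {
    if (this.ring.length === 0) throw new Error("ring is empty");
    const h = hash32(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].server;
  }
}
```

The property that matters for sticky sessions is that removing a server only moves the keys that lived on it. Every other key keeps its server, unlike `hash % servers.length`, where most keys move.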

🔒 SSL Termination

Benefits:

  • Reduce backend load
  • Centralized certificate management
  • Simplified backend configuration

http {
    server {
        listen 443 ssl;
        ssl_certificate /path/to/cert.pem;
        ssl_certificate_key /path/to/key.pem;

        location / {
            proxy_pass http://backend; # HTTP to backend
        }
    }
}

🚦 Traffic Management

Circuit Breaker​

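class CircuitBreaker {
  private failures = 0;
  private lastFailureTime = 0;
  private state: 'CLOSED' | 'OPEN' | 'HALF_OPEN' = 'CLOSED';

  // threshold: failures before the circuit opens
  // timeout: ms to wait in OPEN before allowing a trial request
  // (the default values are illustrative)
  constructor(private threshold = 5, private timeout = 30000) {}

  async execute<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime > this.timeout) {
        this.state = 'HALF_OPEN'; // let a trial request through
      } else {
        throw new Error('Circuit breaker is OPEN');
      }
    }

    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  private onFailure() {
    this.failures++;
    this.lastFailureTime = Date.now();
    // A failed trial request in HALF_OPEN reopens the circuit immediately
    if (this.state === 'HALF_OPEN' || this.failures >= this.threshold) {
      this.state = 'OPEN';
    }
  }
}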

Rate Limiting​

// Token bucket
class RateLimiter {
  private tokens: number;
  private lastRefill: number;

  // capacity: maximum burst size; refillRate: tokens added per second
  constructor(private capacity: number, private refillRate: number) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  allowRequest(): boolean {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens--;
      return true;
    }
    return false;
  }

  private refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000; // seconds since last refill
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsed * this.refillRate
    );
    this.lastRefill = now;
  }
}
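
The limiter above keeps a single bucket, so one noisy client can use up everyone's tokens. A load balancer would more likely keep one bucket per client. The sketch below shows that variation under stated assumptions: clients are identified by an opaque string (an IP address or API key, for example), and `now` is passed in so the behavior can be checked without waiting. `PerClientRateLimiter` is a hypothetical name.

```typescript
type Bucket = { tokens: number; lastRefill: number };

class PerClientRateLimiter {
  private buckets = new Map<string, Bucket>();

  // capacity: maximum burst per client; refillPerSecond: sustained rate per client
  constructor(private capacity: number, private refillPerSecond: number) {}

  allowRequest(clientId: string, now: number = Date.now()): boolean {
    // New clients start with a full bucket.
    const bucket = this.buckets.get(clientId) ?? { tokens: this.capacity, lastRefill: now };

    // Refill based on elapsed time, capped at capacity.
    const elapsedSeconds = (now - bucket.lastRefill) / 1000;
    bucket.tokens = Math.min(
      this.capacity,
      bucket.tokens + elapsedSeconds * this.refillPerSecond
    );
    bucket.lastRefill = now;

    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(clientId, bucket);
    return allowed;
  }
}
```

This sketch never evicts idle buckets, so the map grows with the number of distinct clients. A production version would need some form of expiry.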

☁️ Cloud Load Balancers​

AWS​

Service            | Layer     | Features
CLB                | Layer 4/7 | Legacy, basic
ALB                | Layer 7   | Path-based routing, WAF
NLB                | Layer 4   | Ultra-low latency, static IPs
Global Accelerator | Global    | Anycast, cross-region

Azure​

Service     | Layer   | Features
Basic LB    | Layer 4 | Simple, low cost
Standard LB | Layer 4 | AZ support, SLA
App Gateway | Layer 7 | WAF, path-based

GCP​

Service    | Layer   | Features
Network LB | Layer 4 | Regional, passthrough
HTTP(S) LB | Layer 7 | Global anycast IP, Cloud CDN integration

❓ Interview Questions​

Easy​

  1. What is the difference between L4 and L7 load balancers?
  2. How do you handle server failures?
  3. What is a health check?

Medium​

  1. Design a rate limiter
  2. How does SSL termination work?
  3. When should you use sticky sessions?

Hard​

  1. Design a load balancer from scratch
  2. How do you achieve zero-downtime deployments?
  3. Design a global load balancing system