# Load Balancing
## 📚 Overview

A load balancer distributes incoming traffic across multiple servers to:

- Eliminate single points of failure
- Improve responsiveness
- Enable horizontal scaling
- Handle failover
## 🏗️ Architecture
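An illustrative topology (sketch only; production deployments typically add a second balancer for redundancy):

```
 Clients ──▶ ┌───────────────┐ ──▶ Server 1
             │ Load Balancer │ ──▶ Server 2
             └───────────────┘ ──▶ Server 3
```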
## 🎯 Load Balancing Types

### Layer 4 (Transport Layer)

Routes traffic using TCP/UDP information (source/destination IP and port) without inspecting payloads.

**Pros:**

- Fast (no packet inspection)
- Protocol-agnostic
- Low latency

**Cons:**

- No content-based routing
- Limited visibility into requests

**Technologies:**

- HAProxy
- NGINX (stream module)
- AWS NLB
- Azure Load Balancer
### Layer 7 (Application Layer)

Routes traffic using application-level data (HTTP headers, URL paths, cookies), which requires parsing each request.

**Pros:**

- Content-aware routing
- SSL termination
- Request/response manipulation

**Cons:**

- Higher latency
- More compute-intensive

**Technologies:**

- NGINX Plus
- AWS ALB
- Azure Application Gateway
- Google Cloud Load Balancing
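To make "content-aware routing" concrete, a minimal sketch (the path prefixes and pool names are invented for illustration):

```typescript
// Route by URL path — the kind of decision only an L7 balancer can make,
// since it requires parsing the HTTP request
function routeByPath(path: string): string {
  if (path.startsWith('/api/')) return 'api-pool';
  if (path.startsWith('/static/')) return 'cdn-pool';
  return 'web-pool';
}

routeByPath('/api/users');      // → 'api-pool'
routeByPath('/static/app.css'); // → 'cdn-pool'
routeByPath('/checkout');       // → 'web-pool'
```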
## ⚙️ Load Balancing Algorithms
### 1. Round Robin

```javascript
// Sequential distribution: each request goes to the next server in order
const servers = ['s1', 's2', 's3', 's4'];
let currentIndex = 0;

function getNextServer() {
  const server = servers[currentIndex];
  currentIndex = (currentIndex + 1) % servers.length;
  return server;
}
// Distribution: s1, s2, s3, s4, s1, s2, ...
```

**Pros:** Simple, fair distribution
**Cons:** Doesn't consider server load
### 2. Weighted Round Robin

```javascript
const servers = [
  { name: 's1', weight: 3 },
  { name: 's2', weight: 2 },
  { name: 's3', weight: 1 }
];
// Distribution: s1, s1, s1, s2, s2, s3, s1, s1, ...
```

**Use Case:** Servers with different capacities
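One simple way to realize the distribution above is to expand each server into `weight` slots and round-robin over the expanded list (a naive sketch; production balancers such as NGINX use a "smooth" variant that interleaves picks):

```typescript
type Weighted = { name: string; weight: number };

// Expand each server into `weight` slots, then round-robin over the slots
function makeWeightedRoundRobin(servers: Weighted[]): () => string {
  const slots: string[] = [];
  for (const s of servers) {
    for (let i = 0; i < s.weight; i++) slots.push(s.name);
  }
  let index = 0;
  return () => {
    const pick = slots[index];
    index = (index + 1) % slots.length;
    return pick;
  };
}

const next = makeWeightedRoundRobin([
  { name: 's1', weight: 3 },
  { name: 's2', weight: 2 },
  { name: 's3', weight: 1 },
]);
// First six picks: s1, s1, s1, s2, s2, s3 — then the cycle repeats
```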
### 3. Least Connections

```javascript
const servers = [
  { name: 's1', connections: 5 },
  { name: 's2', connections: 2 },
  { name: 's3', connections: 8 }
];

function getLeastConnected() {
  // reduce() avoids mutating the array, unlike sort()
  return servers.reduce((min, s) => (s.connections < min.connections ? s : min));
}
```

**Pros:** Accounts for current load
**Use Case:** Requests with varying durations
4. IP Hash
function getServerByIP(clientIP) {
const hash = hashFunction(clientIP);
const index = hash % servers.length;
return servers[index];
}
Pros: Session persistence (sticky sessions) Cons: Uneven distribution with few clients
### 5. Least Response Time

```javascript
const servers = [
  { name: 's1', avgResponseTime: 50 },
  { name: 's2', avgResponseTime: 200 },
  { name: 's3', avgResponseTime: 100 }
];

function getFastestServer() {
  return servers.reduce((min, s) => (s.avgResponseTime < min.avgResponseTime ? s : min));
}
```

**Pros:** Routes to the fastest server
**Cons:** Requires active monitoring
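Keeping `avgResponseTime` current requires the balancer to track latency continuously; one common approach (an illustrative sketch, not tied to any particular product) is an exponentially weighted moving average, where recent samples count more than old ones:

```typescript
// EWMA update: alpha weights the newest sample (0 < alpha <= 1)
function updateAvg(prevAvg: number, sampleMs: number, alpha = 0.2): number {
  return alpha * sampleMs + (1 - alpha) * prevAvg;
}

// A 200 ms spike nudges a 100 ms average upward without dominating it
let avg = 100;
avg = updateAvg(avg, 200); // 0.2 * 200 + 0.8 * 100 = 120
```

Larger `alpha` makes the balancer react faster to slowdowns but also makes it jumpier on noisy latencies.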
## 🔧 Health Checks

### Passive Health Checks

Infer server health from live traffic: if responses fail, mark the server unhealthy.

```javascript
// Detect failures from real responses
if (response.status >= 500 || connectionError) {
  markServerUnhealthy(server);
  retryCount++;
}
```
### Active Health Checks

Probe each server on a fixed interval, independent of live traffic.

```javascript
// Periodically probe servers
async function healthCheck(server) {
  try {
    const response = await fetch(`${server}/health`, {
      signal: AbortSignal.timeout(5000) // fetch has no `timeout` option
    });
    return response.ok;
  } catch {
    return false;
  }
}

// Check every 5 seconds
setInterval(async () => {
  for (const server of servers) {
    const healthy = await healthCheck(server);
    updateServerStatus(server, healthy);
  }
}, 5000);
```
### Health Check Configuration

NGINX marks an upstream server as down after `max_fails` failed requests within `fail_timeout`:

```nginx
upstream backend {
    server 10.0.0.1:80 max_fails=3 fail_timeout=30s;
    server 10.0.0.2:80 max_fails=3 fail_timeout=30s;
    server 10.0.0.3:80 max_fails=3 fail_timeout=30s;
}

# /health endpoint
server {
    location /health {
        access_log off;
        return 200 "healthy\n";
        add_header Content-Type text/plain;
    }
}
```
## 🌐 Session Persistence (Sticky Sessions)

### Why Needed?

- Stateful applications
- WebSocket connections
- In-memory sessions

### Methods
**1. IP-based:**

```javascript
const server = servers[hash(clientIP) % servers.length];
```

**2. Cookie-based:**

```javascript
// Set a cookie on the first request
response.cookies.set('server_id', 's1');

// Route subsequent requests based on the cookie
const serverId = request.cookies.get('server_id');
const server = servers.find(s => s.id === serverId);
```
**3. Consistent Hashing:**

```javascript
function consistentHash(key, servers) {
  const hash = hashFunction(key);
  // Place servers on a ring, ordered by their own hashes
  const sortedServers = servers.map(s => ({
    server: s,
    hash: hashFunction(s.id)
  })).sort((a, b) => a.hash - b.hash);

  // Walk clockwise: first server whose hash exceeds the key's hash
  for (const s of sortedServers) {
    if (s.hash > hash) return s.server;
  }
  return sortedServers[0].server; // Wrap around
}
```
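The payoff of consistent hashing is that adding a server only remaps the keys that land on the new server; every other key keeps its old assignment. A self-contained sketch demonstrating this (the FNV-1a hash and server names are illustrative assumptions):

```typescript
// FNV-1a string hash — any stable hash works here
function fnv1a(s: string): number {
  let h = 2166136261;
  for (let i = 0; i < s.length; i++) {
    h = Math.imul(h ^ s.charCodeAt(i), 16777619) >>> 0;
  }
  return h;
}

// Map a key to the first server clockwise on the hash ring
function lookup(key: string, servers: string[]): string {
  const ring = servers
    .map(id => ({ id, hash: fnv1a(id) }))
    .sort((a, b) => a.hash - b.hash);
  const hit = ring.find(s => s.hash > fnv1a(key));
  return (hit ?? ring[0]).id; // wrap around
}

const keys = Array.from({ length: 100 }, (_, i) => `client-${i}`);
const before = new Map(keys.map(k => [k, lookup(k, ['s1', 's2', 's3'])]));
const after = new Map(keys.map(k => [k, lookup(k, ['s1', 's2', 's3', 's4'])]));

// Every key that moved must have moved onto the new server 's4'
const moved = keys.filter(k => before.get(k) !== after.get(k));
console.log(moved.every(k => after.get(k) === 's4')); // true
```

Real implementations also place each server at many "virtual node" positions on the ring to even out the load.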
## 🔒 SSL Termination

The load balancer decrypts TLS traffic and forwards plain HTTP to the backends.

**Benefits:**

- Reduced backend load
- Centralized certificate management
- Simplified backend configuration

```nginx
http {
    server {
        listen 443 ssl;
        ssl_certificate /path/to/cert.pem;
        ssl_certificate_key /path/to/key.pem;

        location / {
            proxy_pass http://backend;  # plain HTTP to the backend pool
        }
    }
}
```
## 🚦 Traffic Management

### Circuit Breaker

Stops sending traffic to a failing dependency after repeated errors, then probes it again after a cool-down period.

```typescript
class CircuitBreaker {
  private failures = 0;
  private lastFailureTime = 0;
  private state: 'CLOSED' | 'OPEN' | 'HALF_OPEN' = 'CLOSED';

  constructor(private threshold = 5, private timeout = 30000) {}

  async execute<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime > this.timeout) {
        this.state = 'HALF_OPEN'; // allow one probe request through
      } else {
        throw new Error('Circuit breaker is OPEN');
      }
    }
    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess() {
    this.failures = 0;
    this.state = 'CLOSED';
  }

  private onFailure() {
    this.failures++;
    this.lastFailureTime = Date.now();
    if (this.failures >= this.threshold) {
      this.state = 'OPEN';
    }
  }
}
```
### Rate Limiting

The token bucket algorithm allows short bursts up to `capacity` while enforcing a steady `refillRate` over time.

```typescript
// Token bucket
class RateLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(private capacity: number, private refillRate: number) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  allowRequest(): boolean {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens--;
      return true;
    }
    return false;
  }

  private refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000; // seconds since last refill
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsed * this.refillRate
    );
    this.lastRefill = now;
  }
}
```
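The refill arithmetic can be checked in isolation; `tokensAfter` below is a hypothetical helper that mirrors `refill()` as a pure function (elapsed time passed in explicitly, an assumption made for testability):

```typescript
// Tokens available after `elapsedSec` seconds, mirroring refill() above
function tokensAfter(
  tokens: number,
  elapsedSec: number,
  capacity: number,
  refillRate: number
): number {
  return Math.min(capacity, tokens + elapsedSec * refillRate);
}

// capacity 10, refill 5 tokens/s: an empty bucket holds 5 tokens after 1 s,
// and is capped at 10 no matter how long it sits idle
tokensAfter(0, 1, 10, 5);  // → 5
tokensAfter(0, 60, 10, 5); // → 10 (capped at capacity)
```

The cap is what bounds bursts: a client can never accumulate more than `capacity` requests' worth of credit, regardless of how long it was quiet.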