Scalability

📚 Overview

Scalability is the ability of a system to handle increased load by adding resources.

📈 Vertical vs Horizontal Scaling

Vertical Scaling (Scale Up)

Add resources (CPU, RAM, disk) to a single server.

Aspect	Details
Pros	Simple, no code changes
Cons	Limited by hardware, single point of failure
Use Case	Small apps, MVP

Horizontal Scaling (Scale Out)

Add more servers and spread the load across them.

Aspect	Details
Pros	Unlimited (theoretically), fault tolerance
Cons	Complex, needs load balancer
Use Case	Large apps, high availability

🎯 Load Balancer

A load balancer distributes incoming traffic across multiple servers.

Algorithms

Algorithm	Description	Use Case
Round Robin	Each server in turn, in a fixed order	Similar server capacity
Least Connections	Server with fewest connections	Varying request duration
IP Hash	Same IP → Same server	Session persistence
Weighted	Proportional to capacity	Different server specs
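
The four algorithms in the table can be sketched as small selection functions. This is a minimal illustration, not a production balancer: the server pool, its weights, and the connection counts are made-up values, and the string hash is a simple stand-in.

```javascript
// Hypothetical server pool; names, weights, and connection counts are illustrative.
const servers = [
  { name: 'app-1', weight: 3, connections: 12 },
  { name: 'app-2', weight: 1, connections: 4 },
  { name: 'app-3', weight: 1, connections: 9 },
];

// Round robin: hand out servers in order, wrapping around.
function createRoundRobin(pool) {
  let next = 0;
  return () => pool[next++ % pool.length];
}

// Least connections: pick the server with the fewest active connections.
function leastConnections(pool) {
  return pool.reduce((best, s) => (s.connections < best.connections ? s : best));
}

// IP hash: the same client IP always maps to the same server.
function ipHash(pool, ip) {
  let hash = 0;
  for (const ch of ip) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return pool[hash % pool.length];
}

// Weighted round robin: each server appears in the rotation `weight` times.
function createWeighted(pool) {
  const rotation = pool.flatMap((s) => Array(s.weight).fill(s));
  return createRoundRobin(rotation);
}
```

Note that IP hash keeps a client on one server only while the pool size stays the same; adding or removing a server changes the mapping.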

Health Checks

// Liveness probe
GET /health/live
→ 200 OK if server is running

// Readiness probe
GET /health/ready
→ 200 OK if server can handle requests

// Example response body
{
  "status": "healthy",
  "checks": {
    "database": "ok",
    "redis": "ok",
    "external_api": "degraded"
  }
}
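
One way to produce a response like the example above is to aggregate individual dependency checks and let only critical ones fail readiness. This is a sketch under assumptions: the `CRITICAL` set and the `readiness` function are hypothetical names, and treating `external_api` as non-critical is a design choice made here to match the example.

```javascript
// Hypothetical list of dependencies that must be "ok" for the server to take traffic.
const CRITICAL = new Set(['database', 'redis']);

// Build a readiness response from individual check results.
// A degraded non-critical dependency is reported but does not fail the probe.
function readiness(checks) {
  const failing = Object.entries(checks)
    .filter(([name, state]) => CRITICAL.has(name) && state !== 'ok')
    .map(([name]) => name);
  return {
    statusCode: failing.length > 0 ? 503 : 200,
    body: { status: failing.length > 0 ? 'unhealthy' : 'healthy', checks },
  };
}
```

Returning 503 from the readiness probe tells the load balancer to stop routing to this instance without restarting it, which is the job of the liveness probe.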

💾 Database Scaling

Read Replicas

// Writes go to the primary (master)
await masterDb.query('INSERT INTO users ...');

// Reads go to a replica; results may lag slightly behind the primary
await replicaDb.query('SELECT * FROM users ...');
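
The read/write split can be centralized in a small routing function. The sketch below is an assumption-laden illustration: the connection objects are placeholders for real driver connections, `pickConnection` and `readYourWrites` are hypothetical names, and detecting reads with a `SELECT` prefix is a simplification.

```javascript
// Hypothetical connection objects; a real pool would come from a database driver.
const primary = { name: 'primary' };
const replicas = [{ name: 'replica-1' }, { name: 'replica-2' }];

let nextReplica = 0;

// Route a statement: writes go to the primary, reads rotate across replicas.
// `readYourWrites` forces a read onto the primary, which avoids stale results
// caused by replication lag right after a write.
function pickConnection(sql, { readYourWrites = false } = {}) {
  const isRead = /^\s*select\b/i.test(sql);
  if (!isRead || readYourWrites) return primary;
  const replica = replicas[nextReplica % replicas.length];
  nextReplica += 1;
  return replica;
}
```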

Sharding

Sharding Strategies:

Strategy	Pros	Cons
Hash-based	Even distribution	Rebalancing hard
Range-based	Range queries easy	Hot spots
Directory-based	Flexible	Extra lookup
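
The three strategies differ mainly in how a key is mapped to a shard. A minimal sketch, assuming string keys for hashing, numeric user IDs for ranges, and a made-up tenant directory; the boundaries and names are illustrative only.

```javascript
// 32-bit FNV-1a hash over the key's character codes.
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hash-based: spreads keys evenly, but the mapping depends on the shard count.
function hashShard(key, shardCount) {
  return fnv1a(key) % shardCount;
}

// Range-based: hypothetical boundaries on a numeric user ID.
const RANGES = [
  { shard: 0, maxId: 1000000 },
  { shard: 1, maxId: 2000000 },
  { shard: 2, maxId: Infinity },
];
function rangeShard(userId) {
  return RANGES.find((r) => userId <= r.maxId).shard;
}

// Directory-based: an explicit lookup table (hypothetical tenant names).
const DIRECTORY = new Map([['tenant-a', 0], ['tenant-b', 2]]);
function directoryShard(tenant) {
  return DIRECTORY.get(tenant);
}

// Count how many keys change shard when the shard count changes.
function countMoved(keys, before, after) {
  return keys.filter((k) => hashShard(k, before) !== hashShard(k, after)).length;
}
```

Calling `countMoved` with a list of keys and shard counts 4 and 5 shows why rebalancing is hard with plain modulo hashing: most keys map to a different shard after the change. Consistent hashing is the usual way to reduce that movement.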

🗄️ Caching

Caching Strategies

Cache Aside (Lazy Loading):

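async function getUser(id) {
  // 1. Try the cache first
  let user = await cache.get(`user:${id}`);
  if (!user) {
    // 2. Cache miss: load from the database
    user = await db.getUser(id);
    // 3. Populate the cache with a TTL (3600, assumed to be seconds)
    await cache.set(`user:${id}`, user, 3600);
  }
  return user;
}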

Write Through:

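async function updateUser(id, data) {
  // Write to the database and the cache as part of the same operation
  await db.updateUser(id, data);
  // Assumes `data` is the full user record, so reads see the same shape
  await cache.set(`user:${id}`, data, 3600);
}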

Write Back:

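async function updateUser(id, data) {
  // Update the cache immediately
  await cache.set(`user:${id}`, data);
  // Persist to the DB asynchronously; queued writes are lost if the process crashes first
  queue.push({ type: 'update', id, data });
}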

Cache Invalidation

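Strategy	Description
TTL	Entries expire after a fixed time
LRU	Least recently used entries are evicted when the cache is full (an eviction policy rather than invalidation)
Write-through	Cache is updated synchronously on every write
Write-back	Cache is updated first; the write to the DB happens asynchronously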
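
TTL expiration and LRU eviction can be combined in one small in-memory cache. The class below is a sketch, not a replacement for Redis or a library: `TtlLruCache` and its options are hypothetical names, and expired entries are only removed lazily when read.

```javascript
// Minimal in-memory cache with TTL expiration and LRU eviction.
// A Map iterates in insertion order, so its first key is the least recently used.
class TtlLruCache {
  constructor({ maxEntries, ttlMs, now = Date.now }) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which makes expiry testable
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // TTL: drop expired entries lazily on read
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // LRU: evict the oldest key in iteration order.
      this.entries.delete(this.entries.keys().next().value);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Explicit invalidation, for example after the underlying row changes.
  invalidate(key) {
    this.entries.delete(key);
  }
}
```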

📦 CDNs (Content Delivery Networks)

Serve content from edge servers close to users, which cuts latency and offloads the origin.

Use cases:

Static assets (images, CSS, JS)
Video streaming
API responses (cacheable)
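
What a CDN may cache is largely controlled by the `Cache-Control` response header set at the origin. A rough sketch, assuming three response categories and lifetimes chosen only for illustration; `cacheControlFor` is a hypothetical helper.

```javascript
// Pick a Cache-Control header per kind of response.
// The categories and lifetimes are illustrative assumptions, not recommendations.
function cacheControlFor(kind) {
  switch (kind) {
    case 'fingerprinted-asset': // e.g. a JS bundle with a content hash in its name
      return 'public, max-age=31536000, immutable';
    case 'api-response': // short edge cache via s-maxage, browsers always revalidate
      return 'public, max-age=0, s-maxage=60';
    case 'private': // per-user data must not be stored by shared caches
      return 'private, no-store';
    default:
      return 'no-cache';
  }
}
```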

🔄 Asynchronous Processing

Message Queues

Use cases:

Email sending
Image processing
Data export
Notifications
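
The core idea of a message queue (producers enqueue, a worker processes later and retries on failure) fits in a few lines. This in-memory `JobQueue` is a hypothetical stand-in for a real broker and offers no durability.

```javascript
// Minimal in-memory job queue with retries (a stand-in for a real broker).
class JobQueue {
  constructor({ maxAttempts = 3 } = {}) {
    this.maxAttempts = maxAttempts;
    this.jobs = [];
    this.deadLetter = []; // jobs that failed on every attempt
  }

  enqueue(type, payload) {
    this.jobs.push({ type, payload, attempts: 0 });
  }

  // Process jobs until the queue is empty; failed jobs go back to the end.
  async drain(handlers) {
    while (this.jobs.length > 0) {
      const job = this.jobs.shift();
      job.attempts += 1;
      try {
        await handlers[job.type](job.payload);
      } catch (err) {
        if (job.attempts < this.maxAttempts) this.jobs.push(job);
        else this.deadLetter.push(job);
      }
    }
  }
}
```

A request handler would call `enqueue('email', { ... })` and return immediately, while a separate worker calls `drain` with one handler per job type.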

Pub/Sub

Use cases:

Real-time updates
Event-driven architecture
Microservices communication
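
Unlike a queue, where each job goes to one worker, pub/sub delivers every message to all subscribers of a topic. A minimal in-process sketch; `PubSub` and the topic names are hypothetical, and a real system would use a broker between services.

```javascript
// Minimal in-memory pub/sub: every subscriber of a topic receives each message.
class PubSub {
  constructor() {
    this.topics = new Map(); // topic name -> Set of handler functions
  }

  subscribe(topic, handler) {
    if (!this.topics.has(topic)) this.topics.set(topic, new Set());
    this.topics.get(topic).add(handler);
    return () => this.topics.get(topic).delete(handler); // unsubscribe function
  }

  publish(topic, message) {
    const handlers = this.topics.get(topic);
    if (!handlers) return 0;
    for (const handler of handlers) handler(message);
    return handlers.size; // number of subscribers notified
  }
}
```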

📊 Monitoring & Metrics

Key Metrics

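// RED Method (for request-driven services)
Rate: Requests per second
Errors: Failed requests per second
Duration: Distribution of response times

// USE Method (for resources such as CPU, memory, disk)
Utilization: % of time the resource is busy
Saturation: Amount of queued work the resource cannot serve yet
Errors: Count of error events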
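
The RED metrics can be derived from raw request records. A sketch, assuming each record has the shape `{ status, durationMs }` and that only 5xx responses count as errors; both are assumptions made for this example.

```javascript
// Nearest-rank percentile over an array sorted in ascending order.
function percentile(sorted, p) {
  if (sorted.length === 0) return 0;
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Compute RED metrics from request records collected over a time window.
function redMetrics(requests, windowSeconds) {
  const errors = requests.filter((r) => r.status >= 500).length;
  const durations = requests.map((r) => r.durationMs).sort((a, b) => a - b);
  return {
    rate: requests.length / windowSeconds, // requests per second
    errorRatio: requests.length ? errors / requests.length : 0,
    p95Ms: percentile(durations, 95), // duration reported as a percentile, not an average
  };
}
```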

Alerting

// Alert rules
- Error rate > 1% for 5 min
- P95 latency > 500ms for 5 min
- CPU > 80% for 10 min
- Memory > 90% for 5 min
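
The "for N min" part of each rule means the condition must hold continuously for that long before the alert fires, which filters out short spikes. A sketch of that evaluation, assuming samples of the shape `{ t, value }` with `t` in seconds and in ascending order; `isFiring` is a hypothetical helper.

```javascript
// Evaluate a rule such as "error rate > 1% for 5 min" over timestamped samples.
// Returns true only if the threshold has been exceeded continuously for
// at least `forSeconds` up to the latest sample.
function isFiring(samples, { threshold, forSeconds }) {
  let breachStart = null;
  let lastT = null;
  for (const { t, value } of samples) {
    if (value > threshold) {
      if (breachStart === null) breachStart = t;
    } else {
      breachStart = null; // any recovery resets the timer
    }
    lastT = t;
  }
  return breachStart !== null && lastT - breachStart >= forSeconds;
}
```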

🚀 Microservices vs Monolith

Aspect	Monolith	Microservices
Development	Simpler to build and debug	More complex (distributed system concerns)
Deployment	All or nothing	Independent
Scaling	Scale entire app	Scale specific service
Communication	In-process	Network
Data	Single database	Per-service DB

When to split:

Different scalability needs
Different teams
Different deployment cycles
Isolated failure domains

❓ Interview Questions

Easy

Design a URL shortener
Design a rate limiter
Design a unique ID generator

Medium

Design a chat system
Design a news feed
Design a file storage system

Hard

Design YouTube
Design Google Search
Design a distributed database
