Scalability

📚 Overview

Scalability is the ability of a system to handle increased load by adding resources.

📈 Vertical vs Horizontal Scaling

Vertical Scaling (Scale Up)

Add resources (CPU, RAM, storage) to a single server.

| Aspect | Details |
| --- | --- |
| Pros | Simple, no code changes |
| Cons | Limited by hardware, single point of failure |
| Use Case | Small apps, MVP |

Horizontal Scaling (Scale Out)

Add more servers and spread the load across them.

| Aspect | Details |
| --- | --- |
| Pros | Unlimited (theoretically), fault tolerance |
| Cons | Complex, needs a load balancer |
| Use Case | Large apps, high availability |

🎯 Load Balancer

A load balancer distributes incoming traffic across multiple servers.

Algorithms

| Algorithm | Description | Use Case |
| --- | --- | --- |
| Round Robin | Requests go to each server in turn | Similar server capacity |
| Least Connections | Server with the fewest active connections | Varying request duration |
| IP Hash | Same IP → same server | Session persistence |
| Weighted | Traffic proportional to server capacity | Different server specs |
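
The four algorithms in the table can be sketched in a few lines each. This is a minimal illustration, not how any particular load balancer implements them: the server pool, its weights, and every function name below are assumptions made for the example.

```javascript
// Hypothetical server pool; names, weights, and connection counts are made up.
const servers = [
  { name: 'app-1', weight: 1, activeConnections: 0 },
  { name: 'app-2', weight: 2, activeConnections: 0 },
  { name: 'app-3', weight: 1, activeConnections: 0 },
];

// Round robin: hand out servers in order, wrapping around at the end.
function createRoundRobin(pool) {
  let next = 0;
  return () => {
    const server = pool[next];
    next = (next + 1) % pool.length;
    return server;
  };
}

// Least connections: pick the server with the fewest in-flight requests.
function pickLeastConnections(pool) {
  return pool.reduce((best, s) =>
    (s.activeConnections < best.activeConnections ? s : best));
}

// IP hash: the same client IP maps to the same server
// as long as the pool itself does not change.
function pickByIpHash(pool, ip) {
  let hash = 0;
  for (const ch of ip) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return pool[hash % pool.length];
}

// Weighted: repeat each server `weight` times, then round-robin over the result.
function createWeightedRoundRobin(pool) {
  const expanded = pool.flatMap((s) => Array(s.weight).fill(s));
  return createRoundRobin(expanded);
}
```

With the pool above, the weighted picker returns `app-2` twice in every cycle of four picks, since its weight is double that of the other two servers.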

Health Checks

// Liveness probe
GET /health/live
→ 200 OK if server is running

// Readiness probe
GET /health/ready
→ 200 OK if server can handle requests

// Example response body
{
  "status": "healthy",
  "checks": {
    "database": "ok",
    "redis": "ok",
    "external_api": "degraded"
  }
}
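
A readiness response like the one above could be assembled from per-dependency checks. The sketch below is one possible shape, under stated assumptions: the function name `buildReadinessReport`, the `'down'` status value, and the policy that a `'degraded'` dependency still counts as ready are all choices made for this example, not something the probes themselves prescribe.

```javascript
// Hypothetical dependency checks: each is a function returning
// 'ok', 'degraded', or 'down'. A real service would ping the dependency.
function buildReadinessReport(checks) {
  const results = {};
  for (const [name, check] of Object.entries(checks)) {
    try {
      results[name] = check();
    } catch (err) {
      results[name] = 'down'; // a throwing check counts as a failed dependency
    }
  }
  // Assumed policy: only a 'down' dependency makes the server not ready.
  const ready = Object.values(results).every((r) => r !== 'down');
  return {
    httpStatus: ready ? 200 : 503,
    body: { status: ready ? 'healthy' : 'unhealthy', checks: results },
  };
}

const report = buildReadinessReport({
  database: () => 'ok',
  redis: () => 'ok',
  external_api: () => 'degraded',
});
console.log(report.httpStatus, JSON.stringify(report.body));
```

Returning 503 on failure lets the load balancer stop routing traffic to the instance without restarting it, which is the usual distinction between readiness and liveness.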

💾 Database Scaling

Read Replicas

// Writes go to the master
await masterDb.query('INSERT INTO users ...');

// Reads go to replicas, which may lag slightly behind the master
await replicaDb.query('SELECT * FROM users ...');
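
The read/write split can be hidden behind a small router so callers do not pick a connection themselves. This is a minimal sketch, assuming `primary` and `replicas` are clients exposing a `query(sql)` method; `createDbRouter` and the `forcePrimary` option are hypothetical names introduced here.

```javascript
// Send writes to the primary and rotate reads across replicas.
// `primary` and each replica are hypothetical clients exposing query(sql).
function createDbRouter(primary, replicas) {
  let next = 0;
  const isRead = (sql) => /^\s*select\b/i.test(sql);

  return {
    query(sql, { forcePrimary = false } = {}) {
      // Writes, forced reads, and the no-replica case all go to the primary.
      if (forcePrimary || !isRead(sql) || replicas.length === 0) {
        return primary.query(sql);
      }
      const replica = replicas[next];
      next = (next + 1) % replicas.length;
      return replica.query(sql);
    },
  };
}
```

The `forcePrimary` escape hatch is there for reads that must see a write that just happened, since a replica may not have applied it yet.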

Sharding

Sharding Strategies:

| Strategy | Pros | Cons |
| --- | --- | --- |
| Hash-based | Even distribution | Rebalancing hard |
| Range-based | Range queries easy | Hot spots |
| Directory-based | Flexible | Extra lookup |
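
To see why hash-based sharding spreads keys evenly but is hard to rebalance, the sketch below maps keys to shards with `hash(key) % shardCount` and counts how many keys land on a different shard when one shard is added. The hash is a 32-bit FNV-1a variant applied to UTF-16 code units; the key format and function names are assumptions for illustration.

```javascript
// Stable 32-bit string hash (FNV-1a over UTF-16 code units).
// Any deterministic hash would serve the same purpose here.
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply, keep unsigned 32-bit
  }
  return hash;
}

// Hash-based sharding: the shard index depends on the total shard count.
function shardFor(key, shardCount) {
  return fnv1a(key) % shardCount;
}

// Count how many keys map to a different shard after resizing.
function movedKeys(keys, oldCount, newCount) {
  return keys.filter((k) => shardFor(k, oldCount) !== shardFor(k, newCount)).length;
}

const sampleKeys = Array.from({ length: 1000 }, (_, i) => `user:${i}`);
console.log(`${movedKeys(sampleKeys, 4, 5)} of ${sampleKeys.length} keys move when going from 4 to 5 shards`);
```

With plain modulo placement, most keys change shard on a resize, which is the rebalancing cost the table refers to. Consistent hashing is the usual way to reduce that movement.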

🗄️ Caching

Caching Strategies

Cache Aside (Lazy Loading):

async function getUser(id) {
  // Try the cache first
  let user = await cache.get(`user:${id}`);
  if (!user) {
    // Cache miss: load from the database and cache the result for one hour
    user = await db.getUser(id);
    await cache.set(`user:${id}`, user, 3600);
  }
  return user;
}

Write Through:

async function updateUser(id, data) {
  // Write to the database and the cache in the same request
  await db.updateUser(id, data);
  await cache.set(`user:${id}`, data);
}

Write Back:

async function updateUser(id, data) {
  // Write to the cache immediately
  await cache.set(`user:${id}`, data);
  // Persist to the DB later via a background queue
  queue.push({ type: 'update', id, data });
}
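
The deferred half of write-back, flushing queued changes to the database, could look like the sketch below. It replaces the queue with a set of dirty ids so repeated updates to one record collapse into a single write. `createWriteBackCache`, `flush`, and the `db.updateUser` call are hypothetical names assumed for this example.

```javascript
// Write-back sketch: writes land in memory first and are flushed to the
// database later. `db.updateUser(id, data)` is a hypothetical async call.
function createWriteBackCache(db) {
  const cache = new Map();
  const dirty = new Set(); // ids changed since the last flush

  return {
    updateUser(id, data) {
      cache.set(id, data);
      dirty.add(id); // repeated updates to one id become one DB write
    },
    getUser(id) {
      return cache.get(id);
    },
    // Persist every dirty record; returns how many were written.
    async flush() {
      const ids = [...dirty];
      dirty.clear();
      let written = 0;
      for (const id of ids) {
        try {
          await db.updateUser(id, cache.get(id));
          written += 1;
        } catch (err) {
          dirty.add(id); // keep the record dirty so the next flush retries it
        }
      }
      return written;
    },
  };
}
```

The trade-off is durability: anything still marked dirty is lost if the process dies before the next flush.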

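Cache Invalidation

| Strategy | Description |
| --- | --- |
| TTL | Time-based expiration |
| LRU | Evicts the least recently used entries when the cache is full |
| Write-through | Updates the cache together with the DB on every write |
| Write-back | Writes to the cache first, then to the DB asynchronously |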
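
TTL expiration and LRU eviction are often combined in one cache. Below is a minimal in-memory sketch, assuming a `set(key, value, ttlSeconds)` signature similar to the snippets above; the class name `TtlLruCache` and the injectable clock are illustrative choices.

```javascript
// In-memory cache with TTL expiration and LRU eviction.
// A Map iterates in insertion order, so its first key is the least recently used.
class TtlLruCache {
  constructor(maxEntries, now = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.now = now; // injectable clock, which makes expiry easy to test
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // TTL: expired entries are dropped lazily on read
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value, ttlSeconds) {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // LRU: evict the oldest key in insertion order.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}
```

Expired entries are only removed when they are read, so a long-lived process would also want a periodic sweep; that is left out to keep the sketch short.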

📦 CDNs (Content Delivery Networks)

A CDN serves content from edge servers close to users, which reduces latency and load on the origin.

Use cases:

  • Static assets (images, CSS, JS)
  • Video streaming
  • API responses (cacheable)

🔄 Asynchronous Processing

Message Queues

Use cases:

  • Email sending
  • Image processing
  • Data export
  • Notifications
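
The use cases above share one pattern: the request handler enqueues a job and returns, and a worker processes it later. The sketch below shows that pattern with an in-memory queue and simple retries. A production system would use a message broker instead; `JobQueue`, `drain`, and the retry limit are assumptions for this example.

```javascript
// Minimal in-memory job queue with retries. Everything here is illustrative.
class JobQueue {
  constructor(handler, maxAttempts = 3) {
    this.handler = handler; // async function that processes one payload
    this.maxAttempts = maxAttempts;
    this.jobs = [];
    this.failed = []; // jobs that exhausted their retries
  }

  // Called by the producer, for example an HTTP request handler.
  enqueue(payload) {
    this.jobs.push({ payload, attempts: 0 });
  }

  // Called by the worker: process jobs in FIFO order until the queue is empty.
  async drain() {
    let done = 0;
    while (this.jobs.length > 0) {
      const job = this.jobs.shift();
      job.attempts += 1;
      try {
        await this.handler(job.payload);
        done += 1;
      } catch (err) {
        if (job.attempts < this.maxAttempts) {
          this.jobs.push(job); // put it back for another attempt
        } else {
          this.failed.push(job); // give up and keep it for inspection
        }
      }
    }
    return done;
  }
}
```

Keeping exhausted jobs in a separate `failed` list plays the role of a dead-letter queue: nothing is silently dropped.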

Pub/Sub

Use cases:

  • Real-time updates
  • Event-driven architecture
  • Microservices communication
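
Unlike a queue, where each job goes to one worker, pub/sub delivers every message to all subscribers of a topic. A minimal in-process sketch follows; the `PubSub` class and the topic name are hypothetical, and a real deployment would use a broker so that publishers and subscribers can live in different services.

```javascript
// Minimal in-process pub/sub broker.
class PubSub {
  constructor() {
    this.subscribers = new Map(); // topic -> Set of callbacks
  }

  // Register a callback and return a function that removes it again.
  subscribe(topic, callback) {
    if (!this.subscribers.has(topic)) {
      this.subscribers.set(topic, new Set());
    }
    this.subscribers.get(topic).add(callback);
    return () => this.subscribers.get(topic).delete(callback);
  }

  // Deliver the message to every subscriber; returns how many were notified.
  publish(topic, message) {
    const callbacks = this.subscribers.get(topic);
    if (!callbacks) return 0;
    for (const callback of callbacks) {
      callback(message);
    }
    return callbacks.size;
  }
}

const bus = new PubSub();
const stopEmail = bus.subscribe('order.created', (order) => console.log('send email for', order.id));
bus.subscribe('order.created', (order) => console.log('update dashboard for', order.id));
bus.publish('order.created', { id: 42 });
```

The publisher does not know who is listening, which is what lets new consumers be added without touching the code that emits the event.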

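📊 Monitoring & Metrics

Key Metrics

// RED Method (for request-driven services)
Rate: Requests per second
Errors: Failed requests per second
Duration: Response time distribution (e.g. P50, P95, P99)

// USE Method (for resources such as CPU, memory, disk, network)
Utilization: % of capacity used
Saturation: Amount of queued work the resource cannot yet service
Errors: Count of error events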
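
The RED metrics can be computed from raw request samples. The sketch below is one way to do it, assuming each sample has the shape `{ durationMs, ok }` and that percentiles use the nearest-rank method; both are choices made for this example.

```javascript
// Nearest-rank percentile: the smallest value with at least p% of the
// samples at or below it. Expects an ascending-sorted array.
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return 0;
  const rank = Math.ceil((p * sortedValues.length) / 100);
  return sortedValues[Math.max(rank, 1) - 1];
}

// Compute RED metrics for samples collected over `windowSeconds`.
function redMetrics(samples, windowSeconds) {
  const durations = samples.map((s) => s.durationMs).sort((a, b) => a - b);
  const failures = samples.filter((s) => !s.ok).length;
  return {
    rate: samples.length / windowSeconds, // requests per second
    errorRate: samples.length ? failures / samples.length : 0, // fraction failed
    p50: percentile(durations, 50), // median response time in ms
    p95: percentile(durations, 95), // 95th percentile response time in ms
  };
}
```

Percentiles are used for duration rather than the mean because a few slow requests can hide behind a healthy-looking average.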

Alerting

// Alert rules
- Error rate > 1% for 5 min
- P95 latency > 500ms for 5 min
- CPU > 80% for 10 min
- Memory > 90% for 5 min
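
The "for N min" part of each rule means the condition must hold continuously, so a brief spike does not page anyone. A minimal sketch of that evaluation follows, assuming samples of the shape `{ timestamp, value }` in milliseconds, ordered oldest to newest; the rule object and `shouldFire` are hypothetical names.

```javascript
// Evaluate a "value > threshold for N minutes" rule.
// `samples` is ordered oldest to newest: [{ timestamp: ms, value: number }, ...]
function shouldFire(rule, samples, nowMs) {
  let breachStart = null; // when the current unbroken run of breaches began
  for (const sample of samples) {
    if (sample.value > rule.threshold) {
      if (breachStart === null) breachStart = sample.timestamp;
    } else {
      breachStart = null; // one healthy sample resets the timer
    }
  }
  if (breachStart === null) return false;
  return nowMs - breachStart >= rule.forMinutes * 60 * 1000;
}

// Hypothetical rule mirroring "Error rate > 1% for 5 min".
const errorRateRule = { threshold: 0.01, forMinutes: 5 };
```

Monitoring systems typically implement the same idea with a pending state that turns into a firing alert once the duration has elapsed.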

🚀 Microservices vs Monolith

| Aspect | Monolith | Microservices |
| --- | --- | --- |
| Development | Simple | Complex |
| Deployment | All or nothing | Independent |
| Scaling | Scale entire app | Scale specific service |
| Communication | In-process | Network |
| Data | Single database | Per-service DB |

When to split:

  • Different scalability needs
  • Different teams
  • Different deployment cycles
  • Isolated failure domains

❓ Interview Questions

Easy

  1. Design a URL shortener
  2. Design a rate limiter
  3. Design a unique ID generator

Medium

  1. Design a chat system
  2. Design a news feed
  3. Design a file storage system

Hard

  1. Design YouTube
  2. Design Google Search
  3. Design a distributed database