Scalability

📚 Overview

Scalability is the ability of a system to handle increased load by adding resources.

📈 Vertical vs Horizontal Scaling

Vertical Scaling (Scale Up)

Add resources to a single server:

| Aspect | Details |
| --- | --- |
| Pros | Simple, no code changes |
| Cons | Limited by hardware, single point of failure |
| Use Case | Small apps, MVP |

Horizontal Scaling (Scale Out)

Add more servers:

| Aspect | Details |
| --- | --- |
| Pros | Unlimited (theoretically), fault tolerance |
| Cons | Complex, needs load balancer |
| Use Case | Large apps, high availability |

🎯 Load Balancer

Distribute traffic across multiple servers.

Algorithms

| Algorithm | Description | Use Case |
| --- | --- | --- |
| Round Robin | Sequential | Similar server capacity |
| Least Connections | Server with fewest connections | Varying request duration |
| IP Hash | Same IP → Same server | Session persistence |
| Weighted | Proportional to capacity | Different server specs |
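
The four algorithms above can be sketched in a few lines each. This is a minimal in-memory illustration, not how any particular load balancer implements them; the `servers` pool, its field names, and all function names are assumptions made for the example.

```javascript
// Hypothetical backend pool; names, weights, and counters are illustrative only.
const servers = [
  { name: 'app-1', weight: 3, connections: 0 },
  { name: 'app-2', weight: 1, connections: 0 },
];

// Round robin: hand out servers in order, wrapping around at the end.
function createRoundRobin(pool) {
  let next = 0;
  return () => {
    const server = pool[next];
    next = (next + 1) % pool.length;
    return server;
  };
}

// Least connections: pick the server with the fewest in-flight requests.
function pickLeastConnections(pool) {
  return pool.reduce((best, s) => (s.connections < best.connections ? s : best));
}

// IP hash: the same client IP maps to the same server
// for as long as the pool itself does not change.
function pickByIpHash(pool, ip) {
  let hash = 0;
  for (const ch of ip) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit value
  }
  return pool[hash % pool.length];
}

// Weighted round robin (naive): repeat each server `weight` times in the rotation.
function createWeightedRoundRobin(pool) {
  const expanded = pool.flatMap((s) => Array(s.weight).fill(s));
  return createRoundRobin(expanded);
}
```

The naive weighted variant sends bursts to the heaviest server; production balancers typically interleave the picks, which is left out here to keep the sketch short.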

Health Checks

// Liveness probe
GET /health/live
200 OK if server is running

// Readiness probe
GET /health/ready
200 OK if server can handle requests

// Example
{
  "status": "healthy",
  "checks": {
    "database": "ok",
    "redis": "ok",
    "external_api": "degraded"
  }
}
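
One way to produce a readiness response like the example is to separate critical dependencies from optional ones. The sketch below is an assumption about how that decision could be made: `buildReadiness`, the `critical` list, and the `'ok'` / `'degraded'` / `'down'` vocabulary are hypothetical, not part of any framework.

```javascript
// Build a readiness response from dependency check results.
// `checks` maps a dependency name to a state string ('ok', 'degraded', 'down'; assumed vocabulary).
// `critical` lists the dependencies the server cannot serve traffic without (assumed default).
function buildReadiness(checks, critical = ['database', 'redis']) {
  // Ready only if every critical dependency reports 'ok'.
  const ready = critical.every((name) => checks[name] === 'ok');
  return {
    httpStatus: ready ? 200 : 503, // 503 tells the load balancer to stop routing here
    body: { status: ready ? 'healthy' : 'unhealthy', checks },
  };
}
```

With this policy a degraded `external_api` still yields `200`, matching the example above, while a failing database yields `503` so the instance is taken out of rotation.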

💾 Database Scaling

Read Replicas

// Write to master
await masterDb.query('INSERT INTO users ...');

// Read from replicas
await replicaDb.query('SELECT * FROM users ...');
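
The read/write split can be hidden behind a small router. This is a rough sketch under the assumption that each connection is an object exposing `query(sql)`; `createDbRouter`, `pick`, and the `forcePrimary` option are hypothetical names introduced for illustration.

```javascript
// `primary` and `replicas` are any objects exposing query(sql) (hypothetical clients).
function createDbRouter(primary, replicas) {
  let next = 0;
  return {
    // Reads rotate across replicas; everything else goes to the primary.
    pick(sql, { forcePrimary = false } = {}) {
      const isRead = /^\s*select\b/i.test(sql);
      if (!isRead || forcePrimary || replicas.length === 0) return primary;
      const replica = replicas[next];
      next = (next + 1) % replicas.length;
      return replica;
    },
    // Run the statement on whichever connection pick() selects.
    query(sql, options) {
      return this.pick(sql, options).query(sql);
    },
  };
}
```

Replicas usually lag slightly behind the primary, so `forcePrimary` is there for reads that must see a write that was just made.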

Sharding

Sharding Strategies:

| Strategy | Pros | Cons |
| --- | --- | --- |
| Hash-based | Even distribution | Rebalancing hard |
| Range-based | Range queries easy | Hot spots |
| Directory-based | Flexible | Extra lookup |
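
Each strategy boils down to a different way of mapping a key to a shard. The functions below are a minimal sketch; the function names, the shard labels, and the choice of FNV-1a as the hash are assumptions for illustration.

```javascript
// FNV-1a 32-bit hash, applied to UTF-16 code units (equivalent to bytes for ASCII keys).
function fnv1a(key) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // 32-bit multiply by the FNV prime
  }
  return hash >>> 0; // force an unsigned result
}

// Hash-based: spreads keys evenly, but changing shardCount remaps most keys.
function hashShard(key, shardCount) {
  return fnv1a(String(key)) % shardCount;
}

// Range-based: `ranges` is sorted by inclusive upper bound `max`.
function rangeShard(id, ranges) {
  const match = ranges.find((r) => id <= r.max);
  if (!match) throw new RangeError(`no shard covers id ${id}`);
  return match.shard;
}

// Directory-based: an explicit lookup table, with a fallback for unmapped keys.
function directoryShard(key, directory, fallback) {
  return directory.get(key) ?? fallback;
}
```

The modulo in `hashShard` is what makes rebalancing hard: going from 4 to 5 shards moves most keys. Consistent hashing is the usual remedy and is not shown here.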

🗄️ Caching

Caching Strategies

Cache Aside (Lazy Loading):

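async function getUser(id) {
  // 1. Try the cache first
  let user = await cache.get(`user:${id}`);
  if (!user) {
    // 2. Cache miss: load from the database
    user = await db.getUser(id);
    // 3. Populate the cache (TTL of 3600, assumed to be in seconds)
    await cache.set(`user:${id}`, user, 3600);
  }
  return user;
}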

Write Through:

async function updateUser(id, data) {
  // Write to the database, then update the cache before returning
  await db.updateUser(id, data);
  await cache.set(`user:${id}`, data);
}

Write Back:

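async function updateUser(id, data) {
  // Write to the cache only; the caller does not wait for the DB
  await cache.set(`user:${id}`, data);
  // A background worker drains the queue and writes to the DB later
  queue.push({ type: 'update', id, data });
}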

Cache Invalidation

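| Strategy | Description |
| --- | --- |
| TTL | Entries expire after a fixed time |
| LRU | Evict the least recently used entry when the cache is full |
| Write-through | Update the cache on every write |
| Write-back | Write to the cache first; persist to the DB asynchronously |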
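
TTL expiry and LRU eviction are often combined in one cache. The class below is a small in-memory sketch that relies on `Map` preserving insertion order; `LruTtlCache` and the injectable `now` clock are hypothetical names, and a real deployment would more likely use Redis or a maintained library.

```javascript
class LruTtlCache {
  // `now` is injectable so expiry can be tested without waiting (assumed design choice).
  constructor(maxEntries, now = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.now = now;
    this.entries = new Map(); // insertion order: first key = least recently used
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // TTL: expired entries are dropped on access
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value, ttlSeconds) {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // LRU: evict the oldest key in insertion order.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}
```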

📦 CDNs (Content Delivery Networks)

Serve static content from edge servers close to users, which reduces latency and offloads traffic from the origin.

Use cases:

  • Static assets (images, CSS, JS)
  • Video streaming
  • API responses (cacheable)
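
What a CDN may cache is usually controlled by the origin through the `Cache-Control` response header. The function below sketches one possible policy for the use cases listed; the path patterns and lifetimes are assumptions to adapt per application, not recommended values.

```javascript
// Pick a Cache-Control header value for a request path (assumed policy).
function cacheControlFor(path) {
  // Fingerprinted assets (e.g. app.3f2a1c9b.js) never change, so cache them for a year.
  if (/\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2)$/.test(path)) {
    return 'public, max-age=31536000, immutable';
  }
  // Cacheable API responses: short lifetime at the CDN (s-maxage), none in the browser.
  if (path.startsWith('/api/')) {
    return 'public, max-age=0, s-maxage=60';
  }
  // Everything else: caches must revalidate with the origin before reuse.
  return 'no-cache';
}
```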

🔄 Asynchronous Processing

Message Queues

Use cases:

  • Email sending
  • Image processing
  • Data export
  • Notifications
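
The core idea is that the request handler only enqueues a job and a worker processes it later, retrying on failure. The in-memory class below is a sketch of that flow; `JobQueue`, `drain`, and the dead-letter list are hypothetical names, and a real system would use a durable broker instead.

```javascript
class JobQueue {
  constructor(maxAttempts = 3) {
    this.maxAttempts = maxAttempts;
    this.jobs = [];
    this.deadLetter = []; // jobs that failed on every attempt
  }

  // Called by the request path: cheap, returns immediately.
  enqueue(type, payload) {
    this.jobs.push({ type, payload, attempts: 0 });
  }

  // Called by a worker: process everything queued, retrying failed jobs.
  drain(handlers) {
    while (this.jobs.length > 0) {
      const job = this.jobs.shift();
      try {
        handlers[job.type](job.payload);
      } catch (err) {
        job.attempts += 1;
        if (job.attempts < this.maxAttempts) {
          this.jobs.push(job); // retry later, behind the other jobs
        } else {
          this.deadLetter.push(job); // give up and keep it for inspection
        }
      }
    }
  }
}
```

Because a job can run more than once, handlers should be idempotent, for example by checking whether the email was already sent.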

Pub/Sub

Use cases:

  • Real-time updates
  • Event-driven architecture
  • Microservices communication
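
Unlike a queue, where each job goes to one worker, pub/sub delivers every message to all subscribers of a topic. A minimal in-process sketch, with `PubSub` and its method names chosen for illustration:

```javascript
class PubSub {
  constructor() {
    this.topics = new Map(); // topic name -> Set of handler functions
  }

  // Register a handler; returns a function that removes it again.
  subscribe(topic, handler) {
    if (!this.topics.has(topic)) this.topics.set(topic, new Set());
    this.topics.get(topic).add(handler);
    return () => this.topics.get(topic).delete(handler);
  }

  // Deliver the message to every current subscriber; returns how many received it.
  publish(topic, message) {
    const handlers = this.topics.get(topic);
    if (!handlers) return 0;
    const snapshot = [...handlers]; // copy so handlers can unsubscribe while we iterate
    for (const handler of snapshot) handler(message);
    return snapshot.length;
  }
}
```

The publisher does not know who is listening, which is what lets services be added or removed without changing the sender.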

📊 Monitoring & Metrics

Key Metrics

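// RED Method (request-driven services)
Rate: Requests per second
Errors: Failed requests per second
Duration: Response time distribution (e.g., P95)

// USE Method (resources such as CPU, memory, disk, network)
Utilization: % of capacity in use
Saturation: Queued work the resource cannot service yet
Errors: Count of error events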
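
The RED numbers can be derived from a window of request samples. The sketch below assumes each sample has the shape `{ durationMs, status }` and counts only 5xx responses as errors; both are assumptions, and in practice a metrics library would compute these from histograms.

```javascript
// Nearest-rank percentile over an ascending-sorted array.
function percentile(sortedValues, p) {
  if (sortedValues.length === 0) return 0;
  const rank = Math.ceil((p * sortedValues.length) / 100);
  return sortedValues[Math.max(rank, 1) - 1];
}

// samples: [{ durationMs, status }] collected over `windowSeconds` (assumed shape).
function redMetrics(samples, windowSeconds) {
  const durations = samples.map((s) => s.durationMs).sort((a, b) => a - b);
  const errors = samples.filter((s) => s.status >= 500).length;
  return {
    ratePerSecond: samples.length / windowSeconds, // Rate
    errorRate: samples.length === 0 ? 0 : errors / samples.length, // Errors
    p95Ms: percentile(durations, 95), // Duration
  };
}
```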

Alerting

// Alert rules
- Error rate > 1% for 5 min
- P95 latency > 500ms for 5 min
- CPU > 80% for 10 min
- Memory > 90% for 5 min
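
The "for N min" part of each rule means the condition must hold continuously, which avoids alerting on a single spike. A rough sketch of that check, assuming one data point per scrape in the form `{ t, value }` with `t` in seconds; `shouldFire` is a hypothetical helper, not an API of any monitoring tool.

```javascript
// points: [{ t, value }] sorted by time, t in seconds (assumed shape).
// Fires only if the value has stayed above `threshold` for at least `forSeconds`.
function shouldFire(points, threshold, forSeconds, now) {
  let breachStart = null; // start time of the current unbroken breach, if any
  for (const p of points) {
    if (p.value > threshold) {
      if (breachStart === null) breachStart = p.t;
    } else {
      breachStart = null; // one healthy point resets the timer
    }
  }
  return breachStart !== null && now - breachStart >= forSeconds;
}
```

For example, "Error rate > 1% for 5 min" becomes `shouldFire(points, 0.01, 300, now)`.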

🚀 Microservices vs Monolith

| Aspect | Monolith | Microservices |
| --- | --- | --- |
| Development | Simple | Complex |
| Deployment | All or nothing | Independent |
| Scaling | Scale entire app | Scale specific service |
| Communication | In-process | Network |
| Data | Single database | Per-service DB |

When to split:

  • Different scalability needs
  • Different teams
  • Different deployment cycles
  • Isolated failure domains

❓ Interview Questions

Easy

  1. Design a URL shortener
  2. Design a rate limiter
  3. Design a unique ID generator

Medium

  1. Design a chat system
  2. Design a news feed
  3. Design a file storage system

Hard

  1. Design YouTube
  2. Design Google Search
  3. Design a distributed database