Scalability
Overview
Scalability is the ability of a system to handle increased load by adding resources.
Vertical vs Horizontal Scaling
Vertical Scaling (Scale Up)
Add more resources (CPU, RAM, storage) to a single server:
| Aspect | Details |
|---|---|
| Pros | Simple, no code changes |
| Cons | Limited by hardware, single point of failure |
| Use Case | Small apps, MVP |
Horizontal Scaling (Scale Out)
Add more servers and spread the load across them:
| Aspect | Details |
|---|---|
| Pros | Unlimited (theoretically), fault tolerance |
| Cons | Complex, needs load balancer |
| Use Case | Large apps, high availability |
Load Balancer
A load balancer distributes incoming traffic across multiple servers.
Algorithms
| Algorithm | Description | Use Case |
|---|---|---|
| Round Robin | Each server in turn, in a fixed order | Similar server capacity |
| Least Connections | Server with fewest connections | Varying request duration |
| IP Hash | Same IP → Same server | Session persistence |
| Weighted | Proportional to capacity | Different server specs |
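The four algorithms in the table can be sketched as small selection functions. This is a minimal illustration, not how any particular load balancer implements them: the server pool, its weights, and the function names are all hypothetical, and the weighted variant uses the naive approach of repeating each server `weight` times.

```javascript
// Hypothetical backend pool; names, weights, and connection counts are made up.
const servers = [
  { name: 'app-1', weight: 3, activeConnections: 0 },
  { name: 'app-2', weight: 1, activeConnections: 0 },
  { name: 'app-3', weight: 1, activeConnections: 0 },
];

// Round robin: hand out servers in order, wrapping around at the end.
function createRoundRobin(pool) {
  let next = 0;
  return () => {
    const server = pool[next];
    next = (next + 1) % pool.length;
    return server;
  };
}

// Least connections: pick the server with the fewest in-flight requests.
function pickLeastConnections(pool) {
  return pool.reduce((best, s) =>
    s.activeConnections < best.activeConnections ? s : best
  );
}

// IP hash: the same client IP always maps to the same server,
// as long as the pool itself does not change.
function pickByIpHash(pool, ip) {
  let hash = 0;
  for (const ch of ip) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep an unsigned 32-bit value
  }
  return pool[hash % pool.length];
}

// Weighted round robin (naive): repeat each server `weight` times, then rotate.
function createWeightedRoundRobin(pool) {
  const expanded = pool.flatMap((s) => Array(s.weight).fill(s));
  return createRoundRobin(expanded);
}
```

Note that IP hash only keeps a client on the same server while the pool size is stable; adding or removing a server changes the modulo and remaps many clients.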
Health Checks
// Liveness probe
GET /health/live
→ 200 OK if the server process is running
// Readiness probe
GET /health/ready
→ 200 OK if the server can handle requests
// Example response body
{
  "status": "healthy",
  "checks": {
    "database": "ok",
    "redis": "ok",
    "external_api": "degraded"
  }
}
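A possible handler for the two probes is sketched below. The function name `handleHealth`, the shape of `checks`, and the policy that a `degraded` dependency still counts as ready while a failing one returns 503 are assumptions made for illustration; the response body follows the example above.

```javascript
// Build a response for the two probe paths.
// `checks` maps a dependency name to an async function that resolves to
// 'ok' or 'degraded'. A thrown error is recorded as 'down' (assumed policy).
async function handleHealth(path, checks) {
  if (path === '/health/live') {
    // Liveness: the process is up and able to answer at all.
    return { statusCode: 200, body: { status: 'alive' } };
  }
  if (path === '/health/ready') {
    const results = {};
    for (const [name, check] of Object.entries(checks)) {
      try {
        results[name] = await check();
      } catch {
        results[name] = 'down';
      }
    }
    // Readiness: refuse traffic only when a dependency is fully down.
    const ready = Object.values(results).every((r) => r !== 'down');
    return {
      statusCode: ready ? 200 : 503,
      body: { status: ready ? 'healthy' : 'unhealthy', checks: results },
    };
  }
  return { statusCode: 404, body: { error: 'not found' } };
}
```

Keeping liveness free of dependency checks is deliberate in this sketch: a database outage should take the instance out of rotation, not cause it to be restarted.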
Database Scaling
Read Replicas
// Write to master
await masterDb.query('INSERT INTO users ...');
// Read from replicas
await replicaDb.query('SELECT * FROM users ...');
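The read/write split above can be wrapped in a small router so callers do not choose a connection by hand. This is a sketch under assumptions: `createDbRouter` and the `fresh` option are hypothetical names, and the clients are assumed to expose a `query(sql, params)` method like the ones in the snippet above.

```javascript
// Route queries: writes go to the primary, reads rotate across replicas.
// `primary` and `replicas` are hypothetical clients exposing query(sql, params).
function createDbRouter(primary, replicas) {
  let next = 0;
  return {
    write(sql, params) {
      return primary.query(sql, params);
    },
    read(sql, params, { fresh = false } = {}) {
      // Replication is usually asynchronous, so a replica can lag behind.
      // Callers that must see their own latest write can ask for the primary.
      if (fresh || replicas.length === 0) {
        return primary.query(sql, params);
      }
      const replica = replicas[next];
      next = (next + 1) % replicas.length;
      return replica.query(sql, params);
    },
  };
}
```

The `fresh` escape hatch is one simple way to handle read-after-write cases; it trades some primary load for consistency on the reads that need it.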
Sharding
Sharding Strategies:
| Strategy | Pros | Cons |
|---|---|---|
| Hash-based | Even distribution | Rebalancing hard |
| Range-based | Range queries easy | Hot spots |
| Directory-based | Flexible | Extra lookup |
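The three strategies differ mainly in how a key is mapped to a shard. The sketch below shows one possible mapping for each; the range boundaries, tenant names, and function names are invented for illustration, and FNV-1a is used only as a convenient dependency-free hash.

```javascript
// FNV-1a: a small, well-known non-cryptographic 32-bit string hash.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Hash-based: hash the key, then take it modulo the shard count.
function hashShard(key, shardCount) {
  return fnv1a(String(key)) % shardCount;
}

// Range-based: each shard owns a contiguous range (upper bound exclusive).
const ranges = [
  { shard: 0, upTo: 1000 },
  { shard: 1, upTo: 2000 },
  { shard: 2, upTo: Infinity },
];
function rangeShard(id) {
  return ranges.find((r) => id < r.upTo).shard;
}

// Directory-based: an explicit lookup table, at the cost of an extra lookup.
const directory = new Map([
  ['tenant-a', 0],
  ['tenant-b', 2],
]);
function directoryShard(tenant) {
  if (!directory.has(tenant)) throw new Error(`no shard assigned for ${tenant}`);
  return directory.get(tenant);
}

// Why rebalancing is hard with modulo hashing: count how many keys
// land on a different shard when the shard count changes.
function movedKeys(keys, fromCount, toCount) {
  return keys.filter((k) => hashShard(k, fromCount) !== hashShard(k, toCount)).length;
}
```

Calling `movedKeys` with a list of keys and shard counts of, say, 4 and 5 shows that most keys change shards, which is the motivation for techniques such as consistent hashing.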
Caching
Caching Strategies
Cache Aside (Lazy Loading):
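async function getUser(id) {
  // 1. Try the cache first.
  let user = await cache.get(`user:${id}`);
  if (!user) {
    // 2. On a miss, load the record from the database...
    user = await db.getUser(id);
    // 3. ...and populate the cache with a TTL of 3600 (assumed to be seconds).
    await cache.set(`user:${id}`, user, 3600);
  }
  return user;
}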
Write Through:
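async function updateUser(id, data) {
  // Write to the database and the cache as part of the same request,
  // so later reads hit a cache that already reflects the write.
  await db.updateUser(id, data);
  // Assumes `data` is the full user record, not a partial update.
  await cache.set(`user:${id}`, data);
}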
Write Back:
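async function updateUser(id, data) {
  // Update the cache immediately; the caller does not wait for the database.
  await cache.set(`user:${id}`, data);
  // Async write to DB: a background worker drains this queue later.
  // Queued writes are lost if the process dies before they are flushed.
  queue.push({ type: 'update', id, data });
}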
Cache Invalidation
| Strategy | Description |
|---|---|
| TTL | Time-based expiration |
| LRU | Least Recently Used eviction |
| Write-through | Update cache on write |
| Write-back | Async write to DB |
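TTL and LRU are often combined in one cache: entries expire after a fixed time, and the least recently used entry is evicted when the cache is full. The class below is a minimal in-memory sketch of that combination; `LruTtlCache` and its injectable clock are hypothetical, and it relies on the fact that a JavaScript `Map` iterates keys in insertion order.

```javascript
// A tiny in-memory cache combining TTL expiry with LRU eviction.
// A Map iterates in insertion order, so its first key is the least recently used.
class LruTtlCache {
  constructor(maxEntries, now = () => Date.now()) {
    this.maxEntries = maxEntries; // assumed to be at least 1
    this.now = now; // injectable clock, handy for tests
    this.entries = new Map();
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // TTL: expired entries are dropped on read
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value, ttlSeconds) {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // LRU: evict the entry that was touched longest ago.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}
```

Expired entries are only removed when they are read, so a production cache would typically also sweep them periodically.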
CDNs (Content Delivery Networks)
A CDN serves static content from edge servers located close to users around the world.
Use cases:
- Static assets (images, CSS, JS)
- Video streaming
- API responses (cacheable)
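What a CDN may cache, and for how long, is usually controlled by the origin through the `Cache-Control` response header. The function below sketches one possible policy for the use cases listed; the path patterns, the `/api/public/` prefix, and the durations are assumptions chosen for illustration, not recommendations for a specific CDN.

```javascript
// Pick a Cache-Control header value for a request path.
// Patterns and durations are illustrative assumptions.
function cacheControlFor(path) {
  if (/\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2)$/.test(path)) {
    // Fingerprinted assets never change under the same URL: cache for a year.
    return 'public, max-age=31536000, immutable';
  }
  if (path.startsWith('/api/public/')) {
    // s-maxage applies to shared caches such as a CDN;
    // max-age=0 makes browsers revalidate on every use.
    return 'public, max-age=0, s-maxage=60';
  }
  if (path.startsWith('/api/')) {
    // Other API responses are treated as per-user and are not stored.
    return 'no-store';
  }
  // Everything else: a modest default of one hour.
  return 'public, max-age=3600';
}
```

Usage: `cacheControlFor('/static/app.3f9a1c2b.js')` returns the long-lived `immutable` policy, while `cacheControlFor('/api/me')` returns `no-store`.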
Asynchronous Processing
Message Queues
Use cases:
- Email sending
- Image processing
- Data export
- Notifications
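The producer/consumer split behind these use cases can be illustrated with an in-memory stand-in. This is only a sketch: `JobQueue`, its retry limit, and the job shape are hypothetical, and a real system would use a durable broker so that jobs survive a process restart.

```javascript
// In-memory stand-in for a message queue. It only illustrates the
// producer/consumer split; nothing here is durable.
class JobQueue {
  constructor() {
    this.jobs = [];
  }

  // Producer side: the request handler enqueues and returns immediately.
  enqueue(type, payload) {
    this.jobs.push({ type, payload, attempts: 0 });
  }

  // Consumer side: run each job with the handler registered for its type.
  // A failed job is retried until it has been attempted maxAttempts times.
  async drain(handlers, maxAttempts = 3) {
    const failed = [];
    while (this.jobs.length > 0) {
      const job = this.jobs.shift();
      job.attempts += 1;
      try {
        await handlers[job.type](job.payload);
      } catch (err) {
        if (job.attempts < maxAttempts) {
          this.jobs.push(job); // put it back at the end of the queue
        } else {
          failed.push(job); // give up; a real system might use a dead-letter queue
        }
      }
    }
    return failed;
  }
}
```

Each job is handled by exactly one consumer, which is the main contrast with publish/subscribe, where every subscriber receives every message.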
Pub/Sub
Use cases:
- Real-time updates
- Event-driven architecture
- Microservices communication
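In publish/subscribe, a publisher sends a message to a topic and every current subscriber of that topic receives it. A minimal in-process sketch follows; the `PubSub` class and the topic name used in the usage note are hypothetical, and a distributed system would place a broker between publishers and subscribers.

```javascript
// Minimal in-process publish/subscribe broker.
class PubSub {
  constructor() {
    this.topics = new Map(); // topic name -> Set of handler functions
  }

  // Register a handler and return a function that removes it again.
  subscribe(topic, handler) {
    if (!this.topics.has(topic)) {
      this.topics.set(topic, new Set());
    }
    this.topics.get(topic).add(handler);
    return () => this.topics.get(topic).delete(handler);
  }

  // Deliver the message to every subscriber of the topic.
  // Returns how many handlers were called.
  publish(topic, message) {
    const handlers = this.topics.get(topic) || new Set();
    for (const handler of handlers) {
      handler(message);
    }
    return handlers.size;
  }
}
```

Usage: two services can both call `subscribe('order.created', ...)` and each will see every order event, without the publisher knowing that either of them exists.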
Monitoring & Metrics
Key Metrics
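// RED Method (for request-driven services)
Rate: Requests per second
Errors: Failed requests
Duration: Response time
// USE Method (for resources such as CPU, memory, disk)
Utilization: % of capacity used
Saturation: Amount of queued work the resource cannot service yet
Errors: Count of error events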
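As a rough illustration, both sets of metrics can be computed from raw measurements collected over a time window. The sample shapes and function names below are assumptions made for this sketch; real systems usually get these numbers from a metrics library rather than computing them by hand.

```javascript
// RED: summarize request samples collected over a window.
// Each sample is assumed to look like { durationMs, ok }.
function redMetrics(samples, windowSeconds) {
  const count = samples.length;
  const errors = samples.filter((s) => !s.ok).length;
  const totalMs = samples.reduce((sum, s) => sum + s.durationMs, 0);
  return {
    rate: count / windowSeconds, // requests per second
    errorRate: count ? errors / count : 0, // fraction of failed requests
    avgDurationMs: count ? totalMs / count : 0, // mean response time
  };
}

// USE: summarize one resource (for example a worker pool) over a window.
function useMetrics({ busySeconds, windowSeconds, queueLength, errorCount }) {
  return {
    utilization: busySeconds / windowSeconds, // share of time spent busy
    saturation: queueLength, // work waiting that could not be served yet
    errors: errorCount,
  };
}
```

The mean is used for duration only to keep the sketch short; percentiles such as P95 usually describe latency better because a few slow requests barely move the mean.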
Alerting
// Alert rules
- Error rate > 1% for 5 min
- P95 latency > 500ms for 5 min
- CPU > 80% for 10 min
- Memory > 90% for 5 min
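Rules of the form "metric above threshold for N minutes" can be evaluated over a series of per-minute values, and a P95 latency can be computed with the nearest-rank method. Both functions below are sketches with hypothetical names; monitoring systems typically provide these as built-in query functions.

```javascript
// Nearest-rank percentile: the smallest value such that at least
// p percent of the samples are less than or equal to it.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Fire only when every one of the last `minutes` per-minute values
// exceeds the threshold, which avoids alerting on a single spike.
function shouldAlert(perMinuteValues, threshold, minutes) {
  if (perMinuteValues.length < minutes) return false;
  return perMinuteValues.slice(-minutes).every((v) => v > threshold);
}
```

Usage: with per-minute error rates in percent, `shouldAlert(errorRates, 1, 5)` corresponds to the first rule in the list, and `shouldAlert(p95LatenciesMs, 500, 5)` to the second.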
Microservices vs Monolith
| Aspect | Monolith | Microservices |
|---|---|---|
| Development | Simple | Complex |
| Deployment | All or nothing | Independent |
| Scaling | Scale entire app | Scale specific service |
| Communication | In-process | Network |
| Data | Single database | Per-service DB |
When to split a monolith into services:
- Different scalability needs
- Different teams
- Different deployment cycles
- Isolated failure domains
Interview Questions
Easy
- Design a URL shortener
- Design a rate limiter
- Design a unique ID generator
Medium
- Design a chat system
- Design a news feed
- Design a file storage system
Hard
- Design YouTube
- Design Google Search
- Design a distributed database