How do you do back-of-the-envelope capacity estimation in a system design interview?

Start from daily active users (DAU) and writes per user per day to get writes/second (divide by 86,400 seconds in a day). Multiply by a read:write ratio for reads/second, and by a peak factor (usually 2x) for peak QPS. For storage, multiply writes/day by average item size and by the retention period. The calculator on this page does all of this as you type.

What latency numbers should I memorize for system design?

The orders of magnitude: L1 cache ~1 ns, main memory ~100 ns, SSD random read ~16 microseconds, same-datacenter round trip ~500 microseconds, reading 1 MB from SSD ~1 ms, a disk seek ~2-10 ms, and a cross-continent network round trip ~150 ms. Main memory is up to roughly 100,000x faster than a disk seek, and a round trip inside one datacenter is roughly 300x faster than one across continents.

When should I use SQL vs NoSQL in a system design interview?

Use a SQL/relational database when you need ACID transactions, complex joins, and strong consistency (payments, orders). Use NoSQL when you need horizontal scale, flexible schema, and high write throughput, and can tolerate eventual consistency (feeds, logs, sessions). State the trade-off out loud — interviewers score the reasoning, not the choice.

Is this system design cheat sheet free?

Yes — free, no signup, works on any device. It is published by CoPilot Interview as a study reference for engineers preparing for system design interviews.

System Design Cheat Sheet: Capacity Calculator + Component Reference (Free)

Back-of-the-envelope capacity calculator

Estimating scale is the part candidates fumble. Type your assumptions; the numbers update live. Round them in the interview and state them confidently.

Inputs: daily active users (DAU), writes per user per day, reads per write (read:write ratio), average item size (KB), peak factor (× average), data retention (years).

Outputs: write QPS (avg), read QPS (avg), peak read QPS, new data per day, storage (over retention), read bandwidth.
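
The same arithmetic can be sketched in a few lines of Python. This is a minimal sketch, not the page's own calculator: the names `estimate_capacity` and `Estimate`, the decimal units (1 GB = 1,000,000 KB), the 365-day year, and the choice to compute read bandwidth from average (not peak) read QPS are all assumptions made for illustration.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Estimate:
    write_qps: float
    read_qps: float
    peak_read_qps: float
    new_data_gb_per_day: float
    storage_tb: float
    read_bandwidth_mb_per_s: float


def estimate_capacity(dau, writes_per_user_per_day, reads_per_write,
                      item_size_kb, peak_factor=2.0, retention_years=5.0):
    """Back-of-the-envelope estimate from the six calculator inputs."""
    writes_per_day = dau * writes_per_user_per_day
    write_qps = writes_per_day / SECONDS_PER_DAY
    read_qps = write_qps * reads_per_write
    peak_read_qps = read_qps * peak_factor
    # Assumption: decimal units (1 GB = 1e6 KB, 1 TB = 1e3 GB).
    new_data_gb_per_day = writes_per_day * item_size_kb / 1e6
    # Assumption: 365 days per year, no compression or replication.
    storage_tb = new_data_gb_per_day * 365 * retention_years / 1e3
    # Assumption: bandwidth is based on average read QPS, not peak.
    read_bandwidth_mb_per_s = read_qps * item_size_kb / 1e3
    return Estimate(write_qps, read_qps, peak_read_qps,
                    new_data_gb_per_day, storage_tb, read_bandwidth_mb_per_s)


# Example: 10M DAU, 2 writes per user per day, 100 reads per write, 1 KB items.
example = estimate_capacity(10_000_000, 2, 100, 1)
print(f"write QPS ~{example.write_qps:,.0f}, read QPS ~{example.read_qps:,.0f}")
print(f"new data {example.new_data_gb_per_day:g} GB/day, "
      f"storage {example.storage_tb:g} TB over 5 years")
```

With these inputs the sketch gives about 231 writes/second, 20 GB of new data per day, and 36.5 TB over five years. In an interview you would round those to "a few hundred writes per second" and "tens of terabytes".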

Latency numbers every engineer should know

Approximate orders of magnitude. The lesson: a main-memory reference is up to ~100,000× faster than an HDD seek, and a round trip inside one datacenter is ~300× faster than one across continents.

Operation	Latency	Relative
L1 cache reference	~1 ns	baseline
Branch mispredict	~3 ns
L2 cache reference	~4 ns	4× L1
Mutex lock/unlock	~17 ns
Main memory reference	~100 ns	100× L1
Compress 1 KB	~2 µs
Send 1 KB over 1 Gbps	~10 µs
SSD random read	~16 µs	160× memory
Round trip in same datacenter	~500 µs
Read 1 MB sequentially from SSD	~1 ms
Disk seek (HDD)	~2–10 ms
Round trip CA ↔ Europe	~150 ms	150M× L1
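
The relative factors follow directly from the latency column. A small sketch that derives them, assuming the approximate values from the table converted to nanoseconds; the dictionary `LATENCY_NS` and the helper `ratio` are illustrative names:

```python
# Approximate latencies from the table, in nanoseconds.
LATENCY_NS = {
    "L1 cache reference": 1,
    "Main memory reference": 100,
    "SSD random read": 16_000,
    "Round trip in same datacenter": 500_000,
    "Disk seek (HDD), low end": 2_000_000,
    "Disk seek (HDD), high end": 10_000_000,
    "Round trip CA <-> Europe": 150_000_000,
}


def ratio(slower, faster):
    """How many times longer `slower` takes than `faster`."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


print(ratio("SSD random read", "Main memory reference"))
print(ratio("Disk seek (HDD), high end", "Main memory reference"))
print(ratio("Round trip CA <-> Europe", "Round trip in same datacenter"))
print(ratio("Round trip CA <-> Europe", "L1 cache reference"))
```

Working in one unit makes the comparisons mechanical: an SSD random read is 160 main-memory references, a slow disk seek is 100,000, and a cross-continent round trip is 300 same-datacenter round trips.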

Core components — when to use each

Every answer should name the component and the trade-off.

Load Balancer

Use when: spreading traffic across many servers; need failover and health checks.

Trade-off: an extra network hop, and the balancer itself must be made highly available. Choose between L4 (transport-level, faster) and L7 (application-level, content-aware routing).
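
To make the failover idea concrete, here is a minimal sketch of round-robin selection that skips backends marked unhealthy. The class `RoundRobinBalancer` and its methods are hypothetical; a real balancer would run health checks itself and handle concurrency, which this sketch does not.

```python
class RoundRobinBalancer:
    """Round-robin over backends, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0

    def mark_down(self, backend):
        # Called when a health check fails.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a health check passes again.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once, starting from the rotation pointer.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
lb.mark_down("app-2")
print([lb.pick() for _ in range(4)])
```

With `app-2` marked down, traffic alternates between the two remaining servers until `mark_up` restores it.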

Cache (Redis / Memcached)

Use when: reads dominate and data is hot; you need sub-millisecond reads.

Trade-off: cache invalidation and stale data. Pick an eviction policy (e.g. LRU) and a write strategy (write-through vs write-back).
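
A minimal sketch of LRU eviction combined with a write-through strategy, assuming a plain dict stands in for the database. The class name `WriteThroughLRUCache` is illustrative and is not how Redis or Memcached are implemented.

```python
from collections import OrderedDict


class WriteThroughLRUCache:
    """LRU cache in front of a backing store; writes go to the store first."""

    def __init__(self, capacity, store):
        self.capacity = capacity
        self.store = store            # assumption: a dict standing in for a DB
        self._items = OrderedDict()   # least recently used entry is first

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)   # cache hit: mark as recently used
            return self._items[key]
        value = self.store[key]            # cache miss: raises KeyError if absent
        self._put(key, value)
        return value

    def set(self, key, value):
        self.store[key] = value            # write-through: store is updated first
        self._put(key, value)

    def _put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


db = {}
cache = WriteThroughLRUCache(capacity=2, store=db)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.set("c", 3)    # evicts "b" from the cache; the store still has it
print(sorted(db), cache.get("b"))
```

Write-through keeps the store authoritative at the cost of a slower write path. A write-back variant would acknowledge the write from the cache and flush to the store later, trading durability for latency.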

CDN

Use when: serving static assets/media to a global audience; you want to cut latency by serving from the edge.

Trade-off: cache invalidation across edges; cost. Great for read-heavy static content, not dynamic per-user data.

SQL / Relational DB

Use when: you need ACID transactions, joins, and strong consistency (payments, orders).

Trade-off: harder to scale writes horizontally. Reach for read replicas and sharding when it grows.

NoSQL DB

Use when: massive scale, flexible schema, high write throughput; eventual consistency is OK (feeds, logs, sessions).

Trade-off: weaker consistency and limited joins. Model the data around your access patterns.

Message Queue (Kafka)

Use when: decoupling producers from consumers, smoothing spikes, async processing, event streaming.

Trade-off: adds latency + operational complexity; you must handle duplicates (at-least-once delivery).
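
Handling duplicates under at-least-once delivery usually means making the consumer idempotent. A minimal sketch, assuming every message carries a unique ID and using an in-memory set where a real system would need a durable store; `IdempotentConsumer` is a hypothetical name and this is not a Kafka API.

```python
class IdempotentConsumer:
    """Processes each message ID at most once, even if it is redelivered."""

    def __init__(self, handler):
        self.handler = handler
        self._seen = set()   # assumption: in production this is a durable store

    def consume(self, message_id, payload):
        if message_id in self._seen:
            return False     # duplicate delivery: skip
        self.handler(payload)
        # Recorded after the handler succeeds, so a crash in between causes a
        # retry rather than a lost message.
        self._seen.add(message_id)
        return True


processed = []
consumer = IdempotentConsumer(processed.append)
consumer.consume("m1", "charge order 42")
consumer.consume("m1", "charge order 42")   # redelivered by the queue
print(processed)
```

Because the ID is recorded only after the handler runs, a crash between the two steps can still run the handler twice. Closing that gap requires the handler's side effect and the ID record to be committed together.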

Object Storage (S3)

Use when: storing large blobs — images, video, backups — durably and cheaply.

Trade-off: higher latency than a DB; not suited to low-latency random access. Store the blob in S3 and the metadata in a DB.

Search (Elasticsearch)

Use when: full-text search, autocomplete, or relevance ranking over lots of documents.

Trade-off: it is a secondary index that must be kept in sync with the primary store; it is not your source of truth.

API Gateway

Use when: a single entry point for auth, rate limiting, and routing to microservices.

Trade-off: a potential bottleneck and single point of failure — make it horizontally scalable.

Rate Limiter

Use when: protecting a service from abuse or ensuring fair use (token bucket is the usual answer).

Trade-off: you need somewhere to store counters at scale (typically Redis) and a way to enforce limits consistently across multiple servers.
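
A single-process sketch of the token bucket mentioned above. The class `TokenBucket` and its parameters are illustrative; a distributed limiter would keep this state in a shared store rather than in memory.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock               # injectable so tests can fake time
        self.tokens = float(capacity)    # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=5, capacity=10)   # 5 requests/second, bursts of 10
print(sum(bucket.allow() for _ in range(15)), "of 15 back-to-back requests allowed")
```

The bucket allows short bursts while bounding the long-run rate, which is why it is the usual interview answer. The refill is computed lazily on each call, so no background timer is needed.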

Numbers worth memorizing

2¹⁰	~1 Thousand (1 KB)
2²⁰	~1 Million (1 MB)
2³⁰	~1 Billion (1 GB)
2⁴⁰	~1 Trillion (1 TB)
Seconds in a day	86,400 (round to ~100K)
Seconds in a month	~2.5 Million
Character size	1 byte (ASCII) · up to 4 bytes (UTF-8)
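
A quick check of how rough these approximations are. The helper names `overshoot_percent` and `utf8_bytes` are illustrative, and the month is assumed to have 30 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_30_DAY_MONTH = 30 * SECONDS_PER_DAY   # assumption: 30-day month


def overshoot_percent(exponent):
    """Percent by which 2**exponent exceeds its decimal shorthand.

    The shorthand treats every factor of 2**10 as 10**3.
    """
    decimal = 10 ** (3 * exponent // 10)
    return (2 ** exponent / decimal - 1) * 100


def utf8_bytes(char):
    """Number of bytes a single character occupies in UTF-8."""
    return len(char.encode("utf-8"))


for exponent in (10, 20, 30, 40):
    print(f"2^{exponent} = {2 ** exponent:,} "
          f"(+{overshoot_percent(exponent):.1f}% over the decimal value)")
print(f"{SECONDS_PER_DAY:,} s/day, {SECONDS_PER_30_DAY_MONTH:,} s/month")
print(utf8_bytes("A"), utf8_bytes("\u20ac"), utf8_bytes("\U0001F600"))
```

The gap between the binary and decimal values grows from 2.4% at 2^10 to about 10% at 2^40, which is well within the tolerance of a back-of-the-envelope estimate.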

FAQ

How do you estimate capacity in a system design interview?

From DAU and writes/user/day → writes/sec (divide by 86,400). × read:write ratio → reads/sec. × peak factor (~2) → peak QPS. For storage: writes/day × item size × retention. The calculator above does it live.

When do I use SQL vs NoSQL?

SQL for ACID, joins, strong consistency (payments). NoSQL for horizontal scale, flexible schema, high write throughput with eventual consistency (feeds, logs). Say the trade-off out loud — that’s what’s scored.

Is this cheat sheet free to share?

Yes — free, no signup, link or bookmark it. Published by CoPilot Interview as a study reference.

From the whiteboard to the live round

Reciting the components is one thing; structuring a clear answer under pressure is another. CoPilot Interview is a desktop AI assistant that helps you organize a system-design answer in real time during the actual interview — with a permanent free tier.
