Back of the Envelope Calculations | System Design Essentials

In a system design interview, your interviewer will give you an open-ended problem like "Design Twitter." Before you start drawing architecture diagrams, you need to understand the scale of the system.

How much traffic will it get? How much storage will it need in 5 years? How much network bandwidth is required?

This is where Back of the Envelope calculations come in. These are quick, rough estimates used to gauge the system's capacity requirements and justify your architectural decisions.


Why Do We Need Them?

  1. To show your design scales: If you estimate that the system will ingest 50,000 writes per second, a single generic SQL database is hard to justify. The math makes the case for a distributed store (SQL or NoSQL) with horizontal sharding.
  2. To catch bottlenecks early: Estimating helps you identify whether your system is CPU-bound, memory-bound, or network-bound.
  3. To show seniority: Senior engineers don't guess; they estimate based on hardware limitations and expected load.

Core Metrics to Estimate

When doing back of the envelope math in an interview, focus on these three primary areas:

1. Traffic Estimates (QPS)

Queries Per Second (QPS) determines how many requests your web servers and databases need to handle.

  • Establish the Daily Active Users (DAU).
  • Estimate how many times each user triggers an action per day.
  • Divide the total daily requests by the number of seconds in a day (86,400). For easier mental math you can round this to 100,000, which understates QPS by about 14%.
Example: Twitter Read QPS
- 300 Million DAU
- Each user reads 100 tweets per day
- Total reads/day = 30 Billion
- Read QPS = 30,000,000,000 / 100,000 = 300,000 QPS (about 347,000 with the exact 86,400)
- Peak QPS (estimated 2x to 5x the average) = 600,000 to 1,500,000, so plan for ~1,000,000 QPS
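
The arithmetic in this example can be sketched in a few lines of Java. This is a minimal illustration, not a prescribed method: the class name `QpsEstimate`, its method names, and the peak factors passed in are assumptions made for the example.

```java
// Sketch of the Twitter read-QPS estimate; all names here are illustrative.
class QpsEstimate {
    static final long SECONDS_PER_DAY = 86_400L;          // exact value
    static final long ROUNDED_SECONDS_PER_DAY = 100_000L; // mental-math shortcut

    // Average queries per second: total daily requests divided by seconds per day.
    static long averageQps(long dailyActiveUsers, long actionsPerUserPerDay, long secondsPerDay) {
        return dailyActiveUsers * actionsPerUserPerDay / secondsPerDay;
    }

    // Peak QPS as a multiple of the average (the factor itself is an assumption).
    static long peakQps(long averageQps, double peakFactor) {
        return Math.round(averageQps * peakFactor);
    }

    public static void main(String[] args) {
        long dau = 300_000_000L;
        long readsPerUser = 100L;
        long rough = averageQps(dau, readsPerUser, ROUNDED_SECONDS_PER_DAY);
        long exact = averageQps(dau, readsPerUser, SECONDS_PER_DAY);
        System.out.println("Rough average QPS: " + rough);
        System.out.println("Exact average QPS: " + exact);
        System.out.println("Peak QPS range: " + peakQps(rough, 2.0) + " to " + peakQps(rough, 5.0));
    }
}
```

With the inputs above, the rough average is 300,000 QPS, the exact average is 347,222 QPS, and the 2x to 5x peak range is 600,000 to 1,500,000 QPS.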

2. Storage Estimates

How much disk space will the database and media storage need over the next 5 years?

  • Estimate the size of a single object (e.g., a text tweet is ~1KB, a profile image is ~100KB, a video is ~50MB).
  • Multiply by the number of objects created per day.
  • Multiply by 365 days and 5 years to get the 5-year capacity requirement.
Example: Twitter Media Storage
- 300 Million DAU
- 10% of users post a photo per day (30 Million photos/day)
- Average photo size = 1 MB
- Daily Storage = 30,000,000 MB = 30 TB / day
- 5-Year Storage = 30 TB * 365 * 5 = ~55 PB (Petabytes)
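
The same storage estimate can be written as a short Java sketch. The class `StorageEstimate` and its helpers are hypothetical names for illustration, and the unit constants use the decimal rounding (1 TB ≈ 1,000,000 MB) that these estimates rely on.

```java
// Sketch of the media-storage estimate; names are illustrative, units are decimal.
class StorageEstimate {
    static final long MB_PER_TB = 1_000_000L; // rounded: 1 TB ~ 10^6 MB
    static final long TB_PER_PB = 1_000L;     // rounded: 1 PB ~ 10^3 TB

    // Daily storage in TB for a given object count and per-object size in MB.
    static long dailyStorageTb(long objectsPerDay, long objectSizeMb) {
        return objectsPerDay * objectSizeMb / MB_PER_TB;
    }

    // Total storage in TB accumulated over a number of years.
    static long totalStorageTb(long dailyTb, int years) {
        return dailyTb * 365L * years;
    }

    public static void main(String[] args) {
        long dau = 300_000_000L;
        long photosPerDay = dau / 10;                 // 10% of users post one photo
        long dailyTb = dailyStorageTb(photosPerDay, 1L);
        long fiveYearTb = totalStorageTb(dailyTb, 5);
        System.out.println("Daily storage (TB): " + dailyTb);
        System.out.println("5-year storage (TB): " + fiveYearTb);
        System.out.println("5-year storage (PB): " + (double) fiveYearTb / TB_PER_PB);
    }
}
```

The 5-year figure comes out to 54,750 TB, which is where the ~55 PB above comes from. Replication and backups are not included and would multiply this number.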

3. Bandwidth Estimates

How much data is flowing into and out of your system per second?

  • Ingress: Incoming data (e.g., users uploading photos).
  • Egress: Outgoing data (e.g., users downloading/viewing photos).
  • This metric is crucial for determining costs and sizing your network infrastructure, load balancers, and CDN caching strategies.
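
As a rough sketch of how ingress and egress might be estimated, the Java below reuses the photo-upload figures from the storage example (30 million photos per day at 1 MB each). The class `BandwidthEstimate` is a hypothetical name, and the views-per-upload ratio of 100 is purely an assumed value for illustration, not a figure from this article.

```java
// Sketch of a bandwidth estimate; names and the read ratio are assumptions.
class BandwidthEstimate {
    static final double SECONDS_PER_DAY = 86_400.0;

    // Ingress in MB/s: bytes uploaded per day spread evenly across the day.
    static double ingressMbPerSecond(long uploadsPerDay, double objectSizeMb) {
        return uploadsPerDay * objectSizeMb / SECONDS_PER_DAY;
    }

    // Egress in MB/s, assuming each uploaded object is viewed viewsPerUpload times.
    static double egressMbPerSecond(double ingressMbPerSecond, double viewsPerUpload) {
        return ingressMbPerSecond * viewsPerUpload;
    }

    public static void main(String[] args) {
        double ingress = ingressMbPerSecond(30_000_000L, 1.0);
        double egress = egressMbPerSecond(ingress, 100.0); // hypothetical 100 views per photo
        System.out.println("Ingress (MB/s): " + ingress);
        System.out.println("Egress (MB/s): " + egress);
    }
}
```

Ingress works out to roughly 347 MB/s on average. Under the assumed 100:1 read ratio, egress is about 100 times that, which is the kind of gap that motivates serving media from a CDN.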

Numbers Every Engineer Should Memorize

To do these calculations quickly in an interview without a calculator, you must commit standard latency numbers and byte conversions to memory:

Power of 2 vs Power of 10

Note: For rough estimations, software engineers generally round 1024 to 1000 to make mental math easier.

Power of 10       Power of 2   Name               Exact Value
1 Thousand        2^10         1 Kilobyte (KB)    1,024 Bytes
1 Million         2^20         1 Megabyte (MB)    1,024 KB
1 Billion         2^30         1 Gigabyte (GB)    1,024 MB
1 Trillion        2^40         1 Terabyte (TB)    1,024 GB
1 Quadrillion     2^50         1 Petabyte (PB)    1,024 TB
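
To see how much the 1024-to-1000 rounding costs at each unit, here is a small Java sketch. The class `UnitRounding` and its methods are illustrative names; the index `k` (1 for KB up to 5 for PB) is a convention chosen for this example.

```java
// Sketch comparing binary units with their power-of-10 approximations.
class UnitRounding {
    // Exact byte count of the k-th binary unit: k = 1 is KB, k = 5 is PB.
    static long binaryBytes(int k) {
        return 1L << (10 * k);
    }

    // Rounded byte count of the k-th unit using powers of 1000.
    static long decimalBytes(int k) {
        long value = 1L;
        for (int i = 0; i < k; i++) {
            value *= 1000L;
        }
        return value;
    }

    // How far the rounded value falls short of the exact one, in percent.
    static double roundingErrorPercent(int k) {
        return 100.0 * (binaryBytes(k) - decimalBytes(k)) / binaryBytes(k);
    }

    public static void main(String[] args) {
        String[] names = {"KB", "MB", "GB", "TB", "PB"};
        for (int k = 1; k <= names.length; k++) {
            System.out.println(names[k - 1] + ": exact " + binaryBytes(k)
                    + " bytes, rounding error " + roundingErrorPercent(k) + "%");
        }
    }
}
```

The shortfall grows from about 2.3% at the kilobyte level to about 11% at the petabyte level, which is still well within the tolerance of a rough estimate.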

Latency Numbers (Jeff Dean's Numbers)

When determining if an architecture is fast enough, use these widely cited approximate figures. Exact values vary by hardware and year, so treat them as orders of magnitude:

// Approximate latency lookup table (order-of-magnitude figures, kept as display strings)
public class SystemLatency {
    public static final String L1_CACHE_REF      = "0.5 ns";
    public static final String MUTEX_LOCK_UNLOCK = "100 ns";
    public static final String MAIN_MEMORY_READ  = "100 ns";
    // Compress 1 KB with a fast compressor
    public static final String ZIPPED_1K_BYTES   = "10,000 ns (10 us)";
    // Random 4 KB read from SSD
    public static final String SSD_RANDOM_READ   = "150,000 ns (150 us)";
    public static final String READ_1MB_SEQ_MEM  = "250,000 ns (250 us)";
    public static final String READ_1MB_SEQ_DISK = "20,000,000 ns (20 ms)";
    // Round trip: California -> Netherlands -> California
    public static final String PACKET_CA_TO_NL   = "150,000,000 ns (150 ms)";
}
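
These figures become useful once you multiply them out. The sketch below is one possible way to do that: it copies three of the values above into numeric nanosecond constants and estimates how long a larger operation would take. The class `LatencyEstimate` and its methods are hypothetical names, and the linear scaling ignores caching, parallelism, and queueing.

```java
// Sketch of turning per-operation latencies into rough end-to-end estimates.
class LatencyEstimate {
    // Values copied from the lookup table above, expressed in nanoseconds.
    static final long READ_1MB_SEQ_MEM_NS  = 250_000L;
    static final long READ_1MB_SEQ_DISK_NS = 20_000_000L;
    static final long SSD_RANDOM_READ_NS   = 150_000L;
    static final long NS_PER_MS            = 1_000_000L;

    // Time in ms to read a number of megabytes sequentially at a given cost per MB.
    static long sequentialReadMs(long megabytes, long nsPerMb) {
        return megabytes * nsPerMb / NS_PER_MS;
    }

    // Time in ms to perform a number of random reads one after another.
    static long randomReadsMs(long reads, long nsPerRead) {
        return reads * nsPerRead / NS_PER_MS;
    }

    public static void main(String[] args) {
        long oneGbInMb = 1_000L; // rounded, as elsewhere in these estimates
        System.out.println("Read 1 GB from memory (ms): "
                + sequentialReadMs(oneGbInMb, READ_1MB_SEQ_MEM_NS));
        System.out.println("Read 1 GB from disk (ms): "
                + sequentialReadMs(oneGbInMb, READ_1MB_SEQ_DISK_NS));
        System.out.println("1,000 serial SSD random reads (ms): "
                + randomReadsMs(1_000L, SSD_RANDOM_READ_NS));
    }
}
```

Reading 1 GB sequentially takes about 250 ms from memory versus about 20 seconds from disk, an 80x difference that is often the core argument for putting a cache in front of a disk-backed store.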
