Back-of-the-Envelope

The Back-of-the-Envelope technique is used to estimate a system's capacity or performance requirements. These rough calculations help identify potential bottlenecks in a proposed solution before it is built.

Important Numbers

Power of two

Power   Approximate value   Full name    Short name
2^10    1 Thousand          1 Kilobyte   1 KB
2^20    1 Million           1 Megabyte   1 MB
2^30    1 Billion           1 Gigabyte   1 GB
2^40    1 Trillion          1 Terabyte   1 TB
2^50    1 Quadrillion       1 Petabyte   1 PB
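
The table treats each power of two as a round decimal figure. As a minimal sketch of how much that rounding hides, the snippet below compares the exact values with the approximations. The helper name `approx_error` is made up for illustration.

```python
# Exponents from the table above, mapped to their short unit names.
UNITS = {10: "KB", 20: "MB", 30: "GB", 40: "TB", 50: "PB"}

def approx_error(power: int) -> float:
    """Relative error of approximating 2**power by 10**(3 * power / 10)."""
    exact = 2 ** power
    approx = 10 ** (3 * power // 10)  # 2^10 -> 10^3, 2^20 -> 10^6, ...
    return (exact - approx) / approx

for power, unit in UNITS.items():
    print(f"2^{power} = {2 ** power:,} bytes (1 {unit}), "
          f"{approx_error(power):.1%} above the round figure")
```

The gap grows from about 2% at 2^10 to about 13% at 2^50, which is usually well within the tolerance of a rough estimate.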

Latency numbers

Operation name                                      Time
L1 cache reference                                  0.5 ns
Branch mispredict                                   5 ns
L2 cache reference                                  7 ns
Mutex lock/unlock                                   100 ns
Main memory reference                               100 ns
Compress 1K bytes with Zippy                        10,000 ns = 10 µs
Send 2K bytes over 1 Gbps network                   20,000 ns = 20 µs
Read 1 MB sequentially from memory                  250,000 ns = 250 µs
Round trip within the same datacenter               500,000 ns = 500 µs
Disk seek                                           10,000,000 ns = 10 ms
Read 1 MB sequentially from the network             10,000,000 ns = 10 ms
Read 1 MB sequentially from disk                    30,000,000 ns = 30 ms
Send packet CA (California) -> Netherlands -> CA    150,000,000 ns = 150 ms

1 ns (nanosecond) = 10^-9 seconds
1 µs (microsecond) = 10^-6 seconds = 1,000 ns
1 ms (millisecond) = 10^-3 seconds = 1,000 µs = 1,000,000 ns

Conclusions:

  • Memory is fast but the disk is slow
  • Avoid disk seeks if possible
  • Simple compression algorithms are fast
  • Compress data before sending it over the internet if possible
  • Data centers are usually in different regions, and it takes time to send data between them
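
To make the first conclusion concrete, here is a rough sketch that scales the per-megabyte read costs from the latency table to a larger payload. It assumes the cost grows linearly with size, which ignores caching and seek effects. The function name `read_time_seconds` is hypothetical.

```python
# Cost of reading 1 MB sequentially, in nanoseconds (from the latency table).
READ_1MB_NS = {
    "memory": 250_000,
    "network": 10_000_000,
    "disk": 30_000_000,
}

def read_time_seconds(medium: str, megabytes: float) -> float:
    """Estimated time to read `megabytes` sequentially from `medium`."""
    return READ_1MB_NS[medium] * megabytes / 1e9  # ns -> s

# Assumed payload: 1 GB, treated as 1,000 MB.
for medium in READ_1MB_NS:
    print(f"{medium}: {read_time_seconds(medium, 1000):.2f} s")
```

With these numbers, 1 GB takes roughly 0.25 s from memory, 10 s over the network, and 30 s from disk.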

Availability Numbers

High availability is the ability of a system to operate continuously for a desirably long period of time.

  • SLA (service level agreement) is an agreement between a service provider and a client that defines the level of uptime the service will deliver.

Availability %   Downtime per day     Downtime per week     Downtime per month   Downtime per year
99%              14.40 minutes        1.68 hours            7.31 hours           3.65 days
99.99%           8.64 seconds         1.01 minutes          4.38 minutes         52.60 minutes
99.999%          864.00 milliseconds  6.05 seconds          26.30 seconds        5.26 minutes
99.9999%         86.40 milliseconds   604.80 milliseconds   2.63 seconds         31.56 seconds
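
The downtime figures follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365.25-day year (which appears to match the yearly column); `downtime_seconds` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
DAYS_PER_YEAR = 365.25          # assumption: average year length

def downtime_seconds(availability_percent: float, period_days: float) -> float:
    """Allowed downtime in seconds over a period at a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * SECONDS_PER_DAY

per_day = downtime_seconds(99.99, 1)
per_year_minutes = downtime_seconds(99.99, DAYS_PER_YEAR) / 60
print(f"99.99%: {per_day:.2f} s per day, {per_year_minutes:.2f} min per year")
```

Each extra nine divides the allowed downtime by ten, which is why the step from 99.99% to 99.999% is so costly in practice.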

Estimation Types

1. Load Estimation

Predicts the expected number of requests per second (RPS), data volume, or user traffic for the system.

Seconds in a day = 24 h * 60 min * 60 s = 86,400 ≈ 100,000 seconds
1 million requests / day = 1,000,000 / 86,400 ≈ 12 requests / second (or ≈ 10 requests / second using the rounded 100,000)
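
The same conversion as a small sketch. The `peak_factor` parameter is an illustrative assumption, not part of the original figures; it models traffic that is burstier than the daily average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def requests_per_second(requests_per_day: float, peak_factor: float = 1.0) -> float:
    """Average RPS for a daily request count, optionally scaled for peaks."""
    return requests_per_day / SECONDS_PER_DAY * peak_factor

average = requests_per_second(1_000_000)
peak = requests_per_second(1_000_000, peak_factor=3.0)  # hypothetical 3x peak
print(f"average: {average:.1f} RPS, assumed peak: {peak:.1f} RPS")
```

Sizing for the peak rather than the average is the usual choice, since the system has to survive its busiest minute.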

2. Storage Estimation

Estimates the amount of storage required for the data the system generates.

Single char = 2 bytes (UTF-16; an ASCII char needs 1 byte)
Long/Double = 8 bytes
Average-resolution image = 300 KB
Good-resolution image = 3 MB
Standard video for streaming = 100 MB per minute of video
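
These unit sizes combine with the load estimate to give a storage figure. The sketch below uses a hypothetical workload (1 million average-resolution images per day, kept for 5 years) and decimal units. The `replication` parameter is an assumption added for illustration.

```python
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

def storage_bytes(items_per_day: int, bytes_per_item: int,
                  retention_days: int, replication: int = 1) -> int:
    """Total bytes stored over the retention period."""
    return items_per_day * bytes_per_item * retention_days * replication

# Hypothetical workload: 1M images/day at 300 KB each, kept for 5 years.
total = storage_bytes(1_000_000, 300 * KB, 365 * 5)
print(f"{total / TB:.1f} TB without replication")
print(f"{storage_bytes(1_000_000, 300 * KB, 365 * 5, replication=3) / TB:.1f} TB with 3 copies")
```

Under these assumptions the system accumulates about 300 GB per day, or roughly 550 TB over five years before replication.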

3. Bandwidth Estimation

Determines the network bandwidth needed to support the expected traffic and data transfer.
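
A rough sketch of the calculation: multiply the request rate by the payload size and convert bytes to bits. The workload numbers are hypothetical, and `bandwidth_mbps` is an illustrative helper name.

```python
def bandwidth_mbps(requests_per_second: float, bytes_per_request: float) -> float:
    """Required throughput in megabits per second (decimal megabits)."""
    bits_per_second = requests_per_second * bytes_per_request * 8
    return bits_per_second / 1e6

# Hypothetical workload: 12 RPS, each returning a 300 KB image.
print(f"{bandwidth_mbps(12, 300_000):.1f} Mbps")
```

Network links are quoted in bits per second while payloads are measured in bytes, so the factor of 8 is the easiest part of this estimate to forget.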

4. Latency Estimation

Predicts the response time and latency of the system based on its architecture and components.
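
One simple way to do this is to add up the latency numbers for each step on the request path. The sketch below assumes the steps run sequentially and compares two hypothetical paths, a read served from disk and one served from an in-memory cache, using values from the latency table.

```python
# Selected values from the latency table, in nanoseconds.
LATENCY_NS = {
    "datacenter_round_trip": 500_000,
    "disk_seek": 10_000_000,
    "read_1mb_disk": 30_000_000,
    "read_1mb_memory": 250_000,
}

def path_latency_ms(steps: list[str]) -> float:
    """Total latency in milliseconds for steps executed one after another."""
    return sum(LATENCY_NS[step] for step in steps) / 1e6

# Hypothetical request paths for fetching a 1 MB object.
from_disk = path_latency_ms(["datacenter_round_trip", "disk_seek", "read_1mb_disk"])
from_cache = path_latency_ms(["datacenter_round_trip", "read_1mb_memory"])
print(f"disk: {from_disk} ms, cache: {from_cache} ms")
```

With these figures the disk path costs about 40.5 ms and the cached path about 0.75 ms, so the estimate already suggests where a cache would pay off.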

5. Resource Estimation

Estimates the number of servers, instances, CPUs, or amount of memory required to handle the load and maintain the desired performance levels.
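
As a minimal sketch, the server count can be derived from the peak load and the measured capacity of one server, leaving some headroom. All numbers here are hypothetical, and the 70% target utilization is an assumption rather than a rule.

```python
import math

def servers_needed(peak_rps: float, rps_per_server: float,
                   target_utilization: float = 0.7) -> int:
    """Servers required so each one stays at or below the target utilization."""
    usable_rps = rps_per_server * target_utilization
    return math.ceil(peak_rps / usable_rps)

# Hypothetical: 5,000 RPS at peak, one server handles 500 RPS at full load.
print(servers_needed(5000, 500))
```

Rounding up matters here: a fractional server does not exist, and the headroom absorbs traffic spikes and the loss of a single machine.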
