Back-of-Envelope Estimation

Metrics

After clarifying the scope and limits of a problem in a system design interview, it is always a good idea to make a preliminary estimate of some metrics for our system. This will help us later when we focus on scaling, partitioning, load balancing, and caching.

Several metrics need to be determined first, and we can split them into two categories: writing and reading. The following metrics are to be estimated: QPS and bandwidth for both writing and reading, data storage for writing, and cache memory for reading.

There are also some common rules of thumb we can apply in our estimation, such as rounding the 3600 * 24 seconds in a day up to 100K.

Examples

With the prerequisite knowledge in place, let's practice with a simple example. A system has only two APIs, one for reading and one for writing, and it receives 500M write requests per month. Let's estimate the metrics for this system.

Writing QPS = 500M / (3600 s/hour * 24 hours/day * 30 days) ~= 500M / (100K * 30) ~= 200. Here we use a rough approximation for a common computation: 3600 * 24 ~= 100K.
Reading QPS = 200 * 100 ~= 20K, assuming reads outnumber writes by 100 to 1.
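The QPS arithmetic above can be checked with a short Python sketch. The names (`qps`, `READ_WRITE_RATIO`) are illustrative only, and the 100:1 read-to-write ratio is an assumption taken from the factor of 100 used above. The sketch compares the exact 86,400 seconds per day with the 100K shortcut.

```python
SECONDS_PER_DAY = 3600 * 24        # exact value: 86,400
ROUGH_SECONDS_PER_DAY = 100_000    # the shortcut used in the text
DAYS_PER_MONTH = 30
READ_WRITE_RATIO = 100             # assumption: 100 reads for every write


def qps(requests_per_month, seconds_per_day):
    """Average requests per second over a 30-day month."""
    return requests_per_month / (seconds_per_day * DAYS_PER_MONTH)


writes_per_month = 500_000_000

exact_write_qps = qps(writes_per_month, SECONDS_PER_DAY)
rough_write_qps = qps(writes_per_month, ROUGH_SECONDS_PER_DAY)
exact_read_qps = exact_write_qps * READ_WRITE_RATIO

print(f"write QPS: exact {exact_write_qps:.0f}, rough {rough_write_qps:.0f}")
print(f"read QPS:  exact {exact_read_qps:.0f}")
```

Both write figures land close enough to 200 for an interview estimate, which is why the shortcut is safe to use.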

Let’s assume each request carries 500 bytes of data.
Writing Bandwidth = 200 * 500 bytes ~= 100KB/s.
Reading Bandwidth = 20K * 500 bytes ~= 10MB/s.
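The bandwidth figures follow directly from QPS times request size. Below is a minimal sketch, assuming decimal units (1 KB = 1000 bytes) as the text does; the helper names `bandwidth_bytes_per_sec` and `human_readable` are hypothetical.

```python
REQUEST_SIZE_BYTES = 500   # assumed payload size per request
WRITE_QPS = 200
READ_QPS = 20_000


def bandwidth_bytes_per_sec(qps, request_size_bytes):
    """Bytes transferred per second at a given request rate."""
    return qps * request_size_bytes


def human_readable(num_bytes):
    """Format a byte count with decimal units (1 KB = 1000 bytes)."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if num_bytes < 1000:
            return f"{num_bytes:.0f} {unit}"
        num_bytes /= 1000
    return f"{num_bytes:.0f} PB"


write_bw = bandwidth_bytes_per_sec(WRITE_QPS, REQUEST_SIZE_BYTES)
read_bw = bandwidth_bytes_per_sec(READ_QPS, REQUEST_SIZE_BYTES)

print("write bandwidth:", human_readable(write_bw) + "/s")
print("read bandwidth: ", human_readable(read_bw) + "/s")
```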

Let’s assume the data expires after 5 years.
Data Storage = 500M * 12 months * 5 years * 500 bytes ~= 30G records * 500 bytes ~= 15TB
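The storage estimate is simply the total number of records written during the retention period multiplied by the record size. The sketch below reproduces it under the same assumptions (500-byte records, 5-year retention, decimal terabytes); the variable names are illustrative.

```python
WRITES_PER_MONTH = 500_000_000
MONTHS_PER_YEAR = 12
RETENTION_YEARS = 5          # assumed data lifetime
RECORD_SIZE_BYTES = 500      # assumed size of each stored record

# Total records kept before the oldest ones start expiring.
total_records = WRITES_PER_MONTH * MONTHS_PER_YEAR * RETENTION_YEARS

# Raw storage, ignoring replication, indexes, and metadata overhead.
storage_bytes = total_records * RECORD_SIZE_BYTES
storage_tb = storage_bytes / 1e12

print(f"records: {total_records:,}")
print(f"storage: {storage_tb:.0f} TB")
```

Note that this is raw data only; replication or index overhead would add to the figure.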

Let’s assume cache entries expire after 1 day and only the hot data, taken here as 20% of a day's read traffic, is cached.
Cache Memory = 20K * 3600 * 24 * 500 bytes * 0.2 ~= 20K * 100K * 100 bytes ~= 200GB
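The cache estimate can be sketched the same way. The 0.2 factor is read here as an assumed hot-data fraction (20% of one day's read traffic), and each cached entry is assumed to be 500 bytes. The names are illustrative.

```python
READ_QPS = 20_000
SECONDS_PER_DAY = 3600 * 24        # exact value: 86,400
ROUGH_SECONDS_PER_DAY = 100_000    # the shortcut used in the text
ENTRY_SIZE_BYTES = 500             # assumed size of each cached entry
HOT_FRACTION = 0.2                 # assumption: 20% of daily reads are hot


def cache_bytes(read_qps, seconds_per_day):
    """Memory needed to hold one day's worth of hot read data."""
    daily_reads = read_qps * seconds_per_day
    return daily_reads * ENTRY_SIZE_BYTES * HOT_FRACTION


exact_gb = cache_bytes(READ_QPS, SECONDS_PER_DAY) / 1e9
rough_gb = cache_bytes(READ_QPS, ROUGH_SECONDS_PER_DAY) / 1e9

print(f"cache memory: exact {exact_gb:.1f} GB, rough {rough_gb:.0f} GB")
```

The exact figure comes out a little under the rounded 200GB, so the shortcut errs on the safe side for capacity planning.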

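Here are all the critical metric estimates for our system:

Writing QPS: ~200
Reading QPS: ~20K
Writing Bandwidth: ~100KB/s
Reading Bandwidth: ~10MB/s
Data Storage: ~15TB
Cache Memory: ~200GB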

Wrap up

The same sequence of steps (QPS, bandwidth, storage, then cache memory) can be applied to almost any back-of-envelope estimate in system design.

Reference

  1. Grokking-System-Design