Understanding System Design and Horizontal vs Vertical Scaling

Feb 26, 2023

We frequently come across the term “System Design” while preparing for interviews. We’ll now learn what it is and why it’s important to understand.

System design is the process of defining the elements of a system such as the architecture, modules and components, the different interfaces of those components and the data that goes through that system.

To put it simply, system design is similar to putting together a puzzle. You have a lot of small pieces that must all fit together perfectly to form the final image. In the same way, a system is made up of many different parts that need to work together to make it function properly.

Assume you have a program on your computer that accepts input and returns the desired result. What do you do when a large number of people ask you for access to this program? You build an API and expose an endpoint. Now anyone can send a request to your computer and get a response. However, this setup raises new concerns: you may need a database, you have to configure the API endpoints, you have to think about load balancing, and you have to decide what happens in the event of a power outage. Because many people depend on your computer for this program, you cannot afford to have it go down. How can you tackle these things?
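To make "build an API and expose the endpoint" concrete, here is a minimal sketch using only Python's standard library. The `run_program` function is a hypothetical stand-in for your program (it simply reverses its input), and the port number and JSON response shape are assumptions chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_program(text: str) -> str:
    """Hypothetical stand-in for 'your program': reverses the input."""
    return text[::-1]


class ProgramHandler(BaseHTTPRequestHandler):
    """Accepts a POST body, runs the program on it, returns JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        payload = json.dumps({"result": run_program(body)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000) -> None:
    """Blocks forever, serving requests on the given port (assumed 8000)."""
    HTTPServer(("", port), ProgramHandler).serve_forever()
```

Calling `serve()` starts the endpoint on a single machine, which is exactly the setup the rest of this article tries to grow beyond.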

One answer is to host your service in the cloud. A reliable hosting provider takes care of power, hardware, and network availability, which helps keep the program accessible to users at all times.

If your computer program grows in popularity and more people try to access it, and you only host it on a single, insufficiently powerful server, it will take more time to handle all requests. This is the time to scale your systems.

There are two ways to increase the scale of your system:
1. Purchase a more robust computer.
2. Acquire additional computers.

The ability of a system to handle a growing load by adding resources in either of these ways is known as scalability.

You go ahead and purchase a more robust computer with more RAM, more disk space and a more powerful processor.

You’ve just acquired a more powerful machine. By throwing more money at the problem, you can handle more requests: a bigger machine has more resources and can therefore process each request faster. You are increasing the capacity of a single computer; this is known as vertical scaling, or scaling up.

You and your users are both happy now. Fortunately, or unfortunately, your program’s popularity grows, and the number of requests increases from 1 lakh (100,000) requests per second to 5 lakh (500,000) requests per second. Your newly purchased server is struggling to handle all requests, and you are considering scaling it up again! You have the funds and bandwidth to do so, but how long can you keep buying a larger server? There is a limit to how far a single machine can be scaled.

You begin to wonder if it is reasonable to handle so many requests and so much data on a single machine. You think of all the things that could go wrong with that machine, any one of which could take your service down and wipe out all your data.

This is what is known as Single point of failure (SPOF). The failure of a single component can result in the failure of the entire system.

Then you consider scaling it out by purchasing multiple cheaper machines to avoid a SPOF. You soon have 10 of these servers to handle the requests.

Each server you purchased can handle approximately 60k requests per second, so you need at least 9 of them to handle 5 lakh requests per second (500,000 / 60,000 ≈ 8.3, rounded up). You have ten of these servers and are confident that you can handle the current request load comfortably.
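The sizing arithmetic can be checked with a few lines of Python. This is a rough sketch that assumes each server really sustains 60,000 requests per second and that load spreads evenly across the pool; the function names are made up for illustration.

```python
import math


def servers_needed(load_rps: int, capacity_rps: int) -> int:
    """Smallest number of servers whose combined capacity covers the load."""
    return math.ceil(load_rps / capacity_rps)


def spare_capacity(servers: int, capacity_rps: int, load_rps: int) -> int:
    """Requests per second left over after serving the load (may be negative)."""
    return servers * capacity_rps - load_rps


LOAD = 500_000      # 5 lakh requests per second
PER_SERVER = 60_000  # assumed capacity of one server

print(servers_needed(LOAD, PER_SERVER))      # 9
print(spare_capacity(8, PER_SERVER, LOAD))   # -20000: eight servers fall short
print(spare_capacity(10, PER_SERVER, LOAD))  # 100000: ten servers leave headroom
```

Rounding up matters here: eight servers cover only 480,000 requests per second, so nine is the minimum. With ten servers, the pool can lose one machine and still cover the full load.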

By buying multiple machines, you’ve tackled both the capacity problem and the SPOF. When you scale by adding more machines to the pool of resources, it is known as horizontal scaling, or scaling out.

By scaling out this way, you can be sure that even if your program is no longer useful and you no longer require the servers, you can still sell the extra servers. Similarly, if your program receives even more requests, you can add more servers and keep adding them as needed. Even if one of the servers fails, another can take over. There is no longer a single point of failure. You have introduced Resilience in this manner.

System resilience is the system’s ability to withstand a major disruption and recover in a reasonable amount of time. In a pool of servers, this means that when one server fails, the others keep serving traffic while the failed one is repaired or replaced.
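Here is a small sketch of how a pool can keep serving when one server fails. It assumes a simple round-robin policy that skips servers marked as down; the `ServerPool` class and the server names are hypothetical, and a real load balancer would also run health checks to detect failures on its own.

```python
class ServerPool:
    """Round-robin pool that skips servers marked as unhealthy."""

    def __init__(self, names):
        self._order = list(names)
        self._healthy = {name: True for name in self._order}
        self._next = 0  # index of the next server to try

    def mark_down(self, name: str) -> None:
        self._healthy[name] = False

    def mark_up(self, name: str) -> None:
        self._healthy[name] = True

    def pick(self) -> str:
        """Return the next healthy server, trying each server at most once."""
        for _ in range(len(self._order)):
            name = self._order[self._next]
            self._next = (self._next + 1) % len(self._order)
            if self._healthy[name]:
                return name
        raise RuntimeError("no healthy servers left in the pool")


pool = ServerPool(["server-1", "server-2", "server-3"])
pool.mark_down("server-2")  # simulate a failure
print([pool.pick() for _ in range(4)])
# ['server-1', 'server-3', 'server-1', 'server-3']
```

Traffic keeps flowing through the remaining servers. Only when every server is down does the pool give up, which is the situation a single machine is in after its first failure.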

I hope you now have a basic understanding of system design, as well as the significance of horizontal and vertical scaling and how to scale according to business needs.

Thank you for reading and take care!

Written by Venkatesha Matam