What’s Better: Horizontal or Vertical Scaling for Effective Resource Management?

Auto scaling generally comes in two flavors: horizontal and vertical. Horizontal scaling means increasing the number of application instances and load balancing requests across them. Vertical scaling means giving the application more resources so it can work more effectively. Which one to choose depends entirely on your application's resource requirements.

In general, an application consumes the following resources:

1. RAM/Memory

2. CPU/Compute

3. Disk/Disk I/O

4. Network/Bandwidth

In most cases developers concentrate on the first two and ignore the other two. Yet disk and network are sometimes what make the real difference, and they help us choose between the two flavors of scaling.

Let's assume your system currently has 4 GB of RAM and is running slowly. More throughput is expected, so you want to scale the system and are considering the following three configurations:

1. Single instance with 8 GB memory

2. 2 instances of 4 GB RAM each

3. 4 instances of 2 GB RAM each

In all three cases, the total memory is the same. So which configuration do you think gives the best performance?

The answer depends on your application's algorithm. From a resource perspective, an application can be any of these types:

1. Memory intensive

2. Compute intensive

3. Disk I/O intensive

4. Network communication intensive

5. Combination of any of the above

Now let's try to understand what happens when we scale an application.

Horizontal scaling: What happens to these resources when we increase the number of instances? Let's assume the application requires around 1 GB of memory for its runtime to load, and you increase the number of instances from 1 to 4 (going from the 1st configuration to the 3rd in our list above, keeping the overall memory the same at 8 GB).

- The total runtime requirement changes from 1 GB (for a single instance) to 4 GB (to support 4 instances at 1 GB per instance).

- The single-instance configuration has 7 GB of memory to serve user requests. The scaled version has 1 GB per instance (4 GB in total) to serve user requests. This is not the same as one 4 GB pool: think of a single request that needs 1.5 GB of data in memory, which no 2 GB instance can hold.

- You may or may not get more CPU cycles, depending on how CPU allocation happens. On CF cloud, CPU is allocated as a percentage of the container's memory, so the overall CPU available for user requests actually goes down (the same logic as for memory: you now have to cover the CPU requirements of 4 runtimes versus 1).

- Disk I/O will actually improve, assuming all 4 nodes are now on different servers.

- Network I/O will also improve, assuming all 4 nodes are now on different servers.

So horizontal scaling gives us more disk I/O and network I/O at the cost of losing some memory and CPU (assuming we keep the same total memory allocation). If you have no problem allocating more memory, that loss can be compensated while still gaining I/O and network bandwidth.
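The memory arithmetic above can be sketched in a few lines of Python. This is only an illustration under the article's assumptions: the 1 GB runtime overhead comes from the example, while the `Config` class and its method names are hypothetical and not part of any platform API.

```python
from dataclasses import dataclass

# Assumption from the example in the text: each runtime needs ~1 GB just to load.
RUNTIME_OVERHEAD_GB = 1.0


@dataclass(frozen=True)
class Config:
    """One candidate deployment: N instances of equal size (hypothetical helper)."""
    instances: int
    memory_per_instance_gb: float

    @property
    def usable_per_instance_gb(self) -> float:
        """Memory left in one instance after the runtime has loaded."""
        return max(self.memory_per_instance_gb - RUNTIME_OVERHEAD_GB, 0.0)

    @property
    def total_usable_gb(self) -> float:
        """Memory left for user requests across all instances."""
        return self.instances * self.usable_per_instance_gb

    def fits_request(self, request_gb: float) -> bool:
        """A request must fit inside a single instance; it cannot span instances."""
        return request_gb <= self.usable_per_instance_gb


# The three configurations from the list above, all totalling 8 GB.
CONFIGS = [Config(1, 8.0), Config(2, 4.0), Config(4, 2.0)]

if __name__ == "__main__":
    for c in CONFIGS:
        print(
            f"{c.instances} x {c.memory_per_instance_gb} GB: "
            f"{c.total_usable_gb} GB usable in total, "
            f"{c.usable_per_instance_gb} GB per instance, "
            f"fits a 1.5 GB request: {c.fits_request(1.5)}"
        )
```

Under these assumptions the single 8 GB instance leaves 7 GB for requests, two 4 GB instances leave 6 GB, and four 2 GB instances leave 4 GB. Only the last setup cannot hold a 1.5 GB request in any one instance.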

With vertical scaling, this is what happens:

- The total application runtime requirement stays the same, so all the added memory goes to serving user requests.

- The CPU requirement for the runtime also does not change.

- Disk I/O stays the same, so you may need to support more user requests with less disk I/O available per request.

- Network I/O stays the same, so you may need to support more user requests with less network I/O available per request.
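To make the per-request I/O point concrete, here is a rough sketch. It assumes each server has a fixed I/O bandwidth, that load is spread evenly, and that every horizontally scaled node sits on a different server. The 200 MB/s figure and the function name `io_per_request` are made up for illustration, not measured values or a real API.

```python
def io_per_request(bandwidth_per_server: float, servers: int,
                   concurrent_requests: int) -> float:
    """Average I/O bandwidth available to each request.

    Assumes evenly balanced load and one node per server, so total
    bandwidth grows linearly with the number of servers.
    """
    if servers < 1 or concurrent_requests < 1:
        raise ValueError("servers and concurrent_requests must be at least 1")
    return bandwidth_per_server * servers / concurrent_requests


# Hypothetical numbers: one server offers 200 MB/s of disk (or network) I/O.
baseline = io_per_request(200.0, servers=1, concurrent_requests=10)

# Vertical scaling: same server, 4x the requests, so each request gets less I/O.
vertical = io_per_request(200.0, servers=1, concurrent_requests=40)

# Horizontal scaling: 4 servers share the same 4x load.
horizontal = io_per_request(200.0, servers=4, concurrent_requests=40)

if __name__ == "__main__":
    print(baseline, vertical, horizontal)
```

With these illustrative numbers the per-request share drops from 20 MB/s to 5 MB/s when scaling vertically, while four separate servers keep it at 20 MB/s.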

In summary:

1. If your application is disk I/O or network intensive, consider horizontal scaling.

2. If your application is memory or compute intensive, consider vertical scaling.

NOTE: This discussion is limited to resource cost and optimization when choosing a scaling approach. There are many more considerations, such as session management and threading, that should also inform the decision.
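The two rules above could be encoded as a small lookup, shown here as a sketch only. The `Bottleneck` enum and `suggest_scaling` function are hypothetical names, and returning "both" for a mixed workload is an assumption of this example; the rules above do not cover that case.

```python
from enum import Enum
from typing import Iterable


class Bottleneck(Enum):
    MEMORY = "memory"
    COMPUTE = "compute"
    DISK_IO = "disk_io"
    NETWORK = "network"


# Rule 1: I/O or network intensive -> horizontal scaling.
_HORIZONTAL = {Bottleneck.DISK_IO, Bottleneck.NETWORK}
# Rule 2: memory or compute intensive -> vertical scaling.
_VERTICAL = {Bottleneck.MEMORY, Bottleneck.COMPUTE}


def suggest_scaling(bottlenecks: Iterable[Bottleneck]) -> str:
    """Return 'horizontal', 'vertical', or 'both' for the given bottlenecks."""
    found = set(bottlenecks)
    if not found:
        raise ValueError("at least one bottleneck is required")
    wants_horizontal = bool(found & _HORIZONTAL)
    wants_vertical = bool(found & _VERTICAL)
    if wants_horizontal and wants_vertical:
        # Assumption: a mixed workload may need both kinds of scaling.
        return "both"
    return "horizontal" if wants_horizontal else "vertical"


if __name__ == "__main__":
    print(suggest_scaling([Bottleneck.DISK_IO]))
    print(suggest_scaling([Bottleneck.MEMORY, Bottleneck.COMPUTE]))
    print(suggest_scaling([Bottleneck.NETWORK, Bottleneck.MEMORY]))
```

A helper like this only captures the resource view; it would still need to be weighed against the other factors that affect the decision.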

A software Engineer, SRE Lead, DB administrator and performance