Auto Scaling in CF-based PaaS Cloud — Myths vs. Reality

Manoj K Sardana
4 min read · Nov 20, 2020

Recently, while exploring the best way to scale our CF-based application on IBM Cloud, we busted one of our beliefs about the auto-scaling capabilities of CF containers.

Belief:

The cloud provides you the flexibility to scale your application independently along every resource dimension.

So if this is not true, what is the reality? Here is how the story unfolds.

Recently, before onboarding one of our clients onto our solution, we were running a few stress tests. Unfortunately, the results were not promising enough to share with the client. Investigation proceeded on many fronts, from tuning SQL and the code base to optimizing API calls, but none of it gave us the desired result. This motivated us to look at the things we had been taking for granted because we use a PaaS environment: the various resources and their consumption.

1. Memory

2. Compute Power/CPU Cycles

3. Network bandwidth

4. Disk Storage and I/O capacities

We use IBM Cloud Foundry based deployments. In practice we had only been looking at memory consumption, as that was the only resource exposed to us for monitoring and scaling, but we saw nothing unusual on that front: memory consumption was well below the threshold. This raised the question of looking at the other resource types.

As memory was clearly under control, investigation moved to compute and network bandwidth. Since there was no direct way to see their stats from the IBM Cloud interface for the application, we took the help of our monitoring tool, Dynatrace. Unfortunately, Dynatrace showed us the host CPU utilization but not that of the Diego container, and the host was showing spikes at the times of our runs.

This raised several questions: do these spikes in host CPU mean the container is CPU-starved? How is container CPU decided in CF, what percentage of system CPU gets dedicated to a container, and how can I scale the CPU? Do I really have any control over scaling compute at all?

When we dug further into the CF documentation, we found that there is no way to scale CPU directly: CF allocates CPU cycles based on the percentage of host memory allocated to the container. Under the hood it uses the Linux CPU-shares mechanism to decide what share of CPU the container's processes should get.
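To make the scaling surface concrete, here is a minimal CF manifest sketch (the app name and values are hypothetical). The only capacity knobs it exposes are memory, disk, and instance count; there is no field to request CPU cores or cycles:

```yaml
# manifest.yml — a minimal sketch; app name and sizes are hypothetical.
# Capacity is expressed only in memory, disk quota, and instance count.
# There is no attribute for CPU.
applications:
- name: my-app
  memory: 4G
  disk_quota: 2G
  instances: 3
```

The `cf scale` CLI command mirrors the same surface: it accepts memory (`-m`), disk (`-k`), and instance count (`-i`), but no CPU flag.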

Here is the link to the CF documentation explaining this:

https://docs.cloudfoundry.org/concepts/container-security.html#cpu

For example, if the host has 40 GB of RAM and your container has 4 GB allocated, the container is entitled to 10% of the CPU. This is problematic because the actual compute you get always depends on the host's memory-to-core ratio, not on what the application needs (the application has no way to declare its CPU requirements). Say you have two hosts, one with 40 GB of RAM and 8 CPU cores, the other with 20 GB of RAM and 8 cores: your container will actually get more compute on the second host than on the first, even though its memory allocation is the same. This memory-to-core ratio may vary from one cloud vendor to another, so you may not get the same behavior when you move your application from one CF vendor to another.
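The arithmetic above can be sketched in a few lines (using the numbers from the example; the function name is mine):

```python
def cpu_entitlement(container_mem_gb, host_mem_gb, host_cores):
    """CPU (in cores) a container is entitled to when cycles are
    divided in proportion to its slice of host memory."""
    share = container_mem_gb / host_mem_gb
    return share * host_cores

# The same 4 GB container on two hosts with identical core counts:
host_a = cpu_entitlement(4, 40, 8)  # 10% of 8 cores -> 0.8 cores
host_b = cpu_entitlement(4, 20, 8)  # 20% of 8 cores -> 1.6 cores
print(host_a, host_b)
```

Nothing about the application changed between the two hosts, yet its CPU entitlement doubled, purely because the second host has half the RAM per core.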

Now, what impact does this have on the application? Suppose my application is very compute-intensive but uses very little memory: it needs the full cycles of 2 cores but only 1 GB of RAM to perform well. On a host with 8 cores and 40 GB of RAM, to get 2 full cores you need to allocate at least 10 GB of RAM. Since only 1 GB is really needed, you are paying for 9 GB of RAM unnecessarily. The mirror scenario happens with a memory-intensive application: you end up paying for unnecessary compute.
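Working the same numbers backwards shows the over-allocation cost directly (again a sketch; the helper is hypothetical):

```python
def memory_needed_for_cores(wanted_cores, host_cores, host_mem_gb):
    """Memory you must allocate so that the memory-proportional CPU
    share equals the number of cores you actually want."""
    return (wanted_cores / host_cores) * host_mem_gb

needed_gb = memory_needed_for_cores(2, 8, 40)  # 2 of 8 cores -> 10 GB
actually_used_gb = 1                           # what the app really needs
wasted_gb = needed_gb - actually_used_gb       # 9 GB paid for but unused
print(needed_gb, wasted_gb)
```

In this scenario, 90% of the memory bill exists only to buy CPU time.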

So is the belief that the user has full control to scale any resource at will true? I don't think so anymore, at least in the case of CF.

A similar situation exists for the other resource types, disk I/O and network bandwidth. As per my understanding, these are shared among all the containers on a host, with nothing dedicated per container, so a noisy neighbor can easily ruin your performance, and it is in the vendor's control how much bandwidth is attached to a host.

Once we understood that there is no way to control CPU independently, we had to make sure our application was written to balance memory and CPU. Investigation turned to reducing CPU usage: in many places we started saving results in memory instead of computing them at runtime, several algorithms were rewritten to simplify their logic, cache usage was increased, and so on. This quickly showed impact: CPU usage dropped and performance improved.
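The "save the result instead of recomputing" change amounts to trading cheap memory for expensive CPU. A minimal sketch of the pattern (the function and its workload are hypothetical, not our actual code):

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def expensive_report(key):
    # Stand-in for a CPU-heavy computation; hypothetical workload.
    return sum(i * i for i in range(100_000)) + len(key)

expensive_report("q1")  # first call: computed
expensive_report("q1")  # repeat call: served from memory
print(expensive_report.cache_info().hits)
```

Each cached entry costs a little memory, which in a CPU-shares-by-memory world is exactly the resource we were over-provisioned on anyway.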

At the end of this exercise, a few questions arise:

1. Why can't CF externalize the CPU requirement to the application?

2. Do we have any technological restriction here?

3. Is there any other way, unknown to me, to make sure we get more CPU when needed?

Feel free to provide your feedback if you think my learning has some gap.

