I've read multiple articles about best practices for setting the {Min/Max/Initial}RAMPercentage values for Java applications in cloud environments. The purpose of the Min and Max RAM percentage flags is clear to me. On the other hand, I can neither understand nor find the reason behind the recommendations (1st, 2nd and 3rd) to set -XX:InitialRAMPercentage and -XX:MaxRAMPercentage to the same value.
Therefore I would like to understand:

1. Why should I set InitialRAMPercentage at all? My assumption is that it will be scaled up automatically during the application lifetime (because the default value lies around 1.5 percent) and that not setting it will not cause any problems.
2. Why should I set InitialRAMPercentage directly to the same value as MaxRAMPercentage?

It is not a hard and fast rule, and whether you should do this depends a lot on your application.
Setting them the same will mean that your application will always use the same heap size, so memory use will (ignoring fluctuations in native memory usage) be stable. This means that you can generally configure your pod quotas (i.e. limits.memory and requests.memory) easily, and thus have stable costs. It also reduces the chance of your application getting killed if the node needs to make resources available to another pod (which can happen if you exceed your (memory) requests, even if you're still within your limits).
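As a rough illustration of that "pinned heap" approach, a container spec fragment could look like the sketch below; the 2Gi size, the 75% value and the names are assumed example values, not a recommendation:

```yaml
# Hypothetical container spec fragment: requests == limits, and the heap is
# pinned by giving Initial and Max the same percentage of that memory.
containers:
  - name: my-app            # assumed name
    image: my-app:latest    # assumed image
    resources:
      requests:
        memory: "2Gi"
      limits:
        memory: "2Gi"
    env:
      - name: JAVA_TOOL_OPTIONS
        value: "-XX:InitialRAMPercentage=75.0 -XX:MaxRAMPercentage=75.0"
```

Keeping the percentage below 100 leaves headroom within the same limit for non-heap memory such as metaspace, thread stacks and direct buffers.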
On the other hand, if your application has a varying workload and thus varying memory needs, setting them differently may reduce costs if you use a garbage collector that returns memory to the OS. For example, at a previous job, we were able to save quite some money for one application by configuring requests.memory and -XX:InitialRAMPercentage at a certain minimum need, while setting limits.memory and -XX:MaxRAMPercentage a lot higher for bursty processing needs. The downside of this was that it increased the chance of our pod(s) being chosen to be rescheduled to a different node if memory pressure increased, but in our situation that was an acceptable trade-off.
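A sketch of that second shape, with purely illustrative numbers rather than the values we actually used, might look like this:

```yaml
# Hypothetical container spec fragment for a bursty workload: the heap starts
# near the baseline need but is allowed to grow well beyond it, and the pod's
# requests/limits follow the same split.
containers:
  - name: my-app            # assumed name
    image: my-app:latest    # assumed image
    resources:
      requests:
        memory: "1Gi"       # baseline need
      limits:
        memory: "4Gi"       # burst headroom
    env:
      - name: JAVA_TOOL_OPTIONS
        # 18.75% of the 4Gi limit is 768 MiB, i.e. roughly 75% of the 1Gi
        # request; 75% of the limit caps the heap at 3 GiB. A GC that
        # returns unused memory to the OS is what makes this pay off.
        value: "-XX:InitialRAMPercentage=18.75 -XX:MaxRAMPercentage=75.0"
```

Note that both percentage flags are interpreted against the same total (the container's memory limit when container support is active), which is why the initial percentage here is much smaller than the maximum one.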
To be clear, whether this actually saves money also depends on your actual Kubernetes cluster configuration and on whether nodes are spun up and down depending on total load. However, in our case we saved money on our team budget, because we were virtually billed for the usage of our applications, and given the size of our company and the dynamic scaling of the number of pods and nodes, it probably also saved money at the company level.
In addition, setting -XX:InitialRAMPercentage and -XX:MaxRAMPercentage may take away some flexibility in the heuristic configuration of the garbage collector, which may mean you have to do more explicit tuning yourself to get the best performance.
Also, -XX:InitialRAMPercentage only specifies the percentage of memory that the heap will have when you start your application, so the setting itself cannot "be scaled up automatically during the application lifetime"; it is the heap that grows from that initial size towards the maximum, and that growth is driven by garbage collection. Its default is 1.562500%, which is quite low. If your application needs a lot of memory, then setting it to its actual needs avoids a number of full GC cycles during startup, and thus setting this correctly reduces the startup time of your application.
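If you want to check what the JVM actually derived from these percentages, one simple option (my own sketch, not something from the linked articles) is to print the committed and maximum heap at startup:

```java
// Minimal sketch: prints the heap committed at startup (roughly the initial
// heap derived from -XX:InitialRAMPercentage) and the maximum heap (derived
// from -XX:MaxRAMPercentage), so you can verify the flags took effect.
public class HeapCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.println("Committed heap at startup: " + rt.totalMemory() / mb + " MiB");
        System.out.println("Maximum heap:              " + rt.maxMemory() / mb + " MiB");
    }
}
```

Running it with, say, -XX:InitialRAMPercentage=50.0 -XX:MaxRAMPercentage=50.0 in a container with a 2 GiB limit should report both values around 1 GiB (the exact numbers can differ slightly due to alignment and GC bookkeeping).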