What Does Memory Ballooning Mean?
Memory ballooning is a memory management feature found in most virtualization platforms. It lets a host reclaim unused memory that was previously allocated to its virtual machines (VMs), which effectively enlarges the pool of memory the host can hand out.
This is achieved through a balloon driver installed in each guest operating system. The hypervisor communicates with this driver whenever it needs to reclaim memory.
Techopedia Explains Memory Ballooning
Through memory ballooning, a host server can reclaim unused memory from less busy virtual machines and reassign it to those that need it more. In theory, a server with 32 GB of physical memory might support virtual machines whose combined memory allocation is 64 GB, because those virtual machines are unlikely to all use their maximum assigned memory at the same time.
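The 32 GB and 64 GB figures above can be worked through as a small calculation. The following is only an illustrative sketch: the function names, the eight-VM layout and the per-VM usage numbers are assumptions made up for the example, not values taken from any particular hypervisor.

```python
def overcommit_ratio(host_gb, vm_allocations_gb):
    """Ratio of memory promised to VMs to the host's physical memory."""
    return sum(vm_allocations_gb) / host_gb


def fits_in_host(host_gb, vm_active_gb):
    """True if the memory the VMs actively use fits in physical memory."""
    return sum(vm_active_gb) <= host_gb


host_gb = 32
allocations_gb = [8] * 8                 # hypothetical: eight VMs, 8 GB each (64 GB promised)
active_gb = [3, 2, 5, 4, 1, 6, 2, 3]     # hypothetical actual usage, 26 GB in total

print(overcommit_ratio(host_gb, allocations_gb))  # 2.0
print(fits_in_host(host_gb, active_gb))           # True
print(fits_in_host(host_gb, allocations_gb))      # False: all VMs at maximum would not fit
```

The host is overcommitted by a factor of two, yet it works as long as actual usage stays below 32 GB. The last line shows the case the paragraph warns about: if every VM used its full allocation at once, the host could not satisfy them all.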
The balloon driver in each guest operating system tracks that VM's excess memory. When the hypervisor calls for memory reclamation, the balloon driver pins down a specific amount of memory inside the VM so that the guest cannot use it, and the hypervisor then reclaims that pinned memory for reallocation. If the guest does not have enough unused memory, it may have to swap memory out to disk in order to meet the balloon quota. If this happens too often, the swapping VMs generate heavy disk I/O, which can degrade the overall performance of the virtualized system.
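The reclamation sequence described above can be modeled with a toy simulation. This is a simplified sketch under stated assumptions, not how any real balloon driver is implemented: the GuestVM class, its fields and the inflate method are hypothetical names, memory is counted in whole megabytes, and swapping is reduced to a counter.

```python
from dataclasses import dataclass


@dataclass
class GuestVM:
    name: str
    allocated_mb: int     # memory the guest was assigned
    used_mb: int          # memory the guest's workloads currently occupy
    balloon_mb: int = 0   # memory pinned by the balloon driver
    swapped_mb: int = 0   # guest memory pushed out to disk

    def free_mb(self) -> int:
        """Memory the guest could still use without swapping."""
        return self.allocated_mb - self.used_mb - self.balloon_mb

    def inflate(self, target_mb: int) -> int:
        """Pin up to target_mb for the hypervisor and return the amount pinned.

        If the guest lacks enough free memory, the shortfall is swapped out.
        """
        target_mb = min(target_mb, self.allocated_mb - self.balloon_mb)
        shortfall = max(0, target_mb - self.free_mb())
        self.used_mb -= shortfall      # in-use memory leaves RAM...
        self.swapped_mb += shortfall   # ...and lands on disk
        self.balloon_mb += target_mb
        return target_mb


idle = GuestVM("idle-vm", allocated_mb=4096, used_mb=1024)
busy = GuestVM("busy-vm", allocated_mb=4096, used_mb=3584)

reclaimed = idle.inflate(2048) + busy.inflate(1024)
print(reclaimed)         # 3072 MB returned to the hypervisor
print(idle.swapped_mb)   # 0: the idle guest had enough free memory
print(busy.swapped_mb)   # 512: the busy guest had to swap to meet the quota
```

The idle guest gives up memory at no cost, while the busy guest can only meet its quota by swapping 512 MB to disk, which is the source of the I/O overhead mentioned above.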
The obvious benefit is that a host can support more VMs, provided that most of them do not consume their full memory allocation most of the time. In a system where most VMs are busy and consume most of their allocated memory, however, ballooning may degrade performance. This highlights the importance of sufficient memory capacity for any computer system.