Last week I received a very interesting question from a fellow blogger asking whether it was possible to "pause" (not suspend) a virtual machine running on ESXi. Today ESXi only supports the suspend operation which saves the current memory state of a virtual machine to disk. With a "pause" operation, the memory state of the virtual machine is not saved to disk, it is still preserved in the physical memory of the ESXi host. The main difference with a "pause" operation is the allocated memory is not released and this allows you to quickly resume a virtual machine almost instantly at the cost of holding onto physical memory.
The use case for this particular request was also quite interesting. The user had an NFS server that housed about 200 virtual machines that needed to be restarted and the goal was to minimize the impact to his virtual machines as much as possible. He opted out from suspending the virtual machines as it would have taken too long and decided on a more creative solution. He filled up the remainder capacity on the datastore which in effect caused all virtual machines to halt their I/O operations. Though not an ideal solution IMHO, this allowed him to restart the NFS server and then run a script for the virtual machines to retry their I/O operation once the NFS server was available again.
Based on the above scenario, he asked if it was possible to "pause" the virtual machines similar to a capability Hyper-V provides today which would have provided him a quicker way to resume the virtual machines. Thinking about the question for a bit, a virtual machine is just a VMX process running in ESXi and I wondered if this process could be paused like a UNIX/Linux process using the "kill" command. Well, it turns out, it can be!
Disclaimer: This is not officially supported by VMware, use at your own risk.
Using the kill command, you can pause the VMX process by sending the STOP signal and to resume the VMX process, you can send the CONT signal. Before getting started, you will need to identify the PID (Process ID) for the virtual machine's VMX process.
There are two methods of identifying the parent VMX PID, the easiest is using the following ESXCLI command:
esxcli vm process list
The PID for the virtual machine will be listed under the "VMX Cartel ID" and in this example I have a virtual machine called vcenter51-1 and on the right I am pinging the system to verify it is up and running. An alternative way of identifying the PID is to use "ps" by running the following command:
ps -c | grep -v grep | grep [vmname]
Note: Make sure you identify the parent PID of the virtual machine if you are using the above command as you will see multiple entries for the different VMX sub-processes.
To pause the VMX process, run the following command (substitute your PID):
kill -STOP [pid]
To resume VMX process, run the following command:
kill -CONT [pid]
Here is a screenshot of pausing and then resuming the virtual machine. You can also see where the pings stop as the virtual machine is paused and then resumed. Once the virtual machine was resumed, it operated exactly where it left off with no issues as far as I can tell.
Note: I have found that if you have VM monitoring enabled, there maybe issues resuming the virtual machine. This should only be done if you have VM monitoring disabled as it may not be properly aware that the VMX process being paused on purpose.
Though it is possible to pause a virtual machine, I am not sure I see too many valid use cases for this feature? Are there are use cases where this feature would actually be beneficial, feel free to leave a comment if you believe there are. For now, this is just another neat "notsupported" trick 😉