Host-based replication (HBR) is a new feature in the upcoming SRM 5.0 that gives users the ability to replicate VMs between dissimilar storage. Traditionally, SRM mainly relied on array-based replication to back up and recover virtual machines residing on a set of LUNs, which required all protected virtual machines to live on a common set of protected LUNs. With HBR, you can now target a specific VM and its VMDKs and replicate them to a different storage type at the destination, such as local storage, an iSCSI/FC LUN or an NFS datastore.
Another key difference is that HBR does not leverage array replication technology but something analogous to CBT (Changed Block Tracking): the initial copy is a full copy and all subsequent copies are differentials. The frequency of these differential copies is determined solely by the RPO configured by the user.
Now that we have some background on what HBR is and how it relates to the new Site Recovery Manager, let's talk about some of the "limited" automation options. As it stands today, there is no publicly exposed SDK from VMware that can be consumed from the various toolkits such as vSphere SDK for Perl, PowerCLI, VI Java, etc. To configure a VM for replication using the new HBR functionality, you still need to go through the vSphere Client wizard manually by right-clicking on a VM and selecting the "Site Recovery Manager HBR Replication" option.
Once you have the initial configuration set for a given virtual machine, some limited functionality is exposed through the vimsh interface using vim-cmd. A new "hbrsvc" namespace has been added, which provides a few options for making configuration and state changes to a VM under HBR management.
~ # vim-cmd hbrsvc
Commands available under hbrsvc/:
Note: This is probably not officially supported by VMware. Please test it in a development or lab environment before using it.
If you have used the vim-cmd interface before, you should be familiar with how the options work. Since these commands operate on a virtual machine, you will need to know the virtual machine's VmId for all of them.
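If you only know the VM's display name, the VmId can be looked up from the output of vim-cmd vmsvc/getallvms. The following is a minimal sketch, not an official method: the function name vmid_from_listing is made up for this example, and it assumes the usual getallvms layout of a header row followed by one row per VM, with the VmId in the first column and the name in the second.

```shell
# Sketch: resolve a VM's VmId from its display name.
# Assumption: the listing has a header row, then one row per VM whose first
# column is the numeric VmId and whose second column is the display name.
# Display names containing spaces are not handled by this simple match.
vmid_from_listing() {
    # $1 = VM display name; the getallvms listing is read from stdin
    awk -v name="$1" 'NR > 1 && $2 == name { print $1; exit }'
}
```

On a host this would be used as: vim-cmd vmsvc/getallvms | vmid_from_listing <vm-name>. Reading the listing from stdin keeps the function easy to try out against saved output before running it on a live host.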
To retrieve the HBR configuration for a particular VM, you will use the vmreplica.getConfig option:
Here you can see all the configuration that was made through the GUI, such as the RPO, quiescing of the guest OS and the VMDKs configured for replication. You also get some additional information, such as the HBR server and its configured port, and some important identifiers such as the "VM Replication ID" and "Replication ID". These two identifiers will be very important later on if you want to make use of the other commands.
To retrieve the state of a given VM, you will use the vmreplica.getState option:
This will give you the current state of replication and, if replication is still in progress, how far along it is. You get not only the progress but also the amount of data transferred to the destination site.
To retrieve the current replication state of a VM, you will use the vmreplica.queryReplicationState option:
This is a fairly straightforward command that returns only the replication state and the progress, both as a percentage and as the amount of data transferred to the destination site.
To pause replication just like you can using the vSphere Client, you will use the vmreplica.pause option:
To resume replication just like you can using the vSphere Client, you will use the vmreplica.resume option:
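For scripting, the pause and resume calls can be wrapped in a small helper that checks its input first. This is only a sketch under assumptions: hbr_state_cmd and the HBR_DRY_RUN variable are names invented for this example, and the only thing taken from the commands above is the form vim-cmd hbrsvc/vmreplica.<option> <VmId>. By default it only prints the command it would run.

```shell
# Sketch: validate input, then print (or run) a pause/resume command.
# hbr_state_cmd and HBR_DRY_RUN are hypothetical names for this example.
hbr_state_cmd() {
    action="$1"
    vmid="$2"
    # Only the two state changes covered here are accepted.
    case "$action" in
        pause|resume) ;;
        *) echo "unsupported action: $action" >&2; return 1 ;;
    esac
    # The VmId must be a non-empty string of digits.
    case "$vmid" in
        ''|*[!0-9]*) echo "VmId must be numeric: $vmid" >&2; return 1 ;;
    esac
    if [ "${HBR_DRY_RUN:-1}" = "1" ]; then
        # Dry run (the default): show the command instead of executing it.
        echo "vim-cmd hbrsvc/vmreplica.$action $vmid"
    else
        vim-cmd "hbrsvc/vmreplica.$action" "$vmid"
    fi
}
```

Calling hbr_state_cmd pause <VmId> prints the command; setting HBR_DRY_RUN=0 makes it actually invoke vim-cmd on the host. Defaulting to a dry run is a deliberate choice given that these commands are not officially supported.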
To disable replication for a VM, you will use the vmreplica.disable option:
Note: Before attempting to disable replication for a VM, it is extremely important to write down the two identifiers mentioned earlier: "VM Replication ID" and "Replication ID". When you re-enable replication, you will need to specify these IDs; otherwise your VM will be left in a bad state and the only way to recover is to re-enable replication using the vSphere Client.
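One way to avoid losing those identifiers is to save the getConfig output to a file before disabling anything. The helper below is a rough sketch: save_hbr_config and the file naming scheme are invented for this example, and it reads the configuration from stdin rather than calling vim-cmd itself, so it makes no assumption about the output format.

```shell
# Sketch: keep a copy of the getConfig output before disabling replication.
# save_hbr_config and the file name pattern are hypothetical choices.
save_hbr_config() {
    vmid="$1"
    dir="${2:-/tmp}"
    out="$dir/hbr-config-vm$vmid.txt"
    # Read the getConfig output from stdin and store it verbatim.
    cat > "$out" || return 1
    # Refuse to continue on an empty capture; the IDs would be lost.
    [ -s "$out" ] || { echo "empty config for VmId $vmid" >&2; return 1; }
    # Print the path of the saved file for the caller.
    echo "$out"
}
```

A possible use is: vim-cmd hbrsvc/vmreplica.getConfig <VmId> | save_hbr_config <VmId> <dir> && vim-cmd hbrsvc/vmreplica.disable <VmId>, so the disable step only runs if a non-empty copy of the configuration was written. Pick a directory that survives a reboot if you need the file later.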
To re-enable replication for a VM that was disabled, you will use the vmreplica.enable option:
You will need to specify several parameters: the VmId, RPO, Destination HBR Server + Port, Enable Quiesce for guestOS, Enable Opportunistic Updates, VM Replication ID and Disk Replication ID. All of these can be found by running getConfig before disabling replication for the VM.
To manually force a replication sync, you will use the vmreplica.sync option:
You also have the ability to change some of the configurations for a VM for replication using the vmreplica.reconfig option:
Currently this is limited to the RPO, Destination HBR Server + Port, and enabling Quiesce guestOS and Opportunistic Updates. In the example above, you can see the RPO window has been updated to 10 minutes, and we can confirm this from the vSphere Client. You will notice that the sync now happens roughly every 10 minutes, but the RPO shown in the SRM interface is not updated. This may be a UI bug, or the modification may not be pushed up to the HBR servers.
Note: Per the vSphere Client and SRM/HBR documentation, the smallest RPO window is 15 minutes, but I have found that you can actually go smaller. Again, use this at your own risk.
I was also interested to see whether I could shrink the RPO window even further, to say 1 minute. There were no errors and the ESXi tasks actually confirmed the change.
However, after making the change and monitoring the next syncs, I noticed they did not actually run every minute but anywhere from 6 to 11 minutes apart, which seems to be the smallest effective RPO window.
You can also disable replication for a particular VMDK by using the vmreplica.diskDisable option:
To re-enable replication for a particular VMDK, you will use the vmreplica.diskEnable option:
As mentioned earlier, there are no official SDKs from VMware for SRM, but the options provided by hbrsvc come from a hidden HBR API found on the ESXi 5.0 host. You can see the new "ha-hbr-manager" using the vSphere MOB. Though you cannot fully automate the initial configuration of HBR for a VM, you can automate reconfiguration or state changes for a VM if you need to.
Note: I have never played with SRM prior to vSphere 5, but I also found WSDL files for what looks to be an SRM API under the following URLs: http://[SRM-HOST]:8096/sdk/srm and http://[SRM-HOST]:8096/sdk/drService. One could create SDK bindings using the WSDL files, but I will leave that as a task for the reader.
There is also one additional HBR utility in the ESXi Shell of ESXi 5.0, hbrfilterctl, which provides some information about the disks being replicated by HBR.
~ # hbrfilterctl
Ioctl to device is working.
ba : Print the active replication bitmap the the specified disk.
bt : Print the inactive replication bitmap the the specified disk.
pr : Print the disk length, bitmap length and extent for the secified disk.
ts : Extract and transfer a light-weight delta for the specified disk.
li : Returns the File ID, Number of entries, copy index and size of the demand
log for the specified disk.
si : Returns information about the full-sync process
de : Detaches a filter attach for offline replication
fs : Force a full sync of the specified disk.
stats : Returns stats for all (but at most ) groups.
The first two options are pretty verbose, as they print the bitmaps of the specified disk. If you are interested, you can run them yourself to see the output.
Here is an example of running the "pr" option:
Here is an example of running the "li" option:
Here is an example of running the "si" option:
The last option, "stats", is probably the only really useful command for users. It provides the status of replication, and specifying a number limits the output. Here is an example:
Nice post William, is it possible to set the HBR without SRM but only with vim-cmd commands ?
As mentioned in the beginning of the post, HBR is part of SRM and you will need to have SRM available to even use HBR functionality. You also need to perform the initial configuration via SRM UI before even attempting vim-cmd's, else they won't do anything but give you errors.
Great stuff as always William, thanks for the great content. Quick question: I was wondering if you've heard or found anything about new APIs for SRM 5. I use the current APIs to initiate Recovery Plans and the like using the community-provided PowerShell cmdlets on MSDN of all places, and was just wondering if any new functionality is exposed.
There's still no "official" API as stated in the blog post, but there have been some updates in SRM5, I'll provide a separate post with the details
Ân Nguyễn says
Thanks for this article. I was able to use the option vim-cmd vmsvc/getallvms and then use vim-cmd hbrsvc/vmreplica.disable vmid to disable the failed replications and then re-enable it using vSphere Client. Without this command, I was having MISSING PLACEHOLDER error every time I tried to re-enable a failed replication setup.
David Espejo says
You absolutely made my day with this post. Thanks
VG is definitely my favorite blog on vmware, so much useful information - thanks !
These commands are useful, however vSphere Replication seems to have a mind of its own.
I've set RPOs for 24h, however it kicks off at pretty random times of the day - sometimes during production hours and other times during the evening. These commands are great for getting status information during the sync, but how do you get information of the RPO time and how can you script the pausing and resuming of these ?
@Ân Nguyễn: Thanks for the tip for the MISSING PLACEHOLDER issue. I encountered that on a VM we tried to migrate but had to roll back because the network configuration wasn't set properly and our outage window was nearly expired before our engineer could figure out how to make the changes in our HP bladesystem environment.
Great post but I am having issues running the pause and resume commands when ssh'd into a host managed by a vcenter as although an event occurs - pause/resume replication of virtual machine (initiated by root) - they don't appear to work. If I manual pause/resume the job in vcenter an additional event occurs saying: pause/resume virtual machine replication (recent tasks on host say initiated by vpxuser); and this then works as it should. So is there an additional command I need to run with vsphere replication 5.5?
William Lam says
Since these commands are not officially supported, their behavior is YMMV. I've not played with them in more detail, so I can't comment on why you're seeing this behavior.
Great post! Is it possible to automate also the first sync with the vmreplica.enable command?
Federico Fortini says
Hello, a very useful list of commands, especially the "configChange". I've successfully created a script to change the RPO in bulk, because it's a very tedious operation in the web client. Unfortunately, changes made directly from the CLI are not reflected in the web client, because the VR database is not updated accordingly. You have to update it yourself.
Thanks for the article.
Has anyone seen issues with GUI not getting updated?
I ran vim-cmd hbrsvc/vmreplica.pause and it was successful.
~ # vim-cmd hbrsvc/vmreplica.queryReplicationState
Querying VM running replication state:
Current replication state:
But GUI never changed, the replication status remained "success" I waited and also did refresh but no luck