Host-based replication (HBR) is a new feature in the upcoming SRM 5.0 that gives users the ability to replicate VMs between dissimilar storage. Traditionally, SRM relied mainly on array-based replication to back up and recover virtual machines residing on a set of LUN(s). This required all virtual machines being protected to reside on a common set of protected LUN(s). With HBR, you can now target a specific VM and its respective VMDK(s) and back them up to a different storage type at the destination, such as local storage, an iSCSI/FC LUN or an NFS datastore.
Another key difference is that HBR does not leverage array replication technology but something analogous to CBT (Changed Block Tracking): the initial backup is a full copy and all subsequent copies are differentials. The frequency of these differential backups is based solely on the RPO configured by the user.
Now that we have some background on what HBR is and how it relates to the new Site Recovery Manager, let's talk about some of the "limited" automation options. As it stands today, there is no publicly exposed SDK from VMware that can be consumed from the various toolkits such as the vSphere SDK for Perl, PowerCLI, VI Java, etc. To configure a VM to be backed up using the new HBR functionality, you will still need to go through the vSphere Client wizard manually by right-clicking on a VM and selecting the "Site Recovery Manager HBR Replication" option.
Once you have the initial configuration set for a given virtual machine, some limited functionality is exposed through the vimsh interface using vim-cmd. A new "hbrsvc" namespace has been added, which provides a few options for making configuration and state changes to a VM under HBR management.
~ # vim-cmd hbrsvc
Commands available under hbrsvc/:
vmreplica.abort vmreplica.pause
vmreplica.create vmreplica.queryReplicationState
vmreplica.disable vmreplica.reconfig
vmreplica.diskDisable vmreplica.resume
vmreplica.diskEnable vmreplica.startOfflineInstance
vmreplica.enable vmreplica.stopOfflineInstance
vmreplica.getConfig vmreplica.sync
vmreplica.getState
Note: This is probably not officially supported by VMware. Please test this in a development or lab environment before using it.
If you have used the vim-cmd interface, you should be pretty familiar with how the options work. Since these commands apply to a virtual machine, you will need to know the virtual machine's VmId for all of them.
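If you do not have the VmId handy, it can be pulled from the output of vim-cmd vmsvc/getallvms. Here is a minimal sketch of a lookup helper, assuming the usual column layout where the first column is the Vmid and the second is the VM name. The function name vmid_for_name and the VM name web01 are made up for illustration, and VM names containing spaces are not handled.

```shell
# vmid_for_name NAME
# Hypothetical helper: reads "vim-cmd vmsvc/getallvms"-style text on stdin
# and prints the Vmid (column 1) of the first row whose name (column 2)
# matches NAME exactly. The header row is skipped.
# Assumption: the VM name contains no whitespace.
vmid_for_name() {
    awk -v name="$1" 'NR > 1 && $2 == name { print $1; exit }'
}

# Usage on an ESXi host (web01 is a placeholder VM name):
#   vim-cmd vmsvc/getallvms | vmid_for_name web01
```

Nothing is printed if no VM matches, so it is worth checking for an empty result before passing the value to any of the hbrsvc commands.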
To retrieve the HBR configuration for a particular VM, you will use the vmreplica.getConfig option:
Here you can see all the configuration that was made through the GUI, such as the RPO, quiescing of the guest OS and the VMDK(s) configured for replication. You also get some additional information, such as the HBR server and its configured port, and some important identifiers such as the "VM Replication ID" and "Replication ID". These two identifiers will be very important later on if you want to make use of the other commands.
To retrieve the state of a given VM, you will use the vmreplica.getState option:
This will provide you with the current state of replication and, if replication is still in progress, how far along it is. You get not only the progress but also the amount of data transferred to the destination site.
To retrieve the current replication state of a VM, you will use the vmreplica.queryReplicationState option:
This is a pretty straightforward command that returns only the details of the replication state and the progress, both as a percentage and as the amount of data transferred to the destination site.
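Since the state commands only need a VmId, they are easy to wrap in a small polling loop. The following is a rough sketch rather than anything official: the function name poll_replication, the VIM_CMD variable and the default interval and count are my own choices, and it assumes the command is invoked as vim-cmd hbrsvc/vmreplica.queryReplicationState followed by the VmId.

```shell
# Command used to reach vimsh; overridable so the loop can be dry-run
# with a stub instead of the real vim-cmd.
VIM_CMD=${VIM_CMD:-vim-cmd}

# poll_replication VMID [INTERVAL_SECONDS] [COUNT]
# Hypothetical helper: prints a timestamp and the replication state of
# one VM COUNT times, sleeping INTERVAL_SECONDS between polls.
poll_replication() (
    vmid=$1
    interval=${2:-60}
    count=${3:-5}
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "== $(date) =="
        # Assumption: the command takes the VmId as its only argument.
        $VIM_CMD hbrsvc/vmreplica.queryReplicationState "$vmid"
        i=$((i + 1))
        # Sleep between polls, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
)

# Usage on an ESXi host: poll VmId 12 every 30 seconds, 10 times.
#   poll_replication 12 30 10
```

The loop is bounded on purpose so that it does not run forever if it is left in a shell session by accident.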
To pause replication just like you can using the vSphere Client, you will use the vmreplica.pause option:
To resume replication just like you can using the vSphere Client, you will use the vmreplica.resume option:
To disable replication for a VM, you will use the vmreplica.disable option:
Note: Before attempting to disable replication for a VM, it is extremely important to write down the two identifiers mentioned earlier: "VM Replication ID" and "Replication ID". The reason is that when you re-enable replication, you will need to specify these IDs; otherwise your VM will be left in a bad state and the only way to recover is to re-enable replication using the vSphere Client.
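One way to make sure those identifiers are never lost is to save the getConfig output to a file before disabling anything. This is only a sketch under a few assumptions: both commands are assumed to take the VmId as their only argument, and the function name save_config_then_disable, the VIM_CMD variable and the backup path are mine, not part of hbrsvc.

```shell
# Command used to reach vimsh; overridable so the function can be
# exercised with a stub instead of the real vim-cmd.
VIM_CMD=${VIM_CMD:-vim-cmd}

# save_config_then_disable VMID BACKUP_FILE
# Hypothetical helper: writes the HBR configuration of the VM (including
# the replication IDs) to BACKUP_FILE and only then disables replication.
# Returns non-zero without disabling if the configuration cannot be saved.
save_config_then_disable() (
    vmid=$1
    backup=$2
    # Assumption: getConfig takes the VmId as its only argument.
    if ! $VIM_CMD hbrsvc/vmreplica.getConfig "$vmid" > "$backup"; then
        echo "getConfig failed for VmId $vmid, not disabling" >&2
        exit 1
    fi
    # An empty file means there is nothing to restore from later.
    if [ ! -s "$backup" ]; then
        echo "getConfig returned no output for VmId $vmid, not disabling" >&2
        exit 1
    fi
    # Assumption: disable also takes the VmId as its only argument.
    $VIM_CMD hbrsvc/vmreplica.disable "$vmid"
)

# Usage on an ESXi host (the path is a placeholder):
#   save_config_then_disable 12 /tmp/hbr-config-12.txt
```

Keep the saved file somewhere that survives a reboot if you plan to re-enable replication later, since /tmp on ESXi does not.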
To re-enable replication for a VM that was disabled, you will use the vmreplica.enable option:
You will need to specify a few parameters: the VmId, RPO, Destination HBR Server + Port, Enable Quiesce for guestOS, Enable Opportunistic Updates, VM Replication ID and Disk Replication ID. All of these can be found by running getConfig prior to disabling replication for a given VM.
To manually force a replication sync, you will use the vmreplica.sync option:
You also have the ability to change some of the configurations for a VM for replication using the vmreplica.reconfig option:
Currently this is limited to the RPO, Destination HBR Server + Port, and enabling Quiesce guestOS and Opportunistic Updates. In the example above, you can see the RPO window has been updated to 10 minutes, and we can confirm this from the vSphere Client. You will notice that the sync happens roughly every 10 minutes, but the RPO reflected in the SRM interface is not updated. This may be a UI bug, or the modification may not be pushed up to the HBR servers.
Note: Per the vSphere Client and SRM/HBR documentation, the smallest RPO window is 15 minutes, but I have found that you can actually go smaller. Again, use this at your own risk.
I was also interested to see if I could shrink the RPO window even further, to say 1 minute. There were no errors and the ESXi tasks actually confirmed the change.
However, after making the change and monitoring the next sync, I noticed it did not actually run every minute but anywhere from every 6 to 11 minutes, which seems to be the smallest effective RPO window.
You can also disable replication for a particular VMDK by using the vmreplica.diskDisable option:
To re-enable replication for a particular VMDK, you will use the vmreplica.diskEnable option:
As mentioned earlier, there are no official SDKs from VMware for SRM, but the options provided by hbrsvc come from a hidden HBR API found on the ESXi 5.0 host. You can see the new "ha-hbr-manager" using the vSphere MOB. Though you cannot fully automate the configuration of HBR for a given VM, you can automate the reconfiguration or state changes for a VM if you need to.
Note: I have never played with SRM prior to vSphere 5, but I also found WSDL files for what looks to be the SRM API under the following URLs: http://[SRM-HOST]:8096/sdk/srm and http://[SRM-HOST]:8096/sdk/drService. One could create SDK bindings using the WSDL files, but I will leave that as a task for the reader.
There is also one additional HBR utility in the ESXi Shell of ESXi 5.0, hbrfilterctl, which provides some information about the disks being replicated by HBR.
~ # hbrfilterctl
Ioctl to device is working.
Usage: hbrfilterctl
Commands:
ba : Print the active replication bitmap the the specified disk.
bt : Print the inactive replication bitmap the the specified disk.
pr : Print the disk length, bitmap length and extent for the secified disk.
ts []: Extract and transfer a light-weight delta for the specified disk.
li : Returns the File ID, Number of entries, copy index and size of the demand log for the specified disk.
si : Returns information about the full-sync process
de : Detaches a filter attach for offline replication
fs : Force a full sync of the specified disk.
stats : Returns stats for all (but at most ) groups.
The first two options are pretty verbose, as they print the bitmaps of the specified disk. If you are interested, you can run them to see the output.
Here is an example of running the "pr" option:
Here is an example of running the "li" option:
Here is an example of running the "si" option:
The last option, "stats", is probably the only really useful command, at least for users. It provides the status of replication, and specifying a number limits the output. Here is an example: