As the adoption of vSphere Content Library continues to grow, I am seeing more questions from our field and customers around content distribution. In case you did not know, vSphere Content Library (CL, as I will be referring to it going forward) has its own built-in native replication mechanism which allows customers to easily publish and subscribe to libraries, either within a single vCenter Server instance or between two completely different vCenter Servers (regardless of deployment topology and/or SSO Domain configuration).
Content distribution, or replication, is handled by CL, which is a service within vCenter Server. If content is being replicated within a single vCenter Server and the ESXi hosts can communicate with each other, then a direct host-to-host transfer is used, also referred to as Network File Copy (NFC), rather than going through vCenter Server. When content is transferred between two vCenter Servers, the data travels through vCenter Server using standard HTTPS (443) by default. In the latter scenario, if you have configured Enhanced Linked Mode for your vCenter Servers, then NFC will be used unless the ESXi hosts cannot communicate with each other, in which case it will automatically fall back to the default HTTPS, which is pretty cool.
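To make the rules above easier to reason about, here is a small Python sketch of the decision as I have described it. It is only a model of the prose, not how CL is implemented internally; the function and enum names are made up for illustration, and the single vCenter Server case where hosts cannot reach each other is my assumption, since the text above does not spell it out.

```python
from enum import Enum


class Transfer(Enum):
    NFC = "host-to-host copy (NFC)"
    HTTPS = "streamed through vCenter Server over HTTPS (443)"


def choose_transfer(same_vcenter: bool, linked_mode: bool, hosts_reachable: bool) -> Transfer:
    """Return the transfer path described above for a publisher/subscriber pair."""
    # NFC is only attempted when both sides sit under one vCenter Server
    # or the two vCenter Servers are joined in Enhanced Linked Mode.
    nfc_eligible = same_vcenter or linked_mode
    # Assumption: hosts that cannot reach each other always end up on HTTPS,
    # including the single vCenter Server case.
    if nfc_eligible and hosts_reachable:
        return Transfer.NFC
    return Transfer.HTTPS


# Two separate vCenter Servers without Enhanced Linked Mode.
print(choose_transfer(same_vcenter=False, linked_mode=False, hosts_reachable=True).value)
```

Flipping linked_mode to True in the call above moves the same pair onto NFC, as long as the hosts can still talk to each other.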
One thing that may not be very well known is that customers actually have a choice in how their CL content is replicated. In addition to native replication, which currently does not support incremental/delta updates (meaning all file transfers are full copies), CL can also support external replication. In fact, many customers today already have existing methods for efficiently replicating large amounts of data across multiple datacenters, whether that is replication built into their storage arrays, network appliances or some other means. These customers can still benefit from CL while continuing to take advantage of their existing methods of replication.
So how does it work? When you create a published library, there is an associated set of metadata that describes the content itself and its location within the underlying storage system. This metadata is stored internally within CL and is used to communicate to other subscriber CLs on what content to synchronize and make available in their respective vCenter Server. This all happens transparently between a publisher and subscriber CL without any user involvement as one would expect. If you just copied the underlying CL files without this additional metadata, when you go and subscribe to the published CL, it will have no idea about these existing files and simply download the content again.
To prevent this "double" copy for externally replicated CLs, there is an advanced library setting called persist_json_enabled, which can only be configured using the Content Library REST API, that persists and stores the metadata we talked about earlier. With both the content and the metadata files available when the subscriber CL is created, we are effectively performing a zero copy of the data, since we already have the content and can make it available for use immediately. To demonstrate this and some other useful CL APIs, I have updated my Content Library PowerCLI Module to include some new functions to aid in setting up an externally replicated CL.
Let's now make this more concrete by walking through an example. Below is a screenshot of VC1 (vcenter65-1), which has a published CL (VC1-ContentLibrary) stored on a datastore (iSCSI-01). I will configure it to support external replication so that I can have the exact same content residing on VC2 (vcenter65-3) with a subscribed CL (VC2-ContentLibrary), without having the CLs transfer any data between the two. You will need to have the latest PowerCLI release installed if you wish to make use of my CL PowerCLI module (the Content Library API can also be accessed through a variety of vSphere Automation SDKs).
As mentioned earlier, the persist_json_enabled property is only available when using the CL REST API, so I have enhanced my Get-ContentLibrary function to include a bunch more useful information including this property (JSONPersistence) as shown in the screenshot below.
If we now log in to an ESXi host which has access to the underlying storage of the CL, we can see the CL layout, which includes a unique ID for each item that is uploaded to the CL. If we go inside one of the directories, we can see the actual file items as you would expect.
You can enable persistence of the JSON metadata files either when creating a new CL, by using the New-LocalContentLibrary function and passing in the -JSONPersistence $true option, or on an existing CL, by using the Set-ContentLibrary function. To do so, first log in to the CIS API endpoint by using the Connect-CiSServer cmdlet and then run one of the following commands:
Here is an example of enabling the setting:
Set-ContentLibrary -LibraryName VC1-ContentLibrary -JSONPersistenceEnabled
Here is an example of disabling the setting:
Set-ContentLibrary -LibraryName VC1-ContentLibrary -JSONPersistenceDisabled
Note: Enabling or disabling JSON persistence merely stores or deletes the metadata files. It has no impact on CL usage and can be done while the CL is in use. It should also be noted that a CL configured with JSON persistence can continue to work with standard subscribed CLs. There is no impact to making the CL available through the traditional method, which is also really nice.
If we now take a look at our storage system again, we should see several JSON files that have now been created which reflect the current CL metadata. These files will automatically be updated based on changes made within CL itself.
At this point, you are ready to "replicate" your CL to your remote location. As mentioned earlier, this can be done through a variety of tools such as native array replication or even something as simple as rsync or SCP, which is what I used for demonstration purposes. When duplicating the CL content directory, you can rename the top level directory from contentlib-[UUID] to anything you want, but make sure to leave all other directory and file names alone. Once you have completed replicating the CL, you can disconnect from the CIS API endpoint of your source vCenter Server and connect to your destination CIS API endpoint using the Connect-CiSServer cmdlet again. You will also need to connect to the vCenter Server using the Connect-VIServer cmdlet, which is needed to perform an ID lookup of the datastore you wish to create the new CL on.
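For those not using PowerCLI, the following Python sketch shows roughly what the underlying REST call looks like. It only builds and prints the request rather than sending it. The endpoint path and the update_spec/publish_info body shape reflect my understanding of the vSphere 6.5 REST API and should be treated as assumptions to verify against the API reference; the helper name and the library ID placeholder are purely illustrative.

```python
import json


def build_json_persistence_request(vcenter: str, library_id: str, enabled: bool) -> dict:
    """Describe the HTTP request that toggles persist_json_enabled on a local library."""
    return {
        "method": "PATCH",
        # Assumed endpoint shape for the vSphere 6.5 REST API.
        "url": f"https://{vcenter}/rest/com/vmware/content/local-library/id:{library_id}",
        # Assumed body shape: only the field being changed is sent.
        "body": {"update_spec": {"publish_info": {"persist_json_enabled": enabled}}},
    }


# "<library-id>" is a placeholder for the ID of the published library.
request = build_json_persistence_request("vcenter65-1", "<library-id>", True)
print(json.dumps(request, indent=2))
```

A real call would also need an authenticated API session, which is left out here to keep the sketch focused on the one property that matters.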
In my environment, I have copied the content to another datastore (iSCSI-02) which you can see from the screenshot below. This datastore is also being managed by a different vCenter Server (vcenter65-3) than the publisher CL and I have also renamed the top level replicated CL to myExtReplicatedContentLibrary. If you decide to rename the top level directory, please make a note of this as you will need this for later when creating your new subscriber CL.
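If you would rather script the copy than run rsync or SCP by hand, the rule to remember is that only the top level directory may be renamed. The Python sketch below illustrates that rule on throwaway local directories. The directory and file names (contentlib-example, item-1, lib.json, disk.vmdk) are placeholders standing in for a real CL layout, and a real replication would of course go across sites rather than between two local folders.

```python
import shutil
import tempfile
from pathlib import Path


def relative_files(root: Path) -> set:
    """Return every file path under root, relative to root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def replicate_library(source: Path, target_parent: Path, new_name: str) -> Path:
    """Copy a library directory under a new top-level name, leaving inner names untouched."""
    target = target_parent / new_name
    shutil.copytree(source, target)
    # Everything below the top level must match the source exactly.
    if relative_files(source) != relative_files(target):
        raise RuntimeError("replica layout differs from the source")
    return target


# Build a throwaway stand-in for a published library (all names are placeholders).
workdir = Path(tempfile.mkdtemp())
source = workdir / "iSCSI-01" / "contentlib-example"
(source / "item-1").mkdir(parents=True)
(source / "lib.json").write_text("{}")
(source / "item-1" / "disk.vmdk").write_text("fake disk")

(workdir / "iSCSI-02").mkdir()
replica = replicate_library(source, workdir / "iSCSI-02", "myExtReplicatedContentLibrary")
print(replica.name, sorted(relative_files(replica)))
```

The layout check after the copy is the useful part: whichever tool you use, it is worth confirming that the JSON metadata files came across along with the content.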
Next, to consume our externally replicated CL, we need to create a new subscriber CL with the new New-ExtReplicatedContentLibrary function, as shown in the following example:
New-ExtReplicatedContentLibrary -LibraryName VC2-ContentLibrary -DatastoreName iSCSI-02 -SubscribeLibraryName myExtReplicatedContentLibrary
The function is pretty straightforward: you simply provide the name of the new CL, the datastore to which the CL has been replicated and the directory name of the library you replicated earlier. If everything was successful, you should now have a new CL that is subscribing to the replicated content that you copied over earlier. You can think of it as a kind of loopback mount, and no data is actually being sent across the wire between the two vCenter Servers. Pretty cool, huh!?
If we now run the Get-ContentLibrary function, we should see our new subscribed CL and you will notice the subscribed URL is actually a datastore reference rather than a URL, which is what we expect for consuming an externally replicated CL.
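As the notes at the end of this post mention, the "ds://" prefix on the subscription URL is currently the only way to tell an externally replicated CL apart from a regular subscribed one. If you are processing library details in a script, the check is a one-liner. Here is a minimal Python sketch; the function name and the sample URLs are made up for illustration.

```python
def is_externally_replicated(subscription_url: str) -> bool:
    """True when the subscription URL is a datastore reference rather than an HTTPS URL."""
    return subscription_url.lower().startswith("ds://")


# Placeholder URLs, not taken from a real environment.
for url in ("ds:///path/to/replicated/library", "https://vc.example.com/path/to/library"):
    kind = "externally replicated" if is_externally_replicated(url) else "regular subscribed"
    print(f"{url} -> {kind}")
```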
If we log in to our vSphere Web/H5 Client, we can also verify that we see the new CL and the content that was externally replicated, and you can start consuming this CL and its content immediately!
As you can see, the process is fairly straightforward and customers can now take advantage of their existing replication tools to easily distribute CL content across multiple datacenters.
Note1: Currently there is no way to distinguish between a regular subscribed CL and an externally replicated CL other than the subscription URL, which has a URI prefix of "ds://". This is only visible using either the CL REST API or the vSphere Web (Flex) Client, as the H5 Client does not currently display the subscription URL. Hopefully a future H5 update will include this useful bit of information on the source of the subscription.
Note2: When deleting an externally replicated CL using either the UI or API, the CL will be removed but the actual content on the filesystem will still persist. To delete the files, you will need to go to the datastore view and then delete the top level directory of the CL.
Note3: In case it was not apparent, when consuming an externally replicated CL, it is the customer's responsibility to ensure that both the content and the JSON metadata files are synchronized on some periodic schedule so that the subscribed CLs will pick up any changes made to the source published CL.
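Since keeping the two copies in sync is on you, it can be handy to have a quick way to spot drift between the source and the replica before or after a scheduled replication run. The following Python sketch compares two directory trees by relative path and SHA-256 hash. It is a generic file comparison and not something CL provides; the function names and file names are placeholders, and it assumes both directory trees are reachable from wherever the script runs.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map each file's path relative to root to the SHA-256 of its contents."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def find_drift(source: Path, replica: Path) -> dict:
    """Report files that are missing, stale, or left over on the replica."""
    src, dst = snapshot(source), snapshot(replica)
    return {
        "missing": sorted(src.keys() - dst.keys()),
        "stale": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
        "extra": sorted(dst.keys() - src.keys()),
    }


# Tiny demonstration with throwaway directories (file names are placeholders).
base = Path(tempfile.mkdtemp())
source, replica = base / "source", base / "replica"
source.mkdir()
replica.mkdir()
(source / "lib.json").write_text('{"version": 2}')
(replica / "lib.json").write_text('{"version": 1}')
(source / "new-item.ovf").write_text("new")

drift = find_drift(source, replica)
print(drift)
```

For real libraries with large disk files you would want to hash in chunks rather than reading each file into memory, or simply compare sizes and timestamps as rsync does by default.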
Additional Resources:
- Content Library Technical Deep Dive @ VMworld
- The Content Library PowerCLI module also includes three other useful functions: Remove-SubscribedContentLibrary, Remove-LocalContentLibrary & Copy-ContentLibrary that may be worth checking out for automation purposes
- Content Library Developer Blog Series
SiliconBrian says
Is there a requirement that the replicated target datastore be writable?
i.e. could it be a read-only NFS mount?
William Lam says
If you're asking whether the "source" (Publisher) can be read-only ... if you setup a 3rd Party Content Library (https://www.williamlam.com/2015/06/creating-your-own-3rd-party-content-library-for-vsphere-6-0-vcloud-director-5-x.html), then you can. However, if you're going through vCenter Server and creating that source library, then it needs to be writeable for you to get content into Content Library 🙂
Tim K. says
What about a CL that is replicated across the datacenters in one VC? Say you have a v6.5 VCSA that manages multiple DCs' across the US, Mexico and Canada. You have standard images you want to deploy but you do not want the WAN bandwidth at either location sucked up every time you deploy a VM. Is there a way to have a "Templates" LUN/NFS/iSCSI mount at all the remote sites that you want to replicate the main CL to? Once the initial replication is done (hopefully to be scheduled for off hours) you would be all set to go.
Thank you.
William Lam says
Tim,
Great question, yes this is possible w/CL. Please have a look at the VMworld Technical Deep Dive session linked in the Additional Resource section of the post 🙂
James McEwan says
I'm in the process of upgrading my vCenter from 6.0 to 6.5 (not using the migration tool due to a topology change). Is it possible to import / discover the existing content library in the new vCenter, without copying the data? Note that the datastore backing the content library will be the same in both vCenters.
Andreas Cederlund says
The link "Content Library Technical Deep Dive @ VMworld" does not seem to be working properly anymore 🙁
William Lam says
Just fixed. You can find all past VMworld sessions on VMworld site itself.
Andreas Cederlund says
Excellent, thanks! Will take a look at it, I have on my agenda to set up CL with replication between our different vCenters and storage systems. One quick question - do you know if the hosts also need access to the NFS share that VCSA has access to?
William Lam says
Not exactly sure what you mean by that ... if you're talking about Ext replication, this is all done outside of VC and ESXi.
Andreas Cederlund says
When setting up a CL, you can choose to use NFS/SMB or a VMFS datastore. If using NFS, will the hosts need access to the same NFS share, or will the content be "streamed" via the vCenter when access?
Avi says
Can CL be used for sharing vibs?
Isaac says
Is there or will there be a way to reverse the relationship of the subscribe content library to a publish library and vice versa?
thanks
Spiros says
Hello William,
We want to do the same thing as "Tim K." wrote above but we cannot find the video : "Content Library Technical Deep Dive @ VMworld" from your link. Can you please paste the code of the video (each video has code in the title) ?
Thank you very much
PS: Always enjoy your articles!
Ronno says
I also have the same question as Tim K and Spiros above! i.e one vcenter and many ESXi hosts in different datacenters. Thanks!!!!
Eric Lee says
If anyone is wondering this setup also works when using a single datastore across multiple Content Libraries. I'm pointing all Content Libraries to the same folder.
Marek says
I know this article is fairly old, but I'm still having difficulty finding out how the CL traffic flows in my scenario.
Let's say I have 2 vmkernels in esxi with different Ips let's say IP1 and IP2. Which IP will be used by CL replication process?
William Lam says
That's going to depend on which transfer method is being used. There are two ways data can transfer for CL: "Streaming" (HTTP) or "Copy" (NFC). The latter is host-to-host communication and is the most efficient, since vCenter is not in the datapath. The former goes through vCenter Server, which connects to the host using the address that was used to add it to inventory.
This article is specific about external replication (e.g. handled outside of vCenter and ESXi)