Creating a vSphere Content Library directly on Amazon S3

A few years back, I blogged about creating your own 3rd Party vSphere Content Library, which lets customers use storage backings other than vSphere Datastores. The primary requirement was that the content endpoint be accessible over HTTP(S), which meant that a number of solutions could be used, from a simple web server like Nginx to an advanced distributed object store like Amazon S3.

The workflow to create a 3rd Party vSphere Content Library on S3 is fairly straightforward. Here is a high-level summary:

Organize the content on a local system (desktop)
Run a Python script to index and generate the Content Library metadata
Upload the Content Library to S3
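
To make the upload step concrete, here is a minimal sketch of how a locally organized library folder could be pushed to S3. It is not part of the indexing script mentioned above; the helper names `plan_uploads` and `upload_all`, as well as the folder, bucket and prefix names in the comments, are hypothetical. It assumes the metadata has already been generated in the local folder and that `s3_client` is a boto3 S3 client (a third-party package).

```python
import os


def plan_uploads(local_root, key_prefix=""):
    """Return sorted (local_path, s3_key) pairs for every file under local_root."""
    prefix = key_prefix.strip("/")
    plan = []
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in filenames:
            local_path = os.path.join(dirpath, name)
            # S3 keys always use forward slashes, whatever the local OS uses.
            rel = os.path.relpath(local_path, local_root).replace(os.sep, "/")
            key = f"{prefix}/{rel}" if prefix else rel
            plan.append((local_path, key))
    return sorted(plan, key=lambda pair: pair[1])


def upload_all(s3_client, bucket, plan):
    """Upload each planned file and return how many files were sent.

    s3_client is assumed to be a boto3 S3 client, whose upload_file
    method takes (Filename, Bucket, Key).
    """
    for local_path, key in plan:
        s3_client.upload_file(local_path, bucket, key)
    return len(plan)


# Example use (hypothetical names; needs boto3 and AWS credentials):
#   import boto3
#   plan = plan_uploads("./my-library", "my-library")
#   upload_all(boto3.client("s3"), "my-content-library-bucket", plan)
```

Separating the planning from the upload makes it easy to print and review the key layout before anything is sent to the bucket.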

A disadvantage of the above solution is that each time you need to update or remove content, the entire process has to be repeated, including re-uploading the changes. Not only is this time-consuming from an operational standpoint, but you also need to keep a full copy of all the content locally, which can be several hundred gigabytes, if not more.

This topic was recently brought back up by Gilles Chekroun, an SE in our Networking and Security Business Unit, who reached out to see if there was a solution to help his customer who was running into this challenge. Over the last couple of weeks, I have been working with both Gilles and Eric Cao (Content Library Engineer) on how we could enhance the existing Python script, which indexes and generates the Content Library metadata, to also support running directly against an Amazon S3 bucket.

A huge thanks to Eric for the script enhancements. The new version of the script, which you can download here, can now index content both locally and in a remote S3 bucket. There is a really neat side effect of this enhancement for our VMware Cloud on AWS (VMC) customers, which I think is quite interesting. In case you did not know, S3 usage (ingress/egress) from a customer's SDDC is 100% free for VMC customers by simply using an S3 endpoint linked to your VPC. This means you can take advantage of S3 to store your templates, ISOs and other static files, which can also be shared by other SDDCs. It also means you are not consuming any of your primary storage for static content, so it can be used for what it was meant for. This is pretty cool if you ask me!

The new workflow to create a 3rd Party vSphere Content Library on S3 is as follows:

Upload and organize the content on S3
Run a Python script to index and generate the Content Library metadata
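
To illustrate what indexing content remotely involves, here is a rough sketch that lists the objects in a bucket and groups them by folder without downloading anything. This is not the actual script and does not produce Content Library metadata. The helper names `iter_objects` and `group_by_item` are hypothetical, and treating each first-level folder as one library item is an assumption made for the example. `s3_client` is again assumed to be a boto3 S3 client.

```python
from collections import defaultdict


def iter_objects(s3_client, bucket, prefix=""):
    """Yield (key, size) for every object under prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]


def group_by_item(objects, prefix=""):
    """Group (key, size) pairs by the first folder below prefix.

    Assumption for this sketch: each first-level folder is one library
    item. Files sitting directly under prefix are ignored.
    """
    base = prefix.strip("/")
    base = base + "/" if base else ""
    items = defaultdict(list)
    for key, size in objects:
        if not key.startswith(base) or key.endswith("/"):
            continue  # outside the library, or a zero-byte "folder" placeholder
        folder, sep, filename = key[len(base):].partition("/")
        if sep and filename:
            items[folder].append((filename, size))
    return dict(items)


# Example use (hypothetical names; needs boto3 and AWS credentials):
#   import boto3
#   objects = iter_objects(boto3.client("s3"), "my-content-library-bucket", "my-library/")
#   for item, files in group_by_item(objects, "my-library").items():
#       print(item, files)
```

Because only object listings are read, a pass like this stays cheap even when the bucket holds hundreds of gigabytes of templates and ISOs.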

With the ability to now remotely index and generate the Content Library metadata files, you no longer have to keep a local copy of all your content. All changes can be made directly on the S3 bucket, and then you simply re-run the script to generate the updated metadata, which can even be scheduled as a simple cron job. Gilles also did a nice write-up here, which walks you step by step from S3 bucket creation, including permissions, to running the script and then consuming the 3rd Party vSphere Content Library in VMC. I definitely recommend a read if you are using this for the first time, whether you are a VMC customer or not.

I think this is just the first step toward some really interesting innovations that we can drive with vSphere Content Library by taking advantage of solutions like Amazon S3. In fact, this is quite timely, as Jon Kensy, a VMware customer, recently published an article here sharing his own thoughts on what native S3 support from vSphere Content Library could look like and its benefits for on-premises as well as VMware customers. What do you think?

Comments

José Manuel Hernández says

08/31/2018 at 4:03 am

Thanks for the article, helpful for AWS and VMware customers.

Jason Friedrich says

08/10/2020 at 6:31 pm

Hi William,

as always, great content. I was wondering, is there a way to add a password to the S3 Content Library? I have configured an S3 bucket with basic auth, but I can't specify a user name when I try to subscribe. Is there a default username that is used? Or is it a different mechanism than basic auth (our TAM told me in an email exchange it was basic auth).

Best,
Jason

- Jason Friedrich says
  
  08/10/2020 at 7:20 pm
  
  I can answer my own question. Yes, it does use basic auth, but the username is fixed to "vcsp". I will write a blog post with the details, but I now have my own password-protected Content Library on an S3 bucket! Awesome! 🥳
  
  - Rasmus Haslund says
    
    09/24/2020 at 9:36 pm
    
    Did you make the post? 🙂
    
  - Narasimha Murthy Gangaiah says
    
    12/01/2021 at 11:07 am
    
    That's cool. Not sure why VMware did not make the username configurable. Configuring an S3 bucket with CloudFront for vcsp user access only should provide some basic access security.
