A few years back, I blogged about creating your own 3rd Party vSphere Content Library, which enables customers to take advantage of storage backings other than just vSphere Datastores. The primary requirement is that the content endpoint be accessible over HTTP(S), which means a number of solutions can be used, from a simple web server like Nginx to a distributed object store like Amazon S3.
The workflow to create a 3rd Party vSphere Content Library on S3 is fairly straightforward; here is a high-level summary:
- Organize the content on a local system (desktop)
- Run a Python script to index the content and generate the Content Library metadata
- Upload the Content Library to S3 (see the upload sketch after this list)
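For those who prefer to script the last step, here is a minimal Python sketch using boto3 that walks a local Content Library folder and uploads each file to an S3 bucket while preserving the folder structure. The bucket name and local path are placeholders, and this is only an illustration of the upload step, not the indexing script itself.

```python
import os
import boto3

# Placeholder values for illustration only
BUCKET = "my-content-library-bucket"
LOCAL_LIBRARY = "/path/to/content-library"

s3 = boto3.client("s3")

def upload_library(local_root, bucket):
    """Walk the local library folder and upload every file,
    preserving the relative folder structure as the S3 key."""
    for dirpath, _, filenames in os.walk(local_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # S3 keys always use forward slashes, regardless of the local OS
            key = os.path.relpath(full_path, local_root).replace(os.sep, "/")
            print(f"Uploading {full_path} -> s3://{bucket}/{key}")
            s3.upload_file(full_path, bucket, key)

if __name__ == "__main__":
    upload_library(LOCAL_LIBRARY, BUCKET)
```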
A disadvantage of the above solution is that each time you need to update or remove content, the entire process has to be repeated, including re-uploading the changes. Not only is this time consuming from an operational standpoint, but you also need to keep a full copy of all the content locally, which can be several hundred gigabytes, if not more.
This topic was recently brought up again by Gilles Chekroun, an SE in our Networking and Security Business Unit, who reached out to see if there was a solution to help a customer running into this challenge. Over the last couple of weeks, I have been working with both Gilles and Eric Cao (Content Library Engineer) on enhancing the existing Python script, which indexes the content and generates the Content Library metadata, to also support running directly against an Amazon S3 bucket.
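To give a sense of what "running directly against an S3 bucket" means, below is a simplified sketch using boto3: it enumerates the objects already stored in the bucket and writes a JSON index back to that same bucket, without ever downloading the content locally. The bucket name, the grouping of objects by top-level folder, and the index.json layout are illustrative assumptions and do not reflect the actual metadata format produced by the script.

```python
import json
import boto3

# Placeholder bucket name for illustration
BUCKET = "my-content-library-bucket"

s3 = boto3.client("s3")

def list_bucket_objects(bucket):
    """Yield every object key and size in the bucket, paging through results."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]

def build_index(bucket):
    """Group objects by their top-level folder (treated here as one library item
    each) and write the resulting index back to the bucket."""
    items = {}
    for key, size in list_bucket_objects(bucket):
        top_level = key.split("/", 1)[0]
        items.setdefault(top_level, []).append({"name": key, "size": size})
    index = {"items": [{"name": n, "files": f} for n, f in items.items()]}
    s3.put_object(
        Bucket=bucket,
        Key="index.json",
        Body=json.dumps(index, indent=2).encode("utf-8"),
        ContentType="application/json",
    )

if __name__ == "__main__":
    build_index(BUCKET)
```

The key point of this approach is that the index is rebuilt from what is already in S3, so updating or removing content no longer requires keeping a full local copy of the library.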