vSphere Datasets - New Virtual Machine Metadata Service in vSphere 8

09.21.2022 by William Lam

Since the early days of Virtual Center and ESX, the only method for creating and sharing arbitrary metadata between the vSphere Management layer and the guest operating system was to use either guest variables (guestinfo) or the OVF runtime environment.

While both of these capabilities have enabled a ton of interesting use cases and have even inspired a number of creative solutions, they certainly have their challenges and nuances from an end user experience perspective.

For example, whether a guest variable persisted depended solely on when it was applied to a Virtual Machine and the power state the VM was in at the time. This inconsistent behavior can be very frustrating for end users to discover for the first time. The lack of security and access control in both guest variables and the OVF runtime environment also means the metadata could easily be overwritten or removed by users in either the vSphere Management layer or the guest operating system, making this challenging to scale for larger organizations.

This is why I am excited for vSphere 8 and the new vSphere Dataset feature!

Use Cases

Here are some of the use cases that can benefit from vSphere Datasets:

  • Arbitrary metadata for a Virtual Machine (e.g. System Owner, Application Owner, Location, etc.)
  • Coordinating Application Workflow/Installation signals to or from the vSphere Management layer
  • Application Development and build system (e.g. dynamic application configuration and build artifacts)
  • Configuration management system and tools to publish installed applications and OS details

With the new capabilities of vSphere Datasets, I am pretty sure our customers will find plenty more use cases that this solution can now address.

Requirements

  • vCenter Server and ESXi must be running vSphere 8.0
  • Virtual Machine must be configured with VM Compatibility Version 20
  • VMware Tools 11.3 or greater

vSphere Datasets

So, what are vSphere Datasets? This is currently a vSphere API-only capability that provides a facility to share data, as a collection of key/value pairs, between the vSphere Management layer and the guest operating system. The data should be relatively small and change infrequently.

What are the benefits of vSphere Datasets over the previous solutions?

  • Better Security Model
    • Improved vSphere API and access management
    • Privileged access from guest operating system
  • Support for Large Scale
    • Large number of vSphere Datasets per VM
    • Up to 100MB in capacity
  • Easy User Experience
    • Dataset-entry hierarchy
    • vSphere REST API for management
    • Guest Operating system commands
  • Data Persistence
    • Persist data across power cycle
    • Optionally omit or include data in snapshot / clone operations

How do vSphere Datasets work? Using the vSphere REST API, you first create a dataset, which acts as a container for the actual data, which is stored as dataset entries (key/value pairs). A dataset includes basic information such as a name and description, but it also includes access control policies for both the vSphere Management layer and the guest operating system, each of which can be NONE, READ_ONLY or READ_WRITE. Lastly, you can also specify whether a given dataset will be included as part of a VM clone or snapshot operation.

Once a dataset has been created, dataset entries can be added from the vSphere Management layer and/or the guest operating system, depending on the access control policies configured for that dataset. One huge improvement over the previous solutions is that a given VM can have multiple datasets, each with different access control policies for different use cases, which makes this an extremely flexible and powerful capability.

Let's take a look at a few concrete examples:

In the example below, entries in this dataset can be read/created/updated/deleted by the vSphere Management layer, but the guest will have no access:

Property      Value
Name          admin-ds
Host Access   READ_WRITE
Guest Access  NONE

In the example below, entries in this dataset can be read/created/updated/deleted by the vSphere Management layer, and the guest will have read-only access:

Property      Value
Name          shared-admin-ds
Host Access   READ_WRITE
Guest Access  READ_ONLY

In the example below, entries in this dataset can be read/created/updated/deleted by the guest, and the vSphere Management layer will have read-only access:

Property      Value
Name          shared-user-ds
Host Access   READ_ONLY
Guest Access  READ_WRITE

In the example below, entries in this dataset can be read/created/updated/deleted by the guest, but the vSphere Management layer will have no access:

Property      Value
Name          user-ds
Host Access   NONE
Guest Access  READ_WRITE
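
The four combinations above can be summarized in a small sketch. This is only an illustrative Python model of the access rules as described in this post, not how vSphere implements them. The Access enum, the DATASETS table and the can() helper are hypothetical names introduced for the example.

```python
from enum import Enum


class Access(Enum):
    NONE = "NONE"
    READ_ONLY = "READ_ONLY"
    READ_WRITE = "READ_WRITE"


# (host access, guest access) for the four example datasets above.
DATASETS = {
    "admin-ds": (Access.READ_WRITE, Access.NONE),
    "shared-admin-ds": (Access.READ_WRITE, Access.READ_ONLY),
    "shared-user-ds": (Access.READ_ONLY, Access.READ_WRITE),
    "user-ds": (Access.NONE, Access.READ_WRITE),
}


def can(dataset: str, side: str, operation: str) -> bool:
    """Return True if `side` ("host" or "guest") may perform `operation`
    ("read" or "write") on entries of the named dataset."""
    if side not in ("host", "guest"):
        raise ValueError(f"unknown side: {side}")
    host, guest = DATASETS[dataset]
    access = host if side == "host" else guest
    if operation == "read":
        return access in (Access.READ_ONLY, Access.READ_WRITE)
    if operation == "write":  # create, update or delete an entry
        return access is Access.READ_WRITE
    raise ValueError(f"unknown operation: {operation}")


if __name__ == "__main__":
    for name in DATASETS:
        print(name,
              "| host write:", can(name, "host", "write"),
              "| guest read:", can(name, "guest", "read"))
```

Reading the table this way makes it easy to see that each side is evaluated independently, so a single VM can carry datasets owned by admins next to datasets owned by the guest.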

Here is a quick overview of the different vSphere Management and Guest APIs to manage both vSphere Datasets and vSphere Dataset Entries. The vSphere Management API is available through the existing vCenter Server REST API and the Guest APIs are available through the guest operating system via the vmtoolsd command-line utility.

vSphere REST API for vSphere Datasets & Entries

When vSphere 8 becomes generally available (GA), you will be able to find the complete REST API documentation here, which will be broken into two sections: datasets and dataset entries. The vSphere Dataset REST API is very straightforward and can be consumed using any REST-based client. Since a large majority of VMware customers already leverage PowerCLI for automation purposes, I have created a PowerCLI Community Module called VMware.Community.Dataset which uses the CIS Server cmdlets to interact with the vSphere Dataset REST APIs.

The VMware.Community.Dataset module includes the following functions:
  • New-VMDataset
  • Get-VMDataset
  • Remove-VMDataset
  • New-VMDatasetEntry
  • Get-VMDatasetEntry
  • Remove-VMDatasetEntry

Step 1 - Install the VMware.Community.Dataset module using the following command:

Install-Module VMware.Community.Datasets

Step 2 - Next, connect to the CIS Server endpoint which will be the IP Address/FQDN of your vCenter Server using:

Connect-CisServer -Server 192.168.30.213 -User *protected email* -Password VMware1!

Step 3 - Import the VMware.Community.Dataset module and you are now ready to start automating vSphere Datasets

Import-Module VMware.Community.Dataset

Here is an example creating several datasets using the New-VMDataset function, with different access control policies based on the concrete examples described earlier.

$vm_moref = "vm-26"

$adminDataSetParam = @{
    Name = "admin-ds";
    Description = "Dataset for Admins";
    VMMoref = $vm_moref;
    GuestAccess = "NONE";
    HostAccess = "READ_WRITE";
    OmitFromSnapshotClone = $false;
}
New-VMDataset @adminDataSetParam

$sharedDataSet1Param = @{
    Name = "shared-admin-ds";
    Description = "Dataset for Admins and RO for Users";
    VMMoref = $vm_moref;
    GuestAccess = "READ_ONLY";
    HostAccess = "READ_WRITE";
    OmitFromSnapshotClone = $false;
}
New-VMDataset @sharedDataSet1Param

$sharedDataSet2Param = @{
    Name = "shared-user-ds";
    Description = "Dataset for Users and RO for Admins";
    VMMoref = $vm_moref;
    GuestAccess = "READ_WRITE";
    HostAccess = "READ_ONLY";
    OmitFromSnapshotClone = $false;
}
New-VMDataset @sharedDataSet2Param

$userDataSetParam = @{
    Name = "user-ds";
    Description = "Dataset for Users";
    VMMoref = $vm_moref;
    GuestAccess = "READ_WRITE";
    HostAccess = "NONE";
    OmitFromSnapshotClone = $false;
}
New-VMDataset @userDataSetParam
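
For readers who prefer to call the REST API directly rather than go through the PowerCLI module, the following is a rough Python sketch of what the request for the admin-ds example might look like. Everything specific here is an assumption that should be checked against the official API reference: the endpoint path, the JSON field names (name, description, host, guest, omit_from_snapshot_and_clone), the vmware-api-session-id header, and the placeholder host name vcenter.example.com. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_create_dataset_request(vcenter, vm_moref, session_id, spec):
    """Build (but do not send) a POST request that would create a dataset."""
    # Assumed endpoint path; confirm against the official API reference.
    url = f"https://{vcenter}/api/vcenter/vm/{vm_moref}/data-sets"
    return urllib.request.Request(
        url,
        data=json.dumps(spec).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed session header, obtained from a prior login call.
            "vmware-api-session-id": session_id,
        },
    )


# Mirrors the admin-ds example above; the JSON field names are assumptions.
admin_spec = {
    "name": "admin-ds",
    "description": "Dataset for Admins",
    "host": "READ_WRITE",
    "guest": "NONE",
    "omit_from_snapshot_and_clone": False,
}

request = build_create_dataset_request(
    "vcenter.example.com", "vm-26", "<session-id>", admin_spec
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

If the assumptions hold, passing the request to urllib.request.urlopen() would submit it. The PowerCLI functions shown above remain the simpler route when PowerCLI is already in use.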


Here is an example listing all datasets using the Get-VMDataset function.


Here is an example retrieving the configuration for a specific dataset by using the Get-VMDataset function and specifying the name of the dataset.


Here is an example creating several dataset entries for different datasets using the New-VMDatasetEntry function.

$adminDataSetEntry1Param = @{
    Name = "Location";
    VMMoref = "vm-26";
    Dataset = "admin-ds";
    Value = "Palo Alto";
}
New-VMDatasetEntry @adminDataSetEntry1Param

$adminDataSetEntry2Param = @{
    Name = "Building";
    VMMoref = "vm-26";
    Dataset = "admin-ds";
    Value = "Promontory E";
}
New-VMDatasetEntry @adminDataSetEntry2Param

$sharedDataSetEntry1Param = @{
    Name = "AppID";
    VMMoref = "vm-26";
    Dataset = "shared-admin-ds";
    Value = "app-1234";
}
New-VMDatasetEntry @sharedDataSetEntry1Param

$sharedDataSetEntry2Param = @{
    Name = "SystemOwner";
    VMMoref = "vm-26";
    Dataset = "shared-admin-ds";
    Value = "William Lam";
}
New-VMDatasetEntry @sharedDataSetEntry2Param


Here is an example listing all dataset entries for a specific dataset using the Get-VMDatasetEntry function.


Here is an example retrieving the value for a specific dataset entry also using the Get-VMDatasetEntry function.


To delete a specific dataset entry, you can use the Remove-VMDatasetEntry function. To delete a dataset, all of its dataset entries must first be removed, and then you can use the Remove-VMDataset function.

If you attempt to access a dataset that you do not have access to, you will get an unauthorized error.

Guest API for vSphere Datasets & Entries

vSphere Datasets and their entries can also be accessed from within the guest operating system using the vmtoolsd utility. Depending on the access control policies, you may or may not have the permissions to list and/or manipulate individual datasets.

To list all configured datasets, you can run the following command:

vmtoolsd --cmd 'datasets-list' | python -m json.tool


Note: The python command is optional and is only used to format the JSON output for readability.

To view the configuration for a given dataset, you can run the following command:

vmtoolsd --cmd 'datasets-query {"dataset":"shared-admin-ds"}' | python -m json.tool


With more complex dataset commands, using --cmd may not be ideal. vmtoolsd provides another parameter called --cmdfile, which accepts a file containing the command (the text between the single quotes) and processes that instead. Below is the previous command, now read from a file:

# cat ds-command
datasets-query {"dataset":"shared-admin-ds"}

# vmtoolsd --cmdfile=ds-command | python -m json.tool
{
    "result": true,
    "info": {
        "name": "shared-admin-ds",
        "description": "Dataset for Admins and RO for Users",
        "used": 35,
        "hostAccess": "READ_WRITE",
        "guestAccess": "READ_ONLY",
        "omitFromSnapshotAndClone": false
    }
}

Note: The size limit for JSON requests is 64KB and for responses it is 1MB.
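
When command files are generated by a script, a small helper can keep the JSON well-formed and catch oversized requests early. The following is a minimal Python sketch under a few assumptions: write_cmdfile() is a hypothetical helper name, the 64KB request limit is interpreted as 64 * 1024 bytes of JSON, and the file is written to a temporary directory purely for illustration.

```python
import json
import os
import tempfile

# Request size limit from the note above, interpreted here as 64 KiB of JSON.
MAX_REQUEST_BYTES = 64 * 1024


def write_cmdfile(path, command, payload):
    """Write '<command> <json>' to `path` and return the line written."""
    body = json.dumps(payload, separators=(",", ":"))
    if len(body.encode("utf-8")) > MAX_REQUEST_BYTES:
        raise ValueError("JSON request exceeds the 64KB limit")
    line = f"{command} {body}"
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line


cmdfile = os.path.join(tempfile.mkdtemp(), "ds-command")
written = write_cmdfile(cmdfile, "datasets-query", {"dataset": "shared-admin-ds"})
print(written)
# The file can then be handed to vmtoolsd inside the guest:
print(f"vmtoolsd --cmdfile={cmdfile}")
```

The generated file has the same content as the ds-command file shown above, so it can be passed to vmtoolsd --cmdfile in the same way.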

To list entries for a given dataset, you can run the following command:

vmtoolsd --cmd 'datasets-list-keys {"dataset":"shared-admin-ds"}'


To view a specific entry from a dataset, you can run the following command:

vmtoolsd --cmd 'datasets-get-entry {"keys": ["SystemOwner"], "dataset":"shared-admin-ds"}' | python -m json.tool


If you recall from earlier, when creating our datasets, we had two datasets (shared-user-ds and user-ds) where the vSphere Management layer does not have permission to create entries. Let's now take a look at dataset entry management using the vmtoolsd utility.

To create/update one or more entries for a given dataset, you can run the following command:

vmtoolsd --cmd 'datasets-set-entry {"dataset":"user-ds", "entries": [{"key": "AppConfigPath", "value": "/opt/vmware/mycustomapp/config.json"}, {"key": "AppRetry", "value": "88"}]}' | python -m json.tool
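
Hand-writing the JSON for several entries gets error-prone quickly, so it can help to generate the command from a dictionary. Below is a minimal Python sketch. build_set_entry_command() is a hypothetical helper, and converting every value to a string is a conservative choice on my part, since the example above only shows string values.

```python
import json
import shlex


def build_set_entry_command(dataset, values):
    """Return the vmtoolsd argument list for a datasets-set-entry call."""
    # Values are stringified because the example above only uses strings.
    entries = [{"key": key, "value": str(value)} for key, value in values.items()]
    payload = json.dumps({"dataset": dataset, "entries": entries})
    return ["vmtoolsd", "--cmd", f"datasets-set-entry {payload}"]


argv = build_set_entry_command(
    "user-ds",
    {
        "AppConfigPath": "/opt/vmware/mycustomapp/config.json",
        "AppRetry": 88,
    },
)
# Show the command as it would be typed in a shell.
print(shlex.join(argv))
```

Inside a guest with VMware Tools installed, the argument list could be passed to subprocess.run(argv) as-is, which avoids shell quoting issues entirely.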


To view one or more specific entries from a dataset, we can run the following commands:

# vmtoolsd --cmd 'datasets-get-entry {"keys": ["AppConfigPath"], "dataset":"user-ds" }' | python -m json.tool
{
    "result": true,
    "entries": [
        {
            "AppConfigPath": "/opt/vmware/mycustomapp/config.json"
        }
    ]
}

# vmtoolsd --cmd 'datasets-get-entry {"keys": ["AppConfigPath", "AppRetry"], "dataset":"user-ds" }' | python -m json.tool
{
    "result": true,
    "entries": [
        {
            "AppConfigPath": "/opt/vmware/mycustomapp/config.json"
        },
        {
            "AppRetry": "88"
        }
    ]
}
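
Since the response returns each entry as its own single-key object, a script consuming it will usually want to merge those into one dictionary. Here is a minimal Python sketch based on the output shown above. entries_to_dict() is a hypothetical helper, and treating a false or missing "result" as a failure is an assumption about the error format.

```python
import json


def entries_to_dict(response_text):
    """Merge the single-key objects in "entries" into one dictionary."""
    response = json.loads(response_text)
    # Assumption: a false or missing "result" means the call failed.
    if not response.get("result"):
        raise RuntimeError(f"datasets-get-entry failed: {response}")
    merged = {}
    for entry in response.get("entries", []):
        merged.update(entry)
    return merged


# Sample text copied from the vmtoolsd output shown above.
sample = """
{
    "result": true,
    "entries": [
        {"AppConfigPath": "/opt/vmware/mycustomapp/config.json"},
        {"AppRetry": "88"}
    ]
}
"""

config = entries_to_dict(sample)
print(config["AppConfigPath"])
print(int(config["AppRetry"]))
```

Note that values come back as strings, so numeric settings such as AppRetry need to be converted by the consumer.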

To delete one or more specific entries from a dataset, we can run the following command:

vmtoolsd --cmd 'datasets-delete-entry {"keys":["AppConfigPath","AppRetry"], "dataset": "user-ds"}'


I think vSphere Datasets will open up a ton of new possibilities, whether that is for our customers, partners or even second party solutions from VMware. I cannot wait to hear how you and your organization will leverage the powerful new vSphere Dataset feature!

Categories // Automation, PowerCLI, vSphere 8.0 Tags // vSphere 8.0, vSphere Datasets