With the ability to share a single NVMe device for both NVMe Tiering and a local VMFS datastore ... I had an idea to push this further and see if I could also get an ESXi-OSData partition running on the same shared NVMe device! 🤔
Similar to the previous blog post, the underlying use case is really for dev/test environments where you may not have a ton of NVMe devices to dedicate to the various ESXi functions, which is especially true for those using small form factor (SFF) systems like an ASUS NUC or similar. Most of the mainstream SFF systems usually come with two, maybe three NVMe slots if you are lucky.
This technique would allow you to boot ESXi off of USB and then have key functions like ESXi-OSData and NVMe Tiering on a single shared NVMe device, while freeing up the other NVMe devices for use with vSAN, which should have dedicated devices whether you are considering vSAN OSA or ESA.
Disclaimer: This is not officially supported by VMware, please use at your own risk.
Step 1 - Ensure that you have an empty NVMe device; you cannot use a device that has any existing partitions. You can use the vdq -q command to identify and retrieve the NVMe device name.
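For example, running it might return output along these lines (the device name and fields shown here are illustrative only; the exact output varies by ESXi version and hardware):

vdq -q

[
   {
      "Name"  : "t10.NVMe____Samsung_SSD_980_PRO_1TB_...",
      "State" : "Eligible for use by VSAN",
      ...
   },
]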
Step 2 - Download the createSharedNVMeTeiringOSDataAndVMFSPartitions.sh shell script to your ESXi host and update the four required variables (an example variable block follows the list below):
- SSD_DEVICE - Name of the NVMe device from Step 1
- NVME_TIERING_SIZE_IN_GB - Specify the amount of storage (GB) that you wish to use for NVMe Tiering
- OSDATA_SIZE_IN_GB - Specify the amount of storage (GB) that you wish to use for ESXi-OSData
- VMFS_DATASTORE_NAME - Name of the VMFS datastore to create on the NVMe device
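For reference, a hypothetical edited variable block might look like the following (the device name and values are placeholders matching the example run below, not defaults from the script):

SSD_DEVICE="t10.NVMe____Samsung_SSD_980_PRO_1TB_..."
NVME_TIERING_SIZE_IN_GB="256"
OSDATA_SIZE_IN_GB="32"
VMFS_DATASTORE_NAME="nvme-datastore"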
Ensure the script has executable permission (chmod +x /tmp/createSharedNVMeTeiringOSDataAndVMFSPartitions.sh) before attempting to run the script.
Note: Due to the complexity of the commands, the script will automatically print the commands AND then run the commands as shown in the screenshot below.
Step 3 - Run the script. Here is an example of running the script for my setup, where I have a 1TB (913.15GB) NVMe device and I am allocating 256GB for NVMe Tiering and 32GB for ESXi-OSData, with the remaining space allocated to the VMFS datastore.
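Assuming you saved the script to /tmp as in the chmod example above, launching it is simply:

/tmp/createSharedNVMeTeiringOSDataAndVMFSPartitions.sh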
If you have ESXi running on a USB device, which is how my setup is configured, you will notice there is an existing ESXi-OSData volume running on a ramdisk, annotated by the volume label LOCKER-XXX in the output of the esxcli storage filesystem list command. After running the script, you will see a secondary ESXi-OSData volume, and the very last command updates the ESXi-OSData configuration location to point to our new partition.
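A quick way to spot both volume labels in that output (a convenience one-liner, not part of the script) is:

esxcli storage filesystem list | grep -E "LOCKER|OSDATA"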
Using the ESXi Host Client, we can see the three partitions that we have now created:
Step 4 - Enable the NVMe Tiering feature, if you have not already, by running the following ESXCLI command:
esxcli system settings kernel set -s MemoryTiering -v TRUE
Step 5 - Configure the desired NVMe Tiering percentage (25-400) based on your physical DRAM configuration by running the following command:
esxcli system settings advanced set -o /Mem/TierNvmePct -i 400
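To confirm the value was applied, you can list the advanced option as a quick sanity check:

esxcli system settings advanced list -o /Mem/TierNvmePct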
Step 6 - Finally, reboot for the NVMe Tiering settings and the new ESXi-OSData configuration to take effect. Once your ESXi host reboots, you will have a single NVMe device supporting NVMe Tiering, ESXi-OSData and a local VMFS datastore for your workloads!
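If you want to double-check after the reboot that the feature is enabled, you can query the kernel setting from Step 4:

esxcli system settings kernel list -o MemoryTiering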
Step 7 - If you initially installed ESXi on a USB device and did NOT configure an ESXi-OSData volume, one additional step is needed to copy the packages and vmware directories into the new ESXi-OSData volume. You can use the esxcli storage filesystem list command to view the LOCKER-* volume path; the OSDATA-* volume is your new ESXi-OSData volume, as shown in the earlier screenshot.
In this example, the LOCKER-6755b968-c118cea2-656e-88aedd7138d4 mount point is /vmfs/volumes/6755b968-c118cea2-656e-88aedd7138d4 and the OSDATA-6755c79c-20be01ee-f3e2-88aedd7138d4 mount point is /vmfs/volumes/6755c79c-20be01ee-f3e2-88aedd7138d4.
Run the following commands to copy the two directories:
cp -rf /vmfs/volumes/6755b968-c118cea2-656e-88aedd7138d4/packages /vmfs/volumes/6755c79c-20be01ee-f3e2-88aedd7138d4
cp -rf /vmfs/volumes/6755b968-c118cea2-656e-88aedd7138d4/vmware /vmfs/volumes/6755c79c-20be01ee-f3e2-88aedd7138d4
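To sanity-check the copy, you can list both directories on the new ESXi-OSData volume (re-using the example UUID from above):

ls /vmfs/volumes/6755c79c-20be01ee-f3e2-88aedd7138d4/packages /vmfs/volumes/6755c79c-20be01ee-f3e2-88aedd7138d4/vmware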
Just wanted to say thanks for writing this up, as I was wondering if this was possible after reading the other writeup. It'll be great for the homelab!
Hello,
Thank you for the tips, it works very well.
Anyway, I would like to change the NVMe SSD, but after deleting the NVMe Tiering and VMFS datastore partitions, I can't delete the OSDATA partition.
I installed ESXi on a USB device, and it seems the OSDATA partition supersedes the LOCKER partition.
Each attempt gives me about the same error:
Error: Read-only file system during write on /dev/disks/...
Unable to delete partition 2 from device.
Do you know if there is an easy way to delete the OSData partition after running your script?
Thanks
Yes. You need to boot GParted to delete it; you can't do it while the device is running ESXi.
Hi William,
Can I do the same for memory tiering and a vSAN cache volume (OSA)?
Like 256GB for RAM tiering and the rest for the OSA cache.
Hi William, I encountered a problem when following the steps listed.
Errors like:
Invalid number of tokens
Invalid partition information: 3 201326591 AA31E02A400F11DB9590000C2911D1B8 0
Invalid Partition information
I also raised an issue on GitHub with more detailed info. Could you please help?
I think I figured out why by myself - my SSD is completely new! I had to "format" it somehow first before running this script - in this case, I used ESXi to create a VMFS datastore on it. Thanks a lot anyway.
Unfortunately, though creating a VMFS datastore first on the new SSD can resolve the error in the first step, the following steps still gave errors. After rebooting ESXi, memory tiering works, but no datastore can be used on the SSD. I'm trying to start all over again to see what to do. For now, I have even failed to delete partition 2, like David encountered above.
Update - Issue resolved by starting all over again.
The key step is using a GParted Live CD to boot the machine and delete all partitions on the NVMe SSD.
Creating a VMFS partition was definitely a wrong start. I'm not sure what kind of partitioning my new SSD had when it arrived - I didn't notice - but it didn't work with the script for sure. The partition tools in ESXi have limitations, so don't expect them to work well.
So completely initializing the SSD is a safe choice; then everything will go well.
Is there a reason that you are still booting ESXi off a thumb drive in this setup? Is it not possible to achieve this setup with all the components running off the NVMe storage?
Specifically, I'm thinking about this in the context of your post on the GMKtec K11. If you go the route of having a third SSD by replacing the wireless card, could you install all the components plus a VMFS datastore on that third SSD, leaving the two 2280 slots available for whatever you want (and thus eliminating the need to still boot from USB)?
Thank you.