As hinted in my earlier blog post, you can indeed set up a vSAN Witness using the ESXi-Arm Fling running on a Raspberry Pi (rPI) 4b (8GB) model. In fact, you can even set up a standard 2-Node or 3-Node vSAN Cluster using the exact same technique. For those familiar with vSAN and the vSAN Witness, we will need at least two storage devices: one for the caching tier and one for the capacity tier.
For the rPI, this means we are limited to USB storage devices, and luckily vSAN can actually claim and consume them. For a basic homelab this is probably okay, but if you want something a bit more reliable, you can look into using a USB 3.0 to M.2 NVMe enclosure. An M.2 NVMe device should definitely provide more resiliency than a typical USB stick you might have lying around. From a capacity point of view, I ended up using two 32GB USB keys, which should be plenty for a small setup, but you can always purchase larger-capacity devices given how cheap USB storage is.
Disclaimer: ESXi-Arm is a VMware Fling which means it is not a product and therefore it is not officially supported. Please do not use it in Production.
With the disclaimer out of the way, I think this is a fantastic use case for an inexpensive vSAN Witness, whether running at a ROBO/Edge location or simply supporting your homelab. The possibilities are certainly endless, and the ESXi-Arm team would love to hear whether this is something customers would even be interested in, so please share your feedback to help set priorities for both the ESXi-Arm and vSAN teams.
In my setup, I have two Intel NUC 9 Pro systems which make up my 2-Node vSAN Cluster and an rPI as my vSAN Witness. Detailed instructions can be found below, including a video for those wanting to see the vSAN Witness in action by powering on an actual workload 😀
Prerequisite:
- Since ESXi-Arm is based on vSphere 7.0, make sure both your vCenter Server Appliance (VCSA) and your ESXi-x86 hosts are using 7.0 and NOT 7.0 Update 1
- VCSA 7.0 - (7.0c Build 16749653 or 7.0d Build 16620007)
- ESXi-x86 7.0 - (7.0 Build 15843807 or 7.0b Build 16324942)
Step 1 - Install ESXi-Arm Fling on rPI 4. For detailed instructions, please refer to the ESXi-Arm documentation.
Step 2 - We need to disable the USB Arbitrator service so that ESXi can see the two USB storage devices. To do so, SSH to the rPI and run the following commands:
/etc/init.d/usbarbitrator stop
chkconfig usbarbitrator off
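If you want to confirm the service is actually stopped before proceeding (a sanity check I'd suggest, not something the original steps require), the init script also supports a status query:

```shell
# Verify the USB arbitrator service state on the ESXi-Arm host
/etc/init.d/usbarbitrator status
```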
Step 3 - To allow vSAN to claim USB storage devices and to tag one of them for the "capacity" tier, the following two ESXi Advanced Settings must be enabled by running these commands:
esxcli system settings advanced set -o /Disk/AllowUsbClaimedAsSSD -i 1
esxcli system settings advanced set -o /VSAN/AllowUsbDisks -i 1
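To double-check that both settings took effect, you can read them back (this verification step is my addition, not part of the original procedure); the Int Value field should report 1 for each:

```shell
# Read back the two Advanced Settings; "Int Value: 1" means enabled
esxcli system settings advanced list -o /Disk/AllowUsbClaimedAsSSD
esxcli system settings advanced list -o /VSAN/AllowUsbDisks
```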
Step 4 - At this point, you will need to identify the device IDs of the two USB storage devices (which should not have any partitions) that will be used to construct the vSAN Datastore for the vSAN Witness. To do so, run the following command and make note of the IDs, which should be in the form mpx.vmhbaXX:C0:T0:L0
vdq -q
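For reference, the output looks roughly like the following; the device name and field values here are illustrative and will differ in your environment. Eligible devices should show a State of "Eligible for use by VSAN" (or, at this stage, a Reason explaining why not):

```shell
vdq -q
# Illustrative output (device names/values will vary per host):
# [
#    {
#       "Name"     : "mpx.vmhba33:C0:T0:L0",
#       "State"    : "Eligible for use by VSAN",
#       "IsSSD"    : "0",
#       "IsCapacityFlash": "0",
#    },
#    ...
# ]
```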
Step 5 - Next, we need to create claim rules to add the enable_ssd option for both of our USB storage devices, which will then allow us to tag one of them for our "capacity" tier. Run the following commands, replacing the mpx.vmhbaXX device IDs with the values in your environment.
esxcli storage nmp satp rule add -s VMW_SATP_LOCAL --device=mpx.vmhba33:C0:T0:L0 --option=enable_ssd
esxcli storage core claiming unclaim --type device --device=mpx.vmhba33:C0:T0:L0
esxcli storage nmp satp rule add -s VMW_SATP_LOCAL --device=mpx.vmhba34:C0:T0:L0 --option=enable_ssd
esxcli storage core claiming unclaim --type device --device=mpx.vmhba34:C0:T0:L0
esxcli storage core claimrule load
esxcli storage core claimrule run
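Before tagging, it can be worth confirming the enable_ssd option actually stuck. One way (my own sanity check, reusing the same example device ID as above) is to query the device and look for the Is SSD field:

```shell
# Confirm the device is now reported as flash ("Is SSD: true")
esxcli storage core device list -d mpx.vmhba33:C0:T0:L0 | grep -i "Is SSD"
```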
Step 6 - Now we need to tag one of the USB storage devices as our "capacity" tier by running the following command:
esxcli vsan storage tag add -d mpx.vmhba34:C0:T0:L0 -t capacityFlash
If we now re-run the vdq command from Step 4, you will see that both USB storage devices are now reported as "SSD" and one of them should be marked as IsCapacityFlash, as shown in the screenshot below.
Step 7 - Lastly, we need to enable vSAN traffic on our VMkernel interface. You can do this either in the vSphere UI or via the CLI; in the example here, I am using the CLI since you are already logged into the rPI:
esxcli vsan network ip add -i vmk0
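You can confirm the VMkernel interface was tagged for vSAN traffic with another esxcli query; this check is my addition, and it assumes vmk0 as in the example above:

```shell
# List VMkernel interfaces configured for vSAN traffic; vmk0 should appear
esxcli vsan network list
```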
At this point, we are now ready to create our vSAN Cluster using the rPI as a vSAN Witness Node. If you have not already attached the rPI to vCenter Server, go ahead and do so.
Step 8 - In the vSphere Cluster on which you wish to enable vSAN, select Configure->vSAN->Services and click Configure to start the configuration. In our setup, we will set up a 2-Node vSAN Cluster, and when asked for the vSAN Witness host, go ahead and locate the rPI, which should pass all compatibility checks.
Step 9 - On the next screen, select which USB storage device will be used for the caching tier and which for the capacity tier.
Step 10 - If everything was configured correctly, you should now have a 2-Node vSAN Cluster using the rPI as the vSAN Witness
Here is a video demonstrating the enablement and use of the rPI as a vSAN Witness, powering on a VM!
vman.ch says
Epic, thank you!
I just replaced my Frankenstein witness with a Pi v4 8GB and I love it already.
vManDotCh says
Hey William,
Have you tried the release 17068872 as a witness?
I am not sure if my issue is related to the latest release or some kind of strange corruption that occurred to my vSAN cluster / vCenter due to the PSOD.
Long story short, when I updated my vDS to 7.0, it caused the Pi to PSOD. (I was running 16966451; it appears it's a known issue as Bug #8 over at the Flings page).
After installing a fresh host with 17068872 and trying to "configure a stretched cluster" I get the following error in the HTML5 GUI at the review section:
"Failed to extract the requested data. Check vSphere Client logs for details."
I also tried to configure it over PowerCLI with Set-VsanClusterConfiguration
But got this error:
VSAN runtime fault on server '/VIServer=vman.sso\*protected email*:443/': Unknown server error: ''. See the event log for details..
Have you run 17068872 as a witness successfully?
I can no longer find VMware-VMvisor-Installer-7.0.0-16966451.aarch64.iso so I can't try to roll back! arg...
Thanks
vMan
William Lam says
I've not, but the VDS PSOD was a known issue and should have been resolved. Have you tried NOT using VDS to see if the behavior is the same? I've not had time to try this again on the latest ESXi-Arm build, but I also haven't seen anyone else report this specific issue
vManDotCh says
Thanks for the reply. I changed to a Standard Switch and have the same issue when trying to add the witness to the cluster:
"Failed to extract the requested data. Check vSphere Client logs for details."
Any idea how I can get the ISO for 16966451 again? I wanted to try that version again and see if it solves the problem.
Thanks!
vManDotCh says
I created a cluster from scratch and was unable to add the witness; this is the error I got:
(vmodl.fault.ManagedObjectNotFound) {
msg = '',
obj = 'vim.host.VsanHealthSystem:ha-vsan-health-system'
}
🙁
vManDotCh says
Just built a brand new vCenter 7.0 U1, 2 x 7.0 U1 hosts and tried to add the Pi 17068872 as a witness and boom, the same error 🙁
FYI for those thinking of upgrading the Pi to 17068872
William Lam says
What version of vCenter were you using before? This was tested w/7.0 since ESXi-Arm is based on that version and Witness usually needs to match VC version
vManDotCh says
Just installed a fresh vCenter 7.0.0.10600 and I get the same issue when creating the cluster with the witness running 17068872 but I have lost the 16966451 iso so I can't verify if that still works.
(vmodl.fault.ManagedObjectNotFound) {
msg = '',
faultCause = ,
faultMessage = (vmodl.LocalizableMessage) [],
obj = 'vim.host.VsanHealthSystem:ha-vsan-health-system'
}
But it looks like multiple people have now reported the same issue.
https://flings.vmware.com/esxi-arm-edition/bugs/1140
https://flings.vmware.com/esxi-arm-edition/bugs/1139
William Lam says
Thanks. Let me respond on the Fling issues, and let's keep the discussion there so everyone is on the same page
Sebastian says
Hi, great post!
is it possible to use it in a vSAN cluster with ESXi 7.0 U1 hosts and already-updated disk formats? Or is there an option to update the Raspberry to 7.0 U1? If not, when do you think an Arm ISO with 7.0 U1 will show up on Flings?
thanks!
Petrus says
Hi,
Regarding the required storage to run the vSAN Witness inside a Pi4, what is the minimum required configuration WITH SUFFICIENT RESILIENCE? That is, without using a USB memory stick that will fail in one or two months...
- Two SSD over USB 3.0 devices?
- One SSD drive over USB 3.0?
- One SSD drive over USB 3.0 and one USB 3.0 memory stick?
- etc, etc, etc.
Does anyone have sufficient experience?
Thank you!
Tommy says
Hello William, and thanks again for a great article! 🙂 Is it possible to set up a complete vSAN cluster just from rPIs? I mean a two-node vSAN cluster from rPIs plus an rPI vSAN witness? Thx
William Lam says
Yes, as mentioned in the article, both a standard vSAN Cluster using 1, 2 or 3 Nodes works, as does using it as a vSAN Witness
Tommy says
Thanks for the confirmation. I was amazed and confused at once when you mentioned it in the article, couldn't believe it 🙂 Ordered an HPE Microserver Gen 10 Plus and 3 rPIs for my homelab; guess I will have some fun in the coming months 🙂
Tobias says
Hi William 😉
do you have any clue what this means ?
I'm not able to add the Pi as witness; a nested ESXi as witness works. Tried 7.0 U1 and the current 7.0.
(2 Node Cluster)
General vSAN error. (vmodl.fault.ManagedObjectNotFound) { msg = '', faultCause = , faultMessage = (vmodl.LocalizableMessage) [], obj = 'vim.host.VsanHealthSystem:ha-vsan-health-system' }
vmodl.fault.ManagedObjectNotFound
thx
David Sellens says
just for clarification. This setup requires 3 USB Storage devices plus the Micro SD card??? why won't ESXi install directly on the Micro SD card???
William Lam says
Since publishing this, there are a number of different ways to boot and install ESXi-Arm itself, including over the network. vSAN requires a minimum of two disks, hence 2 out of the 3. In addition to not having drivers for ESXi on SD card, it is also not suitable for general ESXi functionality, especially from a wear-leveling point of view. This is true for ESXi-x86 as well, so it's not specific to ESXi-Arm
David Sellens says
Not at all the answer that I expected. Having worked on literally hundreds of x86 ESXi hosts booting off of SD cards, I find your reluctance to use them suspect. In my experience, booting from SD card is the preferred method of running ESXi on HP Blades without regular hard drives. They have also been used heavily on various other diskless installations. Obviously the key to using them is allowing the logs to post to the SD card. I have seen one inexperienced ESXi admin not do that and end up with stacks of failed SD cards before they even went into production. Given the logs on other media, the SD cards have lasted as long as the machine itself.
So it would seem that my question was not properly stated. It was not why ESXi-x86 does not boot off of SD card as that is pretty much standard practice. It is: Why does ESXi-ARM not allow ESXi to be installed on the SD Card the same way that ESXi-x86 does?
David Sellens says
oops, forgot a not, ie. Not allowing logs to post to the SD...
William Lam says
Sorry you don't like the answer, but I've already given the answer in my initial reply.
Rick says
Hello William,
I'm booting my rPi "VMware ESXi" via iSCSI.
I was wondering why two USB sticks.
Is it possible just to add two extra iSCSI LUNs
instead of the USB sticks?
William Lam says
Yes, you'd need to tweak the ESXi Advanced Settings to allow vSAN to claim your iSCSI LUNs, but it's doable. I went with the easiest setup 🙂
Sourav says
VCSA 7.0.0d: 7.0.0.10700-16749653
ESXi 7.0.0 7.0.0.10700-16324942
ESXi-Arm : 7.0.0, 17230755
rPI: 4B 4GB
When I try to add the rPI as witness, I get the following error.
Error Message:
Task: vSAN operation precheck
(vmodl.fault.ManagedObjectNotFound) {
msg = '',
faultCause = ,
faultMessage = (vmodl.LocalizableMessage) [],
obj = 'vim.host.VsanHealthSystem:ha-vsan-health-system'
}
Any solutions or workarounds? Could it be that the limited resources on a 4GB rPI are causing this?
notthefirstryan says
VCSA 16749653 (although build reports 16749670 after install and deploy)
ESXi 16324942
ESXi-Arm 17839012
I am seeing the exact same error message when I attempt to deploy with the Pi as witness, and I have an 8GB Pi 4 B+ sitting at 28% memory utilization.
Might just move on and use an extra Intel box at this point. Between being stuck on specific outdated versions of ESXi/VCSA and it still not being very reliable, this doesn't seem that promising.
David Freund says
I just tried this with the latest build (17839012) on an 8GB rPi 4, but I'm having an issue with the SSD status.
I had to change the command slightly (esxcli storage nmp satp rule add -s VMW_SATP_LOCAL -d mpx.vmhba34:C0:T0:L0 --option=enable_ssd), but got it to run... however, it's still not displaying the USB drives as SSDs. "IsSSD" still shows as "0", so my attempt to tag a USB disk as capacity tier obviously isn't working...
(Unable to add tag to disk: Disk mpx.vmhba34:C0:T0:L0 is not flash device)
Any thoughts? Does this command need to be run differently in build 17839012?
Thanks