The quick answer is no: when you reinstall ESXi on a host that contains VSAN data, that data will not be touched or destroyed. This, of course, assumes you do not touch any of the disks that contain your VSAN data. To further assist administrators, you can also see which disks are claimed by VSAN during the disk selection step of an ESXi reinstall/upgrade, so the proper disk is selected.
This to me is a very important fact and probably a very conscious design decision by the VSAN Engineers to ensure that you do not accidentally wipe your data because ESXi needed to be reinstalled. This is a topic I have been wanting to write about for a while, but I just never found the time. The reason I am bringing this up is that I saw an interesting tweet the other day from a fellow VMware colleague, Geordy Korte, who works over in the NSBU, with the following request:
NEED HELP: Anyone have any clue on how to recreate a #VSAN using esxcli and not wiping the data. My Vcenter is on the VSAN. RT PLZ
— Geordy Korte (@gekort) July 2, 2014
Geordy needed to reinstall ESXi, but he did not want to lose any of the VSAN data residing on the disks within the ESXi host, which also contained his vCenter Server among other Virtual Machines. Fortunately, reinstalling ESXi is a very safe operation and it will not touch any of the VSAN partitions. In fact, after the installation, if ESXi detects partitions that contain VSAN data, it will automatically claim the devices for you. You may have even seen this as part of the ESXi boot process:
Going back to Geordy's original question, how do you re-create the VSAN Datastore after reinstalling ESXi? Well, the process is actually pretty straightforward and is quite similar to how you bootstrap vCenter Server onto a VSAN Datastore. To demonstrate this process, I decided to take a fully functional 3-Node VSAN setup that contained a Virtual Machine and reinstall ESXi on all three nodes. I will now take you through the process of restoring the VSAN Datastore, and at the end I will also talk about an alternative approach depending on the situation you might be in.
Step 1 - You can easily check whether you have an existing VSAN Disk Group by running the following ESXCLI command:
esxcli vsan storage list
Step 2 - You will need to enable VSAN traffic type for the VMkernel interface you wish to run VSAN on by running the following command and specifying the interface (this will need to be done on ALL ESXi hosts):
esxcli vsan network ipv4 add -i vmk0
Step 3 - You will need to re-create a new VSAN Cluster and you can do so by running the following command which will generate a UUID and create a VSAN Cluster from that:
esxcli vsan cluster join -u $(python -c 'import uuid; print(str(uuid.uuid4()))')
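The embedded Python in the command above simply asks the standard uuid module for a random (version 4) UUID and hands it to esxcli. Here is a minimal sketch of the same idea written out as a script. It only builds the command string and does not run it, and build_join_command is a hypothetical helper name of my own, not part of ESXCLI.

```python
import uuid


def build_join_command(cluster_uuid=None):
    """Return the esxcli command string for joining a VSAN Cluster.

    If no UUID is given, a fresh random one is generated, which is what
    the one-liner in Step 3 does. The command is only built, not executed.
    """
    if cluster_uuid is None:
        cluster_uuid = str(uuid.uuid4())
    # Round-trip through uuid.UUID so a malformed value raises ValueError
    # and a valid one is written in the canonical lowercase form.
    cluster_uuid = str(uuid.UUID(cluster_uuid))
    return "esxcli vsan cluster join -u " + cluster_uuid


if __name__ == "__main__":
    print(build_join_command())
```

Passing an existing UUID to the helper gives you the command for the remaining hosts, and a mistyped UUID fails early instead of being sent to the host.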
Step 4 - Once the VSAN Cluster has been created, you will need to make a note of the UUID that was generated so you can join the remaining ESXi hosts to the same VSAN Cluster. To do so, you will need to run the following command:
esxcli vsan cluster get
You will need to look for "Sub-Cluster UUID" property as seen in the screenshot above highlighted in green.
Step 5 - On the remaining ESXi hosts, you can now join the VSAN Cluster using the following command and specifying the UUID from the previous step:
esxcli vsan cluster join -u [UUID]
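If you would rather not copy the UUID by hand, Steps 4 and 5 can be sketched in a few lines of Python. This is only an illustration under some assumptions: it expects the text of "esxcli vsan cluster get" as "Key: Value" lines, the sample text below is made up (only the "Sub-Cluster UUID" field name comes from the step above), and the function names are hypothetical. Nothing is executed on the host; the script just prints the join command.

```python
import uuid

# Illustrative text only: it mimics a "Key: Value" layout, and every
# value here is invented for the example.
SAMPLE_OUTPUT = """\
Cluster Information
   Enabled: true
   Local Node State: MASTER
   Sub-Cluster Master UUID: 53b0c4f1-7a9c-4e1b-8c3d-0123456789ab
   Sub-Cluster UUID: 52e4fbe6-7fe4-9e44-f9eb-c2fc1da77631
"""


def sub_cluster_uuid(cluster_get_output):
    """Extract the "Sub-Cluster UUID" value from the given text."""
    for line in cluster_get_output.splitlines():
        key, sep, value = line.strip().partition(":")
        # Exact key match, so "Sub-Cluster Master UUID" is not picked up.
        if sep and key.strip() == "Sub-Cluster UUID":
            # uuid.UUID raises ValueError if the value is not a UUID.
            return str(uuid.UUID(value.strip()))
    raise ValueError("no Sub-Cluster UUID field found")


def join_command(cluster_get_output):
    """Build the Step 5 command for the remaining hosts (not executed)."""
    return "esxcli vsan cluster join -u " + sub_cluster_uuid(cluster_get_output)


if __name__ == "__main__":
    print(join_command(SAMPLE_OUTPUT))
```

On a real host you would feed in the actual command output instead of SAMPLE_OUTPUT, and check the printed command before running it on each remaining host.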
Step 6 - Once all ESXi hosts have re-joined the VSAN Cluster, you can go into the VSAN Datastore and re-register all the Virtual Machines using either the vSphere C# Client or the CLI in the ESXi Shell.
As you can see from the screenshot above, I was able to recover my Virtual Machine and there was no data loss!
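For the CLI route in Step 6, here is a rough sketch of how you could find the Virtual Machines to re-register. It walks a datastore directory looking for .vmx files and prints one "vim-cmd solo/registervm" command per file, without running anything. The function names are hypothetical, and the path /vmfs/volumes/vsanDatastore assumes the datastore still has its default name, so adjust it for your environment.

```python
import os


def find_vmx_files(datastore_root):
    """Return the sorted paths of every .vmx file below datastore_root."""
    found = []
    # os.walk does not follow symbolic links by default, so a directory
    # that is also reachable through a symlink alias is visited only once.
    for dirpath, _dirnames, filenames in os.walk(datastore_root):
        for name in filenames:
            if name.endswith(".vmx"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def register_commands(datastore_root):
    """Build one registration command per .vmx file (not executed)."""
    return ['vim-cmd solo/registervm "%s"' % path
            for path in find_vmx_files(datastore_root)]


if __name__ == "__main__":
    # Assumption: the datastore is mounted under its default name.
    for command in register_commands("/vmfs/volumes/vsanDatastore"):
        print(command)
```

Printing the commands first, rather than executing them, lets you review the list and skip any Virtual Machine you do not want back in the inventory.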
If you recall, I mentioned earlier that there is an alternative approach you could take depending on the situation. If you need to deploy a new vCenter Server for whatever reason, you can easily restore your existing VSAN Cluster by simply creating a new VSAN Cluster in the new vCenter Server. You would then add each ESXi host to the VSAN Cluster, and the VSAN Datastore and its contents will automatically be restored without any issues. Once you have your VSAN Datastore, you can then re-create the missing VM Storage Policies and re-apply them to the respective Virtual Machines.
In this article I have described two potential scenarios when working with VSAN, and in both cases you can safely recover your data without any problems. One of the things I have come to appreciate about VSAN is the amount of engineering effort that went into making it super simple to use but also very resilient!
Jake says
Thanks for sharing!
While I have a question for your point:
"If you need to deploy a new vCenter Server for what ever reason, you can easily restore your existing VSAN Cluster by simply creating a new VSAN Cluster in the new vCenter Server. You would then add each ESXi hosts to the VSAN Cluster and the VSAN Datastore and its contents will automatically be restored without any issues."
What will happen when a DVS was configured in old vCenter server and ESX servers were using it?
Russell says
So, when running a virtualized lab, I rebuild ESXi hosts constantly, but I want them to come up with clean disks. Is there a simple command I could run to wipe the disks? Using Fusion 7.1 I don't want to have to remove and re-add disks because of the special disk settings required for running vSAN.
Jose Hernandez says
William, I have a situation where I need to reinstall ESXi on all my VSAN 5.5 nodes. No need to go into details why that is. I have a four node cluster with two disk groups per node. Once each host is rebuilt, I will have to rejoin it back to the VSAN cluster using the method above before going to the next host. My question pertains to which maintenance mode to use on the host prior to reinstalling ESXi, 'ensure accessibility' or 'full data migration'. Will 'ensure accessibility' be good enough, and when I rejoin the host back to the VSAN will the existing data be recognized by the cluster? Or am I better off doing a 'full data migration' even if it means the storage won't be balanced across hosts when I'm done with the re-installs?
Jose Hernandez says
To clarify the question above, I'm talking about reinstalling ESXi on each VSAN node in a four node cluster in a rolling manner without taking an outage on running VMs.
William Lam says
Jose,
You'll want to take a look at the VSAN Operational Guide which covers rebuild. For your scenario, if the host will be back within the default 30min (IIRC), then VSAN will not automatically rebuild the data, but if it will take longer and you have selected ensureAccessibility, then it'll start the rebuild process. If you plan to decommission or it may take longer to re-install, then you may opt for the full data migration with the note that it'll take a bit longer to move all data off the disks. Definitely recommend checking out the official docs, which should cover this use case.
Ronald says
Hello William,
By accident, one of my customers removed a Disk Group from the VSAN Cluster to replace a disk without first putting the host in maintenance mode and flushing the VMs. I know it will be difficult, but is there any way or chance to recover the VMs inside that Disk Group? Please let me know as soon as you can, as this is critical for my client.
Thanks in advance
Sergey Chalykh (@schalykh) says
William,
Here is my scenario. I've upgraded my lab to 6.5 recently. there are 3 Dell servers configured with vSAN. Then it was a task to dill vCenter and get it restored with VDP 6.1.4 appliance. During restore VDP asked me to disassociate ESXI host from vCenter (which is being restored). So I've done this and have been able to restore vCenter appliance. However few minutes after I powered vCenter on I've received a nightmare. I was able to ping vCenter IP address for a little bit and then it was gone. What happened is that ESXI host lost vSAN. I tried pinging VPD and it was not successful either. Then VDP came back up on another ESXI host that is participating in vSAN. I've had a previous vCenter version (6.0) on that vSAN volume and was able to power that one however I haven't been able to join/connect ESXI 6.5 hosts back to vCenter 6.5. apparently this is not supported. So right now don't have working vCenter. Windows vSphere client is useless.
Then I downgraded with an ISO server that lost vSAN and was able to add it to the vCenter 6.0 vSAN version hasn't been upgraded yet so it should be compatible with vCenter 6.0.
There was an option to upgrade ESXI during installation, however I selected install. Do you know if I will lose vSAN and the data from the other 2 nodes if I perform an ISO downgrade and select Upgrade when the prompt appears?
I tried joining the server that I've downgraded to existing vSAN using esxcli vsan cluster join -u [UUID_from_functional_cluster]
but it was not joining them and i could only see empty 1 server vSAN volume.
Please let me know.
Thank you,
Sergey Chalykh (@schalykh) says
Just wanted to add few more words.
I'm afraid performing another vCenter 6.5 restore from VDP due to that vSAN issue on the first host. Currently I have 2 server with vSAN volume 4.4Tb and 2.7Tb are being used. ESXI OS is located on 8Gb SD card that is embedded in Dell servers.
Also I was able to browse vSAN datastore and add to inventory vCenter 6.5 however I don't have IP connectivity when turning it on. the interface is there and associated with correct network connection/distributed switch/vlan association. However even from the console of that vCenter i can't ping anything.
Anything can be done from SSH on ESXI hosts to get things back to vCenter 6.0?
William Lam says
Please don't touch the environment and call GSS ASAP. That is my recommendation so they can help you properly recover
Sergey Chalykh (@schalykh) says
I've messed up my lab and had to rebuild everything. I've been able to launch recovered vCenter 6.5 on one of the ESXI hosts. Had some difficulties with certs and when i fixed them and VM started initialization of vCenter services it dropped vSAN on the second server. At that point I've lost all the data. VMs became corrupted. It took me whole Memorial weekend to get things recovered back. Luckily I've created DB backups of application servers that were running in the lab prior to this.
I've learned only one thing. Running VDP and using vSAN is not a good idea, especially if vCenter dies and it needs to be recovered with VDP, unless you have other storage where vCenter can be recovered and run.
wojcieh says
Thanks William,
It worked like a charm.
RobertoBNeto says
Hi William! I have a difficult situation: ransomware. So I need to rebuild a whole VSAN two-node cluster, even the vCenter, to access the datastore and get the virtual machines. Does this method need any more steps to access the vsanDatastore? I can see the vsanDatastore in the new vCenter, but it doesn't show the files.
Thank you any help!!!
William Lam says
If this is for production, please file SR to get proper support. Don’t make any changes
You should be re-mounting vSAN Datastore
Stevo says
Any chance this article could be updated for vSAN ESA? Hoping it's still possible to reinstall ESXi and create a new cluster without losing my data.
William Lam says
Yes, still applicable