I know Cormac Hogan already wrote about this topic a while ago, but a question recently came up with a slight twist, and I thought it would be useful to share some additional details. The question that was raised: how do you properly shut down an entire VSAN Cluster when vCenter Server itself is running on the VSAN Datastore? One great use case for VSAN, in my opinion, is a vSphere Management Cluster containing all of your basic infrastructure VMs, including a vCenter Server that has been bootstrapped onto a VSAN Datastore. If you ever need to shut down the entire VSAN Cluster, which may also include your vCenter Server, what is the exact procedure?
To help answer this question, I decided to perform this operation in my own lab which contains a 3-Node (physical) VSAN Cluster that had several VMs running on the VSAN Datastore including the vCenter Server VM that was managing the VSAN Cluster.
Below are the steps that I took to properly shut down a VSAN Cluster as well as power everything back on.
UPDATE (4/27) - Added instructions for shutting down a VSAN 6.0 Cluster when vCenter Server is running on top of VSAN.
Shutdown VSAN Cluster (VSAN 6.0)
Step 1 - Shut down all Virtual Machines running on the VSAN Cluster except for the vCenter Server VM; that will be the last VM you shut down.
Step 2 - To help simplify the startup process, I recommend migrating the vCenter Server VM to the first ESXi host so you can easily find the VM when powering back on your VSAN Cluster.
Step 3 - Ensure that there are no vSAN Components being resync'ed before proceeding to the next step. You can find this information by going to the vSAN Cluster and under Monitor->vSAN->Resyncing Components as shown in the screenshot below.
Step 4 - Shut down the vCenter Server VM, which will make the vSphere Web Client unavailable.
Step 5 - Next, you will need to place ALL ESXi hosts into Maintenance Mode. However, you must perform this operation through one of the CLI methods that supports setting the VSAN mode when entering Maintenance Mode. You can either do this by logging directly into the ESXi Shell and running ESXCLI locally or you can invoke this operation on a remote system using ESXCLI.
Here is the ESXCLI command you will need to run; make sure the "No Action" option is selected when entering Maintenance Mode:
esxcli system maintenanceMode set -e true -m noAction
Step 6 - Finally, you can now shut down all ESXi hosts. You can log in to each ESXi host using either the vSphere C# Client or the ESXi Shell, or you can perform this operation remotely using the vSphere API, for example by leveraging PowerCLI.
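The two host-side operations above (enter Maintenance Mode with "No Action", then power everything off) could be scripted with a small shell loop. The host names below are hypothetical, and the commands are only echoed as a dry run; in a real run you would execute each line over SSH (with SSH enabled on the hosts):

```shell
# Hypothetical ESXi host names -- replace with your own.
HOSTS="esxi01 esxi02 esxi03"

for h in $HOSTS; do
  # "noAction" leaves the VSAN components untouched, which is what we
  # want since the entire cluster is going down.
  echo "ssh root@$h esxcli system maintenanceMode set -e true -m noAction"
done

for h in $HOSTS; do
  # Hosts must already be in Maintenance Mode before powering off.
  echo "ssh root@$h esxcli system shutdown poweroff -r planned-vsan-shutdown"
done
```

Remove the echo (or pipe each line to sh) to actually execute the commands; as written, the loop just prints what would run.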
Shutdown VSAN Cluster (VSAN 1.0)
Step 1 - Shut down all Virtual Machines running on the VSAN Cluster except for the vCenter Server VM.
Step 2 - To help simplify the startup process, I recommend migrating the vCenter Server VM to the first ESXi host so you can easily find the VM when powering back on your VSAN Cluster.
Step 3 - Place all ESXi hosts into Maintenance Mode except for the ESXi host that is currently running the vCenter Server VM. Ensure you de-select "Move powered-off and suspended virtual machines to other hosts in the Cluster" and select the "No Data Migration" option, since we do not want any data to be migrated as we are shutting down the entire VSAN Cluster.
Note: Make sure you do not shut down any of the ESXi hosts during this step, because the vCenter Server's VSAN components are distributed across multiple hosts. If you do, you will be unable to properly shut down the vCenter Server VM because its VSAN components will not be available.
Step 4 - Shut down the vCenter Server VM, which will make the vSphere Web Client unavailable.
Step 5 - Finally, you can now shut down all ESXi hosts. You can log in to each ESXi host using either the vSphere C# Client or the ESXi Shell, or you can perform this operation remotely using the vSphere API, for example by leveraging PowerCLI.
Startup VSAN Cluster
Step 1 - Power on all the ESXi hosts that are part of the VSAN Cluster.
Step 2 - Once all the ESXi hosts have been powered on, you can then log in to the ESXi host that contains your vCenter Server. If you took my advice earlier in the shutdown procedure, you can log in to the first ESXi host and power on your vCenter Server VM.
Note: You can perform steps 2-4 using the vSphere C# Client, but you can also do this using either the API or simply by calling vim-cmd from the ESXi Shell. To use vim-cmd, you first need to find the vCenter Server VM by running the following command:
vim-cmd vmsvc/getallvms
You will need to make a note of the Vmid; in this example, our vCenter Server has a Vmid of 6.
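If you are scripting this, the Vmid lookup can be sketched as a small shell helper. The VM name "vcsa" and the canned sample output below are assumptions for illustration, based on the typical getallvms column layout:

```shell
# Print the Vmid for a given VM name from "vim-cmd vmsvc/getallvms" output.
# Assumes the default column layout: Vmid, Name, File, Guest OS, Version.
find_vmid() {
  awk -v name="$1" 'NR > 1 && $2 == name {print $1}'
}

# Canned sample output for illustration; a real run would be:
#   vim-cmd vmsvc/getallvms | find_vmid vcsa
sample='Vmid   Name   File                            Guest OS         Version
6      vcsa   [vsanDatastore] vcsa/vcsa.vmx   sles11_64Guest   vmx-09'

echo "$sample" | find_vmid vcsa
```

For the sample above this prints 6, which you would then feed to the power.on command in the next step.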
Step 3 - To power on the VM, you can run the following command and specify the Vmid:
vim-cmd vmsvc/power.on [VMID]
Step 4 - If you would like to know when the vCenter Server is ready, you can check the status of VMware Tools, as that should give you an indication that the system is up and running. To do so, you can run the following command and look for the VMware Tools status:
vim-cmd vmsvc/get.guest [VMID]
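A simple way to script the "is it ready yet" check is to grep the get.guest output for the Tools running status. The property name below matches what vim-cmd prints in its guest info dump, and Vmid 6 is just the example from above:

```shell
# Return success once VMware Tools is reported as running in the
# "vim-cmd vmsvc/get.guest" output.
tools_running() {
  grep -q 'toolsRunningStatus = "guestToolsRunning"'
}

# In a real run you would poll, e.g.:
#   until vim-cmd vmsvc/get.guest 6 | tools_running; do sleep 10; done
echo 'toolsRunningStatus = "guestToolsRunning",' | tools_running && echo up
```

Once the check succeeds, vCenter Server should be far enough along that you can start trying the vSphere Web Client.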
Step 5 - At this point, you can now login to the vSphere Web Client and take all of your ESXi hosts out of Maintenance Mode and then power on the rest of your VMs.
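Exiting Maintenance Mode can also be scripted instead of using the Web Client. A dry-run sketch, again with hypothetical host names (the commands are echoed rather than executed):

```shell
# Hypothetical ESXi host names -- replace with your own.
HOSTS="esxi01 esxi02 esxi03"

for h in $HOSTS; do
  # Take each host out of Maintenance Mode so VMs can be powered on.
  echo "ssh root@$h esxcli system maintenanceMode set -e false"
done
```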
As you can see, the process to shut down an entire VSAN Cluster, even with vCenter Server running on the VSAN Datastore, is fairly straightforward. Once you are comfortable with the procedure, you can even automate the entire process using the vSphere API/CLI, so you do not need a GUI to perform these steps. This can be a good idea if you are monitoring a UPS and have an automated way of sending remote commands to shut down your infrastructure.
Thanks William - this is clear and succinct. I have a few customers building mgmt clusters with VSAN and I will share this post with them. Thanks for all of the great work that you do for the community!
Ran into an issue here on a new prod build.
Going down was fine, but coming back up had issues. We're running vSAN with a virtual vCenter Server (Windows) on the cluster with a local DB (Express), and vCenter uses an AD user to start its services, with AD also on the vSAN cluster. 4 ESXi nodes, all vSAN participants, no physical boxes.
When we booted back up, both the AD and vCenter VMs' NICs were no longer checked as connected in the ESXi C# client for some reason, so they could not communicate, and we got errors when trying to enable them. Their interfaces are on a VMware distributed switch, and it appears that with the C# client connected to a host you can only see the assigned network; you have to connect the C# client to vCenter to actually change networks. We had to build a standard switch and move the interfaces onto it to enable them so the 2 VMs could communicate; then vCenter could authenticate and start. We were able to move them back to the distributed switch and power everything else on, but it was a bit hairy for a while.
Still coming up with some ideas in case we ever need to test or shut down the whole cluster again:
- use redundant data VLAN ports on the hosts' spare onboard 1GB ports, outside the DVS?
- don't have vCenter AD auth?
- dedicate a host to AD & vCenter with local storage?
- make AD & vCenter physical?
Thanks for the info, love the site!
Cory
Hey William,
What if your vCenter appliance is running on a distributed switch? I tried this procedure in my test lab and I was able to power on the VCSA, but it would not come up on the network. Looking at the settings for the VM, the NIC was not connected. When I try to check the box to connect it and hit OK, I get an error that says "Invalid configuration for device '0" and it won't connect. The only workaround I found was to remove one of the NICs from the distributed switch, create a standard switch with it, create the port group, add the VLAN, and then move the VCSA to that standard switch. Then I could connect the NIC and the VCSA came online. After that, I could get back to the Web Client, move the VCSA back to the distributed switch, and then move the NIC back to the distributed switch as well, and all was good. Kind of a pain though.
Any thoughts? Is this just the way it is?
This VMware tech note (which looks suspiciously like your blog post on the topic) seems to indicate that this is the only workaround:
http://www.vmware.com/files/pdf/products/vsan/VMware-TechNote-Bootstrapping-VSAN-without-vCenter.pdf
Thanks,
Bob
Thanks for the update for VSAN 6.0, William. I'm the one that's been working with VMware support, who then turned to you for answers on this. It's a simpler process, really, but I was surprised by this difference between VSAN 5.5 and 6.0.
Noticed that it should be
esxcli system maintenanceMode set -e true -m noAction
Instead of
esxcli system maintenanceMode -e true set -m noAction
though.
Thanks for catching the typo! I've fixed the article.
Thanks for the 6.0 update, William. BTW, why isn't it done the same way as in 5.5?
Thank you for the great article.
Do you have any plans to get this procedure into the official documentation?
For example, the VMware Virtual SAN 6.x documentation, the VMware Knowledge Base, a white paper, etc.
I hope it makes it into an official document, because it is a very important procedure for running vCenter Server on a vsanDatastore.
Thanks for this guideline.
I am using a two-node vSAN (plus a witness and vCenter on another non-vSAN host) and want to automate this shutdown procedure with PowerCLI.
How can I put my vSAN hosts into Maintenance Mode with the "NoAction" argument?
I tried esxcli through PowerCLI without any success.
Thank you in advance.
Jonathan Fermé
Hi Jonathan, here it is: http://www.hypervisor.fr/?p=5581
Hi, if I just want to restart vCenter Server, what are the implications for the vSAN cluster?