VMware vSphere HA heartbeat datastores, the isolation address and vSAN
I’ve written about vSAN and vSphere HA many times, but I don’t think this has been explicitly called out before. Cormac and I were doing some tests this week and noticed something. When we were looking at the results I realized I had described it in my HA book a long time ago, but it is hidden away so deep that probably no one has noticed.
In a traditional environment when you enable HA you will automatically have HA heartbeat datastores selected. These heartbeat datastores are used by the HA master host to determine what has happened to a host which is no longer reachable over the management network. In other words, when a host is isolated it will communicate this to the HA master using the heartbeat datastores. It will also inform the HA master which VMs were powered off as the result of this isolation event (or not powered off when the isolation response is not configured).
Now, with vSAN, HA does not use the management network for communication between the hosts; it uses the vSAN network instead. Typically in a vSAN environment there’s only vSAN storage, so there are no heartbeat datastores. As such, when a host is isolated it is not possible to communicate this to the HA master. Remember, the network is down and there is no access to the vSAN datastore, so the host cannot communicate through that either. HA will still function as expected though. You can set the isolation response to power-off and then the VMs will be killed and restarted. That is, if isolation is declared.
So when is isolation declared? A host declares itself isolated when:
- It is not receiving any communication from the master
- It cannot ping the isolation address
Now, if you have not set any advanced settings, the default gateway of the management network will be the isolation address. Just imagine the vSAN network being isolated on a given host while, for whatever reason, the management network is not. In that scenario isolation is not declared, because the host can still ping the isolation address using the management network VMkernel interface. HOWEVER… vSphere HA will restart the VMs. The VMs have lost access to disk, and as such the lock on the VMDK is lost. HA notices the host is gone, which must mean that the VMs are dead as the locks are lost, so let’s restart them.
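To make the scenario above a bit more concrete, here is a minimal sketch of the decision logic in Python. This is not how the HA agent is implemented; the `HostState` class, its fields, and both functions are hypothetical names used only for illustration, and the restart check is heavily simplified.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical snapshot of what a single host can currently reach."""
    hears_master: bool             # HA traffic from the master (vSAN network in a vSAN cluster)
    can_ping_isolation_addr: bool  # by default the management network's default gateway
    has_storage_access: bool       # host can still reach the vSAN datastore


def declares_isolation(host: HostState) -> bool:
    """A host declares itself isolated only when BOTH conditions hold."""
    return not host.hears_master and not host.can_ping_isolation_addr


def master_restarts_vms(host: HostState) -> bool:
    """Simplified assumption: the master restarts the VMs once the host is
    unreachable and its VMDK locks are gone (no storage access)."""
    return not host.hears_master and not host.has_storage_access


# vSAN network down on this host, management network still up.
partial = HostState(hears_master=False,
                    can_ping_isolation_addr=True,
                    has_storage_access=False)

print(declares_isolation(partial))   # False: the isolation address still answers
print(master_restarts_vms(partial))  # True: the locks are lost, so the VMs get restarted
```

In this sketch the host never declares isolation, so the isolation response never fires on that host, yet the VMs are restarted elsewhere because the locks are gone.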
Via the fine folks at VMware!