vSphere HA and NFS Datastores

Recently, during a vCloud Director project, we tested how long it takes to recover from an HA event. The test was done on a two-node management cluster: we loaded one host with almost all of the management VMs, shut it down, and measured how long it took all the affected services to recover. The exercise was meant to show whether we could fulfill the required SLA.

The expectation was that it would take about 20 seconds for the other host to find out the first one was dead, and that it would then start powering up all the VMs based on their restart priority. The database server and the domain controller had high priority; the rest of the VMs had the default one. To our surprise, it did not take 20 seconds or so but 8:40 minutes to register the database server and start its boot procedure. For some reason, that particular server was shown with a 95% Power On status. Although there are books written about vSphere HA, this behaviour was not explained.

See the Start and Completed Times:

At first it looked like a bug, so an SR was raised, but then we found out it is like that by design. We were using NFS storage, and NFS locking influences how long it takes to release the locks on a VM's vmdk and vswp files. KB article 1007909 states that the time to recover a lock on NFS storage can be calculated as:

X = (NFS.DiskFileLockUpdateFreq * NFS.LockRenewMaxFailureNumber) + NFS.LockUpdateTimeout

which with default values is

X = (10 * 3) + 5 = 35 seconds.
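The formula is easy to replay with your own values. Below is a minimal Python sketch of it; the function and argument names are made up for illustration, and only the three NFS parameters and their defaults come from the KB article quoted above.

```python
def nfs_lock_recovery_seconds(update_freq=10, renew_max_failures=3, update_timeout=5):
    """Time to recover one NFS lock, per the formula quoted from KB 1007909.

    update_freq        -> NFS.DiskFileLockUpdateFreq
    renew_max_failures -> NFS.LockRenewMaxFailureNumber
    update_timeout     -> NFS.LockUpdateTimeout
    """
    return update_freq * renew_max_failures + update_timeout


# With the default values this gives 35 seconds.
print(nfs_lock_recovery_seconds())
```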

However, the database server had 12 vmdk disks (2 disks per database) and the restart actually took (12+1)*(35+5) = 520 seconds, or 8:40 minutes. This means the locks were released sequentially, an additional 5 seconds was added to each, and the VM swap file lock had to be released as well. This is expected behavior for vSphere 5.0 and older. The newly released vSphere 5.1 lowers the time to 2 minutes, as there are 5 threads (main vmx + 4 worker threads) working in parallel, so those 13 files can be released in 3 passes.
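The arithmetic can be sketched in Python to estimate the delay for other VMs. This is only a rough model built from the numbers observed in this test: the helper names are hypothetical, and the parallel case simply assumes the files are released in rounds of one file per thread, each round costing the same 35+5 seconds.

```python
import math


def restart_delay_seconds(vmdk_count, lock_recovery=35, per_file_overhead=5, threads=1):
    """Estimate how long HA waits for NFS locks before the VM starts booting.

    Assumes one lock per vmdk plus one for the vswp file, released in
    rounds of `threads` files at a time (threads=1 means sequential).
    """
    files = vmdk_count + 1                      # vmdk disks + the swap file
    rounds = math.ceil(files / threads)         # passes needed to cover all files
    return rounds * (lock_recovery + per_file_overhead)


def as_minutes(seconds):
    """Format a number of seconds as m:ss."""
    minutes, rest = divmod(seconds, 60)
    return f"{minutes}:{rest:02d}"


print(as_minutes(restart_delay_seconds(12)))             # sequential: 8:40
print(as_minutes(restart_delay_seconds(12, threads=5)))  # 5 threads:  2:00
```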

KB article 2034935 was written about this behaviour.

If this is by design, what can you do to avoid it?

1. Upgrade to ESXi 5.1 to get up to 5 times faster HA restart times

2. Use block storage instead of NFS

3. Tweak NFS advanced parameters (DiskFileLockUpdateFreq, LockRenewMaxFailureNumber, LockUpdateTimeout) – however this is not recommended

4. Do not use that many VMDKs. Either consolidate onto a smaller number of disks, or use in-guest disk mapping (iSCSI, NFS)

5. Just accept it when you calculate your SLAs.

Iomega VMware ESX NFS Datastore Issue

In my VMware vSphere home lab I have used various hardware and software appliances for shared storage: from Openfiler, Falconstor VSA and HP LeftHand/StorageWorks P4000 VSA to EMC Celerra VSA. Recently I added an Iomega ix4-200d. Its NFS sharing is VMware vSphere certified. Although the Iomega is not very powerful (see my previous blog post about Iomega), I moved all my VMs to it to free up my storage server to play with other storage appliances (I am testing Nexenta now, but that is for another blog post).

My setup is now very simple. I have a diskless ESXi host that runs all the VMs from the NFS datastore served by the Iomega. Today I restarted the ESXi server and was surprised that no VM was started because the NFS datastore was inaccessible. The datastore was grayed out in the vSphere Client GUI.

I have a virtual domain controller, an internet firewall/router, a mail server and some other less important machines. So if the ESX server does not start properly, I have no internet or email, and I cannot even log in to the Iomega CIFS shares, because the Iomega is joined to the domain, which was also unavailable.

I had no idea why the ESX server could not connect to the NFS datastore. A storage rescan did not help, so I unmounted the datastore and tried to reconnect it. I received this error message:

Call “HostDatastoreSystem.CreateNasDatastore” for object “ha-datastoresystem” on ESXi “” failed.
Operation failed, diagnostics report: Unable to complete Sysinfo operation. Please see the VMkernel log file for more details.

The VMkernel log (stored in /var/log/messages on ESXi) did not help much:

Jan 15 22:10:25 vmkernel: 0:00:01:30.282 cpu0:4767)WARNING: NFS: 946: MOUNT RPC failed with RPC status 13 (RPC was aborted due to timeout) trying to mount Server ( Path (/nfs/Iomega)
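If the log is long, warnings like this one can be pulled out with a few lines of Python. This is just a sketch: the regular expression is derived from the single log line above, so it is an assumption that other ESXi builds use the same wording, and the function name is made up for illustration.

```python
import re

# Pattern modelled on the one warning shown above; other builds may word it differently.
MOUNT_FAILURE = re.compile(
    r"NFS: \d+: MOUNT RPC failed with RPC status (?P<status>\d+) \((?P<reason>[^)]*)\)"
)


def find_mount_failures(lines):
    """Return (status, reason) for every NFS MOUNT RPC failure in the given lines."""
    failures = []
    for line in lines:
        match = MOUNT_FAILURE.search(line)
        if match:
            failures.append((int(match.group("status")), match.group("reason")))
    return failures


sample = (
    "Jan 15 22:10:25 vmkernel: 0:00:01:30.282 cpu0:4767)WARNING: NFS: 946: "
    "MOUNT RPC failed with RPC status 13 (RPC was aborted due to timeout) "
    "trying to mount Server ( Path (/nfs/Iomega)"
)
print(find_mount_failures([sample]))
```

On a real host the lines would come from /var/log/messages instead of the sample string.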

I was able to connect to the Iomega NFS export from a regular Linux machine. I was also able to connect the ESX server to a regular Linux NFS export. That helped me find the solution.

Because both of my DNS servers were running in virtual machines and were not accessible, the Iomega took longer to connect the ESX server to the NFS datastore, and the ESX server meanwhile gave up. The remedy was very simple: I added a line with the ESX server's IP address and hostname to the Iomega /etc/hosts file. This must be done via the Iomega ssh console, not via the web GUI:

root@Iomega:/# cat /etc/hosts
localhost.localdomain localhost
Iomega.FOJTA.COM Iomega
Iomega.FOJTA.COM Iomega
esx2.fojta.com esx2

From now on, when the ESX server reboots, it mounts the NFS datastore immediately.
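To check that every ESX host has such a static entry, the hosts file can be parsed with a short Python sketch like the one below. The function names are hypothetical, and the addresses in the sample come from the 192.0.2.0/24 documentation range, not from my lab.

```python
def hosts_entries(text):
    """Map each hostname or alias in hosts-file text to its address."""
    entries = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()     # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            entries.setdefault(name.lower(), address)
    return entries


def is_resolvable_locally(text, hostname):
    """True if the hostname can be resolved from the hosts file alone, without DNS."""
    return hostname.lower() in hosts_entries(text)


# Hypothetical hosts file; the addresses are placeholders.
sample = """\
127.0.0.1 localhost.localdomain localhost
192.0.2.10 esx2.fojta.com esx2
"""
print(is_resolvable_locally(sample, "esx2.fojta.com"))  # True
print(is_resolvable_locally(sample, "esx1.fojta.com"))  # False
```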