vSAN File Services with vCloud Director

vSphere 7 is now generally available, and with it comes a new vSAN update that introduces vSAN File Service. Cormac Hogan has a good overview of the feature on his blog, so definitely head there first to understand what it is.

I want to dive into the possibility of using vSAN File Service NFS in vCloud Director environments.

Let me start with the current (April 2020) interoperability: vSphere 7 is not yet supported with vCloud Director, which means vCenter Server 7 cannot be used as a target for IaaS services. But that is not an issue for the use case I want to discuss today.

A vCloud Director deployment needs NFS storage for its Transfer Share directory. The vCloud Director architecture consists of multiple cells (VM nodes) that scale out horizontally based on the size of the managed environment. The cells need a shared database and a shared Transfer Share directory to function properly. The Transfer Share must be an NFS mount and is used mostly for OVF import/export operations related to vApp template and catalog management; however, the appliance deployment mode of vCloud Director also uses the Transfer Share for storing appliance-related information, SSH keys, the responses.properties file for deployment of additional cells, and embedded database backups.
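To make the requirement concrete, each cell simply mounts the Transfer Share as an ordinary NFS export. A sketch of what the corresponding /etc/fstab entry on a cell might look like (the export address is illustrative, and I am assuming the appliance's default transfer directory path):

```shell
# Illustrative /etc/fstab entry on a vCloud Director cell.
# <server>:<export>                        <mount point>                             <type> <options> <dump> <pass>
192.168.110.181:/vsanfs/VCDTransferShare /opt/vmware/vcloud-director/data/transfer nfs    vers=4.1  0      0
```

The appliance deployment creates this mount for you; you only supply the server:/export string.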

vCloud Director cell VMs are usually deployed in the management cluster, which can be a separate vSphere 7 environment with vSAN. Can we (or should we) use vSAN NFS for the vCloud Director Transfer Share?

The current practice is either to use external hardware storage NFS (for example NetApp) or to deploy a Linux VM with a large disk that acts as the NFS server. The first approach is not always possible, especially if you use vSAN only and have no external storage available; then you have to go with the Linux VM approach. But not anymore.


vSAN File Service NFS has the following advantages:

  • no external dependency on hardware storage or Linux VM
  • easy to deploy and manage from UI or programmatically
  • capacity management with quotas and thresholds
  • high availability
  • integrated lifecycle

The end-to-end deployment is indeed very simple, so let me demonstrate the whole process:

  1. Start with vSAN FS configuration in vSphere Cluster > Configure > vSAN > Services > File Service
  2. Download the vSAN File Service agent (the lightweight container image OVA) directly from the wizard
  3. Configure vSAN domain and networking

  4. Provide a pool of IP addresses for the containers (I used 4, as I have a 4-host management cluster).
  5. After a while you will see the agent containers deployed on each cluster node.
  6. Now we can proceed with the NFS share configuration in vSphere Cluster > Configure > vSAN > File Service Shares > ADD. Here we can define the name, vSAN storage policy, and quotas.
  7. Enter IP addresses of your vCloud Director cells to grant them access to this share. Set permission Read/Write and make sure root squash is disabled.
  8. Once the share is created, select its check box and copy its URL. Choose the NFSv4.1 one.
    In my case it looks like 192.168.110.181:/vsanfs/VCDTransferShare
  9. Now use this string in your vCloud Director cell deployment. I am using the vCloud Director appliance.
  10. Once the cell has started, we can see how the Transfer Share is mounted:

    Notice that while the mount IP address in /etc/fstab is the provided one (192.168.110.171), the one actually used is 192.168.110.172. This provides load balancing across all service nodes when more exports are created and the NFSv4.1 mount address is used.
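The difference between the configured and the effective server address is easy to check from the cell itself. A minimal sketch, using sample lines standing in for the real `grep transfer /etc/fstab` and `mount | grep transfer` output from my environment (the mount point path is an assumption):

```shell
# Sample lines standing in for the real /etc/fstab entry and the live
# `mount` output on the cell; only the part before the first ':' matters.
fstab_line='192.168.110.171:/vsanfs/VCDTransferShare /opt/vmware/vcloud-director/data/transfer nfs vers=4.1 0 0'
mount_line='192.168.110.172:/vsanfs/VCDTransferShare on /opt/vmware/vcloud-director/data/transfer type nfs4 (rw,vers=4.1)'

# ${var%%:*} strips everything from the first ':' on, leaving the server address.
echo "configured: ${fstab_line%%:*}"
echo "actual:     ${mount_line%%:*}"
```

When the two addresses differ, as they do here, the NFSv4.1 referral has redirected the client to another service node.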

It is important to understand that although we have 4 vSAN FS agents deployed, the Transfer Share will be served by a single container – in my case the one with IP 192.168.110.172. To find out on which host this particular container is running, go to Cluster > Monitor > vSAN > Skyline Health > File Service > File Service Health.

So what happens if the host esx-03a.corp.local becomes unavailable? The share will fail over to another host. In my tests this took around 60-90 seconds. During that time the NFS share will not be accessible, but the mount should persist, and once the failover finishes it will become accessible again.

Notice that 192.168.110.172 is now served from esx-04.corp.local.

Also note that putting a host into maintenance mode will not vMotion the agent; it will just shut it down (and after a while undeploy it) and rely on the mechanism above to fail the share over to another agent. You should never treat the vSAN FS agents as regular VMs.

I am excited to see vSAN File Services as another piece of VMware technology that removes third-party dependencies from running vCloud Director (as was already the case with NSX load balancing and the embedded PostgreSQL database).
