VMware Cloud Director – Storage IOPS Management – Part II

This is a follow-up to the article I posted about a year ago. It describes the new IOPS management functionality in VMware Cloud Director (VCD) 10.2.

Storage IOPS is, next to compute, networking and storage capacity, a limited resource that service providers want to manage in order to fairly share the underlying physical resources in a multitenant environment.

As described in the original article, VCD already supported storage IOPS management, however the feature was quite hidden and available only via API. The recent release of VMware Cloud Director not only fully exposes the functionality in the UI but also adds some new functionality. Let's dive into it.

There are now two main mechanisms for managing IOPS.

vCenter Server managed IOPS

This mechanism relies on setting IOPS limits at the storage policy level directly in vCenter Server. That is possible with both host-based and vSAN-based storage policies. The mechanism is quite simple: when a VM disk is provisioned to such an IOPS-limited storage policy, it inherits the IOPS limit, which is a constant number per policy. You will not be able to set proportional IOPS based on disk capacity.

vSAN Storage Policy with IOPS Limit
Host Based non vSAN Storage Policy with IOPS Limit

I would recommend using this mechanism only if you want to avoid noisy neighbors. The concept is not new: VCD has been able to use such vSAN policies for some time, and host-based policies were already supported in VCD 10.1. The only difference is that now, in 10.2, the tenant will see the limit set at the VM disk level but will not be able to change it.

Non-editable Disk IOPS

VCD Managed IOPS

This is a much more sophisticated mechanism where you can really manage IOPS as a pool of available capacity that you slice and allocate to tenant Org VDCs. Until now, this mechanism was only available via API.

You start by tagging your datastores with their IOPS capacity. That has not changed and still must be done from within vCenter Server (VC) via custom properties.

At Provider VDC level you can then create IOPS managed storage policies and define their service level in terms of disk IOPS defaults, maximums or IOPS allocation based on disk size (0 means unlimited).
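To make the interplay of these settings more concrete, here is a minimal Python sketch of how a per-disk IOPS value could be checked against such a policy. The field names and the way the limits are combined (the stricter of the absolute maximum and the size-based maximum wins) are my assumptions for illustration only and not the VCD API schema or its exact behavior. The only thing taken from the text above is that 0 means unlimited.

```python
from dataclasses import dataclass
from typing import Optional

UNLIMITED = 0  # per the article, 0 means unlimited


@dataclass(frozen=True)
class IopsPolicy:
    # Hypothetical field names, chosen for this sketch only.
    disk_iops_default: int
    disk_iops_max: int
    disk_iops_per_gb_max: int


def iops_ceiling(policy: IopsPolicy, size_gb: int) -> int:
    """Return the highest IOPS a disk of size_gb may get; 0 means no ceiling."""
    caps = []
    if policy.disk_iops_max != UNLIMITED:
        caps.append(policy.disk_iops_max)
    if policy.disk_iops_per_gb_max != UNLIMITED:
        caps.append(policy.disk_iops_per_gb_max * size_gb)
    # Assumption: when both limits are set, the stricter one applies.
    return min(caps) if caps else UNLIMITED


def resolve_disk_iops(policy: IopsPolicy, size_gb: int,
                      requested: Optional[int] = None) -> int:
    """Use the policy default unless a value is requested, then validate it."""
    value = policy.disk_iops_default if requested is None else requested
    ceiling = iops_ceiling(policy, size_gb)
    if ceiling != UNLIMITED and (value == UNLIMITED or value > ceiling):
        raise ValueError(f"{value} IOPS exceeds the ceiling of {ceiling}")
    return value
```

With a policy of 500 default, 2000 maximum and 10 IOPS per GB, a 100 GB disk would get 500 IOPS by default and could be raised to at most 1000 in this model.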

This storage policy configuration can be inherited or overridden at the Org VDC level. This is a big improvement over the old approach, where you always had to create such storage policies at the Org VDC level.

Another new thing is that you can disable the IOPS placement mechanism for such a storage policy. This is useful if you want to use Datastore Clusters. VCD will no longer try to place each virtual disk based on the available IOPS of a particular datastore. The placement decision is instead made by vCenter Server, so you should enable Storage DRS with I/O balancing automation. In that case there is no need to tag individual datastores in VC with their IOPS capacity.
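The difference between the two placement modes can be pictured with a small Python sketch. It is a simplified model of my own and not VCD's placement engine: the `Datastore` fields and the `candidate_datastores` function are hypothetical, and I assume that a datastore qualifies when its tagged IOPS capacity minus the IOPS already allocated covers the new disk.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Datastore:
    name: str
    iops_capacity: int   # value tagged on the datastore in vCenter Server
    iops_allocated: int  # IOPS already assigned to disks on this datastore


def candidate_datastores(datastores: Sequence[Datastore], disk_iops: int,
                         iops_placement: bool = True) -> List[str]:
    """Return the names of datastores a new disk could be placed on."""
    if not iops_placement:
        # IOPS placement disabled: no IOPS filtering here, the final
        # choice is left to vCenter Server (Storage DRS).
        return [ds.name for ds in datastores]
    return [
        ds.name
        for ds in datastores
        if ds.iops_capacity - ds.iops_allocated >= disk_iops
    ]
```

With IOPS placement enabled, a nearly exhausted datastore drops out of the candidate list. With it disabled, every datastore stays a candidate, which is why the capacity tags are no longer needed.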

Some of the old caveats still apply:

  • Disk IOPS can be assigned only to regular VMs or named (independent) disks, not to VM templates.
  • The disk IOPS will always be allocated against the Org VDC storage profile, even if the VM is powered off. This means the cloud provider can oversubscribe IOPS at the provider VDC storage profile level.
  • A system administrator can override IOPS limits when deploying or editing tenant VMs in the system context.

Understanding Datastore Metrics in vCloud Director

I got a question from a customer about the meaning of the various datastore metrics in vCloud Director: used, provisioned and requested storage. These can be viewed on the Datastores & Datastore Clusters screen in the Manage & Monitor tab.

Datastore list

There are even more metrics when we open the properties of a single datastore or a datastore cluster.

Datastore Properties

So let us go through them:

Total Capacity: Total size of the datastore or datastore cluster as reported by vCenter Server

Used Capacity: Actual used capacity of the datastore or the datastore cluster as reported by vCenter Server.

Available Capacity: Difference between the values above (Available Capacity = Total Capacity – Used Capacity)

I am including a screenshot from vCenter Server of the same datastore:

Datastore Properties in vCenter

Provisioned Capacity: Total storage provisioned for the virtual machines residing on the datastore. This number might be much bigger than the actual datastore capacity if thin provisioning is used. Again, this number is reported by vCenter Server (it can be seen in the Datastore and Datastore Clusters view on the Summary tab). I am again including the relevant vCenter Server screenshot.

vCenter Provisioned

Requested Storage: This is the only metric coming directly from vCloud Director. It adds up the storage for all vCloud Director provisioned virtual machines, catalog templates, media and vCNS (vShield) Edges. It uses allocated storage and also includes (even potential) memory swap for virtual machines (not for catalog VMs). It does not include storage occupied by shadow VMs or intermediate disks in a linked clone tree.
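As a rough illustration of how the Requested Storage number is put together, here is a small Python sketch. The `StoredItem` type and its fields are hypothetical, and the rule that only regular VMs contribute memory swap (and Edges do not) is my simplifying assumption based on the description above. vCloud Director's real accounting may differ in detail.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class StoredItem:
    kind: str            # "vm", "template", "media" or "edge"
    allocated_gb: float  # allocated (not actually used) storage
    memory_gb: float = 0.0


def requested_storage_gb(items: Iterable[StoredItem]) -> float:
    """Sum allocated storage and add potential swap for regular VMs only."""
    total = 0.0
    for item in items:
        total += item.allocated_gb
        if item.kind == "vm":
            # Potential memory swap counts even if it is never used.
            total += item.memory_gb
    return total
```

Shadow VMs and intermediate disks of a linked clone tree are simply left out of the input in this model, because they are not counted.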

Fast Provisioning Enhancement in vCloud Director 5.5

Fast provisioning was introduced with vCloud Director 1.5 and speeds up the cloning process when deploying vApps from a catalog or copying VMs. It utilizes vSphere linked clones, where the base image is not cloned. Instead, a delta disk is created to record changed blocks.

vCloud Director 5.1 added fast provisioning support for VAAI accelerated hardware clones on NFS arrays that offer this feature. I described in detail how the feature works with NetApp flex clones here. Linked clones have a performance impact on reads, both because the chain of delta disks must be traversed to find the right block and because the delta disk is not properly aligned. Hardware clones (depending on the storage array implementation) might have no such performance impact.

One of the main characteristics of linked clones (vSphere native or hardware accelerated) is the need to have the base disk on the same datastore as the delta disks. The vCloud Director placement engine always prefers to place the fast provisioned clone on the datastore with the base image, however there might be situations when that is not possible. The datastore might be full (red threshold), or the target VM should be placed in a different storage profile (renamed to storage policy in 5.5), in a cluster that does not have access to the base image catalog, or in a different vCenter Server. In such cases, if the original VM was a catalog template, a shadow VM was first fully cloned to the target datastore and then a linked clone was created from it.

Catalog

The problem with this approach is that while linked clone creation takes seconds, shadow VM creation can take minutes. This leads to inconsistent quality of service for the end-users. I have always recommended that service providers 'pre-warm' datastores by manually forcing the creation of shadow VMs for their catalog images. The good news is that vCloud Director 5.5 now has a hidden advanced config that will do this for you automatically.

The following entry must be added to the config table in the vCloud Director database: valc.catalog.fastProvisioning=true. This can be accomplished with this SQL statement:

INSERT INTO [vcloud].[dbo].[config] (cat, name, value)
VALUES ('vcloud', 'valc.catalog.fastProvisioning', 'true');

From now on, vCloud Director will check the status of all catalog templates once a day (every 24 hours from its first start) and will proactively create shadow VMs for each combination of storage policy and cluster. It will make sure that in each possible storage policy and cluster combination there is at least one datastore with enough space (below the yellow threshold) that holds the shadow VM, so any provisioning operation will be very fast.
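The daily check can be pictured as a coverage test over all storage policy and cluster combinations. The following Python sketch is only my model of the idea for a single catalog template. The `Datastore` fields are hypothetical, each datastore is tied to one policy and one cluster for simplicity, and "below the yellow threshold" is modeled as free space being above the threshold value.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Datastore:
    name: str
    policy: str
    cluster: str
    free_gb: float
    yellow_threshold_gb: float
    has_shadow_vm: bool  # shadow VM of the template in question


def missing_combinations(datastores: Sequence[Datastore],
                         policies: Sequence[str],
                         clusters: Sequence[str]) -> List[Tuple[str, str]]:
    """List (policy, cluster) pairs without a usable shadow VM."""
    missing = []
    for policy, cluster in product(policies, clusters):
        covered = any(
            ds.policy == policy
            and ds.cluster == cluster
            and ds.has_shadow_vm
            and ds.free_gb > ds.yellow_threshold_gb
            for ds in datastores
        )
        if not covered:
            missing.append((policy, cluster))
    return missing
```

Every pair returned by `missing_combinations` is a place where a shadow VM would have to be created in advance so that the next deployment there is a fast linked clone.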

Shadow VMs created on two additional datastores which are members of two different storage policies

Some additional notes:

  • It is an all-or-nothing feature. It is not possible to enable it selectively for certain templates only.
  • To use this feature, database editing is required. Generally, unless a database edit is documented in an official VMware source (documentation, KB or support request), such an action should be considered unsupported.

Datastore Migrations in vCloud Director – Part 3 of 3: Datastore Clusters

This article is the last in the series.

Part 1: Introduction
Part 2: Individual Datastores
Part 3: Datastore Clusters

In this scenario we are exposing only datastore clusters to vCloud Director. vCloud Director cannot disable individual datastores, but on the other hand we can leverage SDRS Maintenance Mode. We could also break up the cluster, remove the old datastore from it and basically be back in Scenario 1, but there is no fun in that.

To leverage SDRS Maintenance Mode we need to add the new datastores to the Datastore Cluster (beware of the limit of 32 datastores per cluster). Note: it is not possible to mix NFS and block-based datastores. Obviously Storage DRS must be enabled (manual mode is enough).

1. When we put the old datastore into SDRS Maintenance Mode, the following will happen: vCloud Director will not deploy any new VMs or store media on it.

2. Storage DRS will migrate all VMs away from the datastore, either automatically or after waiting for confirmation (if in manual mode). As of vSphere 5.1 this also supports linked clones (vCloud Director Fast Provisioning), meaning that it will not inflate a linked clone. Instead, it will first try to migrate the linked clone to a datastore that already holds a shadow VM (a copy of the base image). If there is none, it will first copy the base image and then create the linked clone. This has a limitation, though. If we migrate VM2 and VM3, which are both linked clones of VM1, at the same time, we will end up with two full clones of VM1, each with one linked clone (VM2 and VM3 respectively). Therefore I would advise limiting the number of concurrent Storage vMotion operations per datastore (see here).

All regular VMs, catalog VMs, expired VMs and vShield Edges will be migrated.

Note: If VAAI accelerated Fast Provisioning is used (hardware snapshots) this will not work!!! As mentioned in Part 1 vSphere 5.1 does not support vMotion of hardware clones.

3. Catalog media files will not be migrated as they are not vSphere objects. Here we need to use the same procedure as mentioned in the Scenario 1, step 3.
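The linked clone limitation from step 2 is easier to see with numbers. The Python sketch below is a toy model I made up for illustration and not how Storage DRS is implemented. It assumes a single target datastore, that migrations run in batches of `concurrency`, and that every clone in a batch whose base image is not yet on the target triggers its own full copy of that base.

```python
from typing import Sequence


def full_base_copies(clone_bases: Sequence[str], concurrency: int) -> int:
    """Count full base image copies made while migrating linked clones.

    clone_bases holds one entry per linked clone: the name of its base VM.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    bases_on_target = set()
    copies = 0
    for start in range(0, len(clone_bases), concurrency):
        batch = clone_bases[start:start + concurrency]
        # Clones migrated at the same time cannot see each other's copy,
        # so each one missing its base on the target creates a full clone.
        copies += sum(1 for base in batch if base not in bases_on_target)
        bases_on_target.update(batch)
    return copies
```

For VM2 and VM3 (both based on VM1) the model gives two full copies when they are migrated concurrently and one when they are migrated one after another, which is the reason for limiting concurrent Storage vMotion operations.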

In my opinion there is a strong case for using Datastore Clusters with vCloud Director to simplify vCloud storage management. The important consideration is at which level we want to do the management: vSphere or vCloud? Depending on the internal politics and separation of duties, this might be the decisive factor.

Datastore Migrations in vCloud Director – Part 2 of 3: Individual Datastores

This article is the second part of the series.

Part 1: Introduction
Part 2: Individual Datastores
Part 3: Datastore Clusters

In this scenario we are presenting to vCloud Director individual datastores, not datastore clusters. New datastores were provisioned with the same Storage Capability and to the same hosts as the old ones which we want to evacuate and subsequently remove both from vCloud Director and vSphere.

1. Disable the old datastores in vCloud Director (System > Manage and Monitor > vSphere Resources > Datastores & Datastore Clusters).

2a. To relocate a VM, its storage profile needs to be updated. Even if the storage profile remains the same, when it is updated vCloud Director will verify that the current VM location complies with the given storage profile. If the VM is on a disabled datastore, vCloud Director will move it to a different datastore of the same storage profile. This can be accomplished either from the GUI or with the vCloud API.

a. GUI – deployed VM: Open VM Properties, scroll down, change the Storage Profile to another one and then back again. This will activate the OK button. Obviously we need at least two Storage Profiles assigned to the Org VDC.
VM Storage Profile

b. GUI – VM in catalog: The process is almost the same. Open the VM properties in the catalog and assign the default storage profile and click OK button.
Catalog VM Storage Profile

c. GUI – Expired VM: For expired regular or catalog VM it is not possible to update storage profile from within the GUI.

d. API: The process for regular, catalog or expired VMs is identical. Retrieve the Vm element with a GET request and then update the Vm with a PUT request with an identical body. http://pubs.vmware.com/vcd-51/topic/com.vmware.vcloud.api.doc_51/GUID-E56EEEEA-EDED-4EBC-8095-08AD2668F0D2.html
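The GET followed by an identical PUT can be scripted. Below is a minimal Python sketch using only the standard library. The function names are mine, the entity URL and session token are placeholders you must supply, and the header values (API version 5.1 and the Vm content type) are assumptions that you should verify against the API documentation linked above. The transport is passed in as a parameter so that the logic can be tried without a live vCloud Director.

```python
import urllib.request
from typing import Callable, Dict, Optional

Send = Callable[[str, str, Dict[str, str], Optional[bytes]], bytes]


def http_send(method: str, url: str, headers: Dict[str, str],
              body: Optional[bytes] = None) -> bytes:
    """Default transport: perform one HTTP request and return the body."""
    request = urllib.request.Request(url, data=body, headers=headers,
                                     method=method)
    with urllib.request.urlopen(request) as response:
        return response.read()


def resubmit_entity(href: str, token: str, content_type: str,
                    send: Send = http_send) -> bytes:
    """GET an entity and PUT the unchanged body back to the same URL."""
    headers = {
        "Accept": "application/*+xml;version=5.1",  # assumed API version
        "x-vcloud-authorization": token,            # existing session token
    }
    body = send("GET", href, headers, None)
    put_headers = {**headers, "Content-Type": content_type}
    # The response to the PUT is expected to describe the resulting task.
    return send("PUT", href, put_headers, body)
```

For a VM the content type would be something like `application/vnd.vmware.vcloud.vm+xml`. The call returns whatever the PUT returns, which should be tracked until the relocation finishes.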

2b. As an alternative to the relocation initiated from vCloud Director, we could accomplish the same with Storage vMotion at the vSphere layer. However, there are some caveats:

  • Make sure that the new datastores have the correct Storage Capability and are assigned to the same hosts. There is no vCloud Director placement engine verification!
  • As stated in Part 1, this is not recommended for Fast Provisioned VMs as they will be inflated to full clones
  • As stated in Part 1, this is not possible for VAAI Fast Provisioned VMs as vSphere cannot Storage vMotion them
  • You will need to manually load balance the VMs based on their size and available disk space of the new datastores

3. Other objects that can be stored on the datastore we need to evacuate are vCloud Director catalog media (ISO and FLP files). These files are stored in a special directory with the following name structure: <VCD Instance Name>/media/<org ID>/<org VDC ID>/media-<media ID>.<ISO|FLP>.

a. GUI: In the catalog find the media object, right click and select Move to catalog. Pick the same VDC it currently resides in and select storage profile: Same As Source. Note: you will see the vCenter Copy task right away. The deletion of the source media does not happen immediately upon completion of the file copy, but can take a few minutes as it is an asynchronous cleanup operation.
Catalog Media

b. API: The process is identical to the Storage Profile update of a VM. Retrieve the Media element with GET request and update it with PUT request with identical body.
There is no other way to move media files. Manual file transfer will make the media unavailable to vCloud Director as it references the target datastore in its database.
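When checking which media files still sit on the old datastore, it helps to know the expected path of each one. This small Python helper just assembles the directory structure described in step 3. The function name and arguments are mine, and all IDs are whatever your vCloud Director instance uses. It is only meant for locating and verifying files, not for moving them by hand.

```python
def media_path(instance_name: str, org_id: str, org_vdc_id: str,
               media_id: str, extension: str) -> str:
    """Build the datastore-relative path of a catalog media file."""
    if extension.lower() not in ("iso", "flp"):
        raise ValueError("media files are expected to be ISO or FLP")
    return (f"{instance_name}/media/{org_id}/{org_vdc_id}/"
            f"media-{media_id}.{extension}")
```

For example, `media_path("vcd01", "org1", "vdc1", "m1", "iso")` yields `vcd01/media/org1/vdc1/media-m1.iso`.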

4. The last vCloud Director objects that might be stored on the datastore to be evacuated are vShield Edges. The current (VCD 5.1.2) behavior is that an Edge can be deployed to any datastore of the Provider VDC in which the Org VDC resides. This is not very intuitive. On top of that, it can even end up on a disabled datastore, which is a bug and is going to be fixed in the next bug fix release. This means that a redeploy from vCloud Director does not help. We need to either Storage vMotion the Edge from vSphere to the new datastore or redeploy it from vShield Manager while specifying the datastore placement for the Edge appliance.

Edge

Part 3 of the series will discuss whether the use of Datastore Clusters and Storage DRS might simplify the process.

Go to part 3.