New Networking Features in VMware Cloud Director 10.4.1

The recently released VMware Cloud Director 10.4.1 brings quite a lot of new features. In this article I want to focus on those related to networking.

External Networks on Org VDC Gateway (NSX-T Tier-1 GW)

External networks that are NSX-T Segment backed (VLAN or overlay) can now be connected directly to an Org VDC Edge Gateway and not routed through the Tier-0 or VRF GW the Org VDC GW is connected to. This connection is done via a service interface (aka Centralized Service Port – CSP) on the service node of the Tier-1 GW that is backing the Org VDC Edge GW. The Org VDC Edge GW still needs a parent Tier-0/VRF (now called Provider Gateway), although it can be disconnected from it.

What are some of the use cases for such direct connection of the external network?

  • routed connectivity via dedicated VLAN to tenant’s co-location physical servers
  • transit connectivity across multiple Org VDC Edge Gateways to route between different Org VDCs
  • service networks
  • MPLS connectivity for direct connect while internet is accessible via the shared provider gateway

The connection is configured by the system administrator. It can use only a single subnet, from which multiple IPs can be allocated to the Org VDC Edge GW. One is configured directly on the interface, while the others are just routed through when used for NAT or other services.

If the external network is backed by a VLAN segment, it can be connected to only one Org VDC Edge GW. This is due to an NSX-T limitation: a particular edge node can use a VLAN segment for only a single logical router instantiated on that edge node – and as VCD does not give you the ability to select particular edge nodes for instantiation of Org VDC Edge Tier-1 GWs, it simply will not allow you to connect such an external network to multiple Edge GWs. If you are sharing the Edge Cluster also with Tier-0 GWs, make sure that they do not use the same VLAN for their uplinks either. Typically you would use VLAN for the co-location or MPLS direct connect use case.

An overlay (Geneve) backed external network has no such limitation, which makes it a great fit for connectivity across multiple GWs for transit or service networks.

NSX-T Tier-1 GW does not provide any dynamic routing capabilities, so routing to such a network can be configured only via static routes. Also note that a Tier-1 GW has a default route (0.0.0.0/0) always pointing towards its parent Tier-0/VRF GW. Therefore, if you want to set the default route to the segment backed external network, you need to use two more specific routes. For example:

0.0.0.0/1 next hop <MPLS router IP> scope <external network>
128.0.0.0/1 next hop <MPLS router IP> scope <external network>

Slightly related to this feature is the ability to scope gateway firewall rules to a particular interface. This is done via the Applied To field:

You can select any CSP interface on the Tier-1 GW (used by the external network or non-distributed Org VDC networks) or nothing, which means the rule will be applied to the uplink to Tier-0/VRF as well as to any CSP interface.

IP Spaces

IP Spaces is a big feature that will be delivered across multiple releases, where 10.4.1 is the first one. In short, IP Spaces are a new VCD object that allows managing individual IPs (floating IPs) and subnets (prefixes) independently across multiple networks/gateways. The main goal is to simplify the management of public IPs or prefixes for the provider as well as private subnets for the tenants; additional benefits include being able to route on a shared Provider Gateway (Tier-0/VRF), use the same dedicated parent Tier-0/VRF for multiple Org VDC Edge GWs, or re-connect Org VDC Edge GWs to a different parent Provider Gateway.

Use cases for IP Spaces:

  • Self-service for request of IPs/prefixes
  • routed public subnet for Org VDC network on shared Provider Gateway (DMZ use case, where VMs get public IPs with no NATing performed)
  • IPv6 routed networks on a shared Provider Gateway
  • tenant dedicated Provider Gateway used by multiple tenant Org VDC Edge Gateways
  • simplified management of public IP addresses across multiple Provider Gateways (shared or dedicated) all connected to the same network (internet)

In the legacy way the system administrator would create subnets at the external network / provider gateway level and then add static IP pools from those subnets for VCD to use. IPs from those IP pools would be allocated to tenant Org VDC Edge Gateways. The IP Spaces mechanism instead creates standalone IP Spaces, which are collections of IP ranges (e.g. 192.168.111.128-192.168.111.255) and blocks of IP prefixes (e.g. two /28 blocks – 192.168.111.0/28 and 192.168.111.16/28 – and one /27 block – 192.168.111.32/27).
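To make the example above concrete, here is a minimal sketch of what such an IP Space definition could look like when created programmatically via the VCD CloudAPI. Treat the /cloudapi/1.0.0/ipSpaces path, the API version and all field names as assumptions illustrating the concept rather than a verified schema; vcd.example.com and the token are placeholders.

import requests

VCD = "https://vcd.example.com"   # hypothetical VCD endpoint
TOKEN = "..."                     # system administrator bearer token (placeholder)

# Assumed payload shape mirroring the example above: one floating IP range plus
# two /28 prefix blocks and one /27 prefix block carved out of 192.168.111.0/24.
ip_space = {
    "name": "Internet-IP-Space",
    "type": "PUBLIC",
    "ipSpaceInternalScope": ["192.168.111.0/24"],
    "ipSpaceRanges": {
        "ipRanges": [
            {"startIpAddress": "192.168.111.128", "endIpAddress": "192.168.111.255"}
        ]
    },
    "ipSpacePrefixes": [
        {"ipPrefixSequence": [
            {"startingPrefixIpAddress": "192.168.111.0", "prefixLength": 28, "totalPrefixCount": 2},
            {"startingPrefixIpAddress": "192.168.111.32", "prefixLength": 27, "totalPrefixCount": 1},
        ]}
    ],
}

resp = requests.post(
    f"{VCD}/cloudapi/1.0.0/ipSpaces",
    json=ip_space,
    headers={"Authorization": f"Bearer {TOKEN}", "Accept": "application/json;version=37.1"},
    verify=False,  # lab only
)
resp.raise_for_status()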

A particular IP Space is then assigned to a Provider Gateway (NSX-T Tier-0 or VRF GW) as IP Space Uplink:

An IP Space can be used by multiple Provider Gateways.

The tenant Org VDC Edge Gateway connected to such an IP Space enabled provider gateway can then request floating IPs (from the IP ranges) or assign an IP block to a routable Org VDC network, which results in route advertisement for such a network.

If in the above case such a network should also be advertised to the internet, the parent Tier-0/VRF needs to have route advertisement manually configured (see the IP-SPACES-ROUTING policy below), as VCD will not do so (contrary to the NAT/LB/VPN case).

Note: The IP / IP Block assignments are done from the tenant UI. The tenant needs to have the new rights added to see those features in the UI.

The number of IPs or prefixes the tenant can request is managed via a quota system at the IP Space level. Note that the system administrator can always exceed the quota when making the request on behalf of the tenant.

A provider gateway must be designated to be used for IP Spaces, so you cannot combine the legacy and IP Spaces methods of managing IPs on the same provider gateway. There is currently no way of converting a legacy provider gateway to an IP Spaces enabled one, but such functionality is planned for the future.

Tenants can create their own private IP Spaces, which are used for their Org VDC Edge GWs (and routed Org VDC networks) that are implicitly enabled for IP Spaces. This simplifies the creation of new Org VDC networks, where a unique prefix is automatically requested from the private IP Space. The uniqueness is important as it allows multiple Edge GWs to share the same parent provider gateway.

Transparent Load Balancing

VMware Cloud Director 10.4.1 adds support for Avi (NSX ALB) Transparent Load Balancing. This allows pool members to see the source IP of the client, which might be needed for certain applications.

The actual implementation in NSX-T and Avi is fairly complicated and described in detail here: https://avinetworks.com/docs/latest/preserve-client-ip-nsxt-overlay/. This is due to the Avi service engine data path and the need to preserve the client source IP while routing the pool member return traffic back to the service engine node.

VCD hides most of the complexity, so to actually enable the service only three steps need to be taken:

  1. Transparent mode must be enabled at the Org VDC GW (Tier-1) level (see the sketch after this list).
  2. The pool members must be created via an NSX-T Security Group (IP Set).
  3. Preserve client IP must be configured on the virtual service.
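For step 1, here is a minimal sketch of how the toggle could be automated against the VCD CloudAPI; the transparentModeEnabled property name, the API version and the endpoint shape are assumptions (the same toggle is exposed in the Org VDC Edge GW load balancer settings in the UI):

import requests

VCD = "https://vcd.example.com"            # hypothetical VCD endpoint
TOKEN = "..."                              # bearer token (placeholder)
GW_ID = "urn:vcloud:gateway:..."           # Org VDC Edge Gateway URN (placeholder)

headers = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/json;version=37.1"}
lb_url = f"{VCD}/cloudapi/1.0.0/edgeGateways/{GW_ID}/loadBalancer"

# Read the current load balancer configuration, flip the (assumed) transparent
# mode flag, and write the configuration back.
config = requests.get(lb_url, headers=headers, verify=False).json()
config["transparentModeEnabled"] = True    # assumed property name
requests.put(lb_url, json=config, headers=headers, verify=False).raise_for_status()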

Due to the data path implementation there are quite a few restrictions when using this feature:

  • Avi version must be at least 21.1.4
  • the service engine group must be in A/S mode (legacy HA)
  • the LB service engine subnet must have at least 8 IPs available (/28 subnet – 2 for engines, 5 floating IPs and 1 GW) – see Floating IP explanation below
  • only in-line topology is supported. It means a client that is accessing the LB VIP and not going via T0 > T1 will not be able to access the VS
  • only IPv4 is supported
  • the pool members cannot be accessed via DNAT on the same port as the VS port
  • transparent LB should not be combined with non-transparent LB on the same GW as it will cause the health monitor to fail for the non-transparent LB
  • pool member NSX-T security group should not be reused for other VS on the same port

Floating IPs

When transparent load balancing is enabled, the return traffic from the pool member cannot be sent directly to the client (source IP) but must go back to the service engine, otherwise asymmetric routing happens and the traffic flows will be broken. This is implemented in NSX-T via an N/S Service Insertion policy where the pool member traffic (the members are defined via a security group) is redirected to the active engine via a floating IP instead of going to its default GW. Floating IPs are from the service engine network subnet but are not from the DHCP range which assigns service engine nodes their primary IP. VCD will dedicate 5 IPs from the LB service network range for floating IPs. Note that multiple transparent VIPs on the same SEG/service network will share a floating IP.

Non-elastic Allocation Pool VDC with Flex?

This came up on the provider Slack and I want to preserve it for the future.

Question: Can the provider create a Flex type Org VDC where the tenant gets an allocation of certain CPU capacity (in GHz) and can deploy as many VMs as they wish while being limited by the CPU allocation?

Example: the VDC has a 10 GHz allocation and the tenant is able to deploy 3 VMs, each with 2 vCPUs at 2 GHz (12 GHz in total), but altogether they will be able to consume only up to 10 GHz of physical vSphere resources.

This is basically the old legacy non-elastic allocation pool type VDC where the VDC is backed by a single vCenter Server Resource Pool with the limit equal to the CPU allocation. As of VCD 5.1.2 there is the ability to enable elasticity of such a VDC, but it behaves differently and is done at the whole VCD instance level. So can you create a similarly behaving VDC with the new Flex type (introduced in VCD 10.0)?

Answer: Yes. Create a non-elastic Flex VDC with the CPU allocation and enable the CPU limit. Then create a VM sizing policy with vCPU speed set to 0 GHz and make it the default for the Org VDC.
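A minimal sketch of the sizing policy part via the CloudAPI; the /cloudapi/1.0.0/vdcComputePolicies path and the cpuSpeed field (in MHz) are assumptions, and the same policy can just as well be created in the provider UI:

import requests

VCD = "https://vcd.example.com"   # hypothetical VCD endpoint
TOKEN = "..."                     # system administrator bearer token (placeholder)

# VM sizing policy with vCPU speed of 0 GHz (0 MHz): per the answer above, VMs deployed
# with this policy are not limited per vCPU and only the Org VDC CPU allocation
# (10 GHz in the example) caps aggregate consumption.
sizing_policy = {
    "name": "zero-ghz-vcpu",
    "description": "vCPU speed 0 GHz for non-elastic Flex VDC",
    "cpuSpeed": 0,                # assumed field name, value in MHz
}

resp = requests.post(
    f"{VCD}/cloudapi/1.0.0/vdcComputePolicies",
    json=sizing_policy,
    headers={"Authorization": f"Bearer {TOKEN}", "Accept": "application/json;version=36.0"},
    verify=False,  # lab only
)
resp.raise_for_status()

Once created, publish the policy to the Org VDC and mark it as the default so that all newly deployed VMs pick it up.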

Quotas and Quota Policies in VMware Cloud Director

In this article I want to highlight a new neat feature in VMware Cloud Director 10.2 – the ability to assign quotas and create quota policies.

This can be done at multiple levels by both the service provider and the organization administrator.

The following resources can currently be managed via quotas:

  • Memory
  • CPU
  • Storage
  • All VMs (includes vApp template VMs)
  • Running VMs
  • TKG Clusters

The list might expand in the future; you can easily find which quota capabilities are available via the API.

The service provider can create quotas at the organization level in the Organization > Configure > Quotas section:

The org administrator can assign quotas to individual users or groups. This is done from the Administration > Access Control > User or Group > Set Quota section.
The assignment of a quota at the group level is inherited by each group user (so it is not enforced at the aggregate group level) but can be overridden at the individual user quota level. Also, if a user is a member of multiple groups, the least restrictive combination of the participating group quotas will be applied to her.

At the same place the user or org admin can see the actual user’s usage compared to the quota.

Org admins can use quotas to easily enforce good behavior of org users (not running too many VMs concurrently, not consuming too much storage, etc.), while system admins can set safety quotas at the org level when using Org VDC allocation models with unlimited consumption and pay-per-use billing.

One hidden feature, available only via the API, is the ability to create more generic quota policies that combine (pool) multiple quota elements and assign them to organizations, groups or individual users. Think of quota policies such as Power User vs. Regular User, where the former can power on more VMs.

When a specific quota is assigned at the user/group/org object, a quota policy is created in the backend anyway, but it is specific just to that one object, while an edit of the Power User quota policy would be applied to every user that has such a quota policy.
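A rough sketch of how such a pooled quota policy might be created via the API; the /cloudapi/1.0.0/quotaPolicies path and every payload field name below are assumptions based only on the behavior described above, not a verified schema:

import requests

VCD = "https://vcd.example.com"   # hypothetical VCD endpoint
TOKEN = "..."                     # bearer token (placeholder)

# A reusable "Power User" quota policy pooling several quota elements;
# the element names and limits are illustrative only.
quota_policy = {
    "name": "Power User",
    "quotaPoolDefinitions": [
        {"quotaResourceName": "Running VMs", "quotaResourceLimit": 20},
        {"quotaResourceName": "All VMs", "quotaResourceLimit": 50},
        {"quotaResourceName": "Memory", "quotaResourceLimit": 256, "resourceUnit": "GB"},
    ],
}

resp = requests.post(
    f"{VCD}/cloudapi/1.0.0/quotaPolicies",
    json=quota_policy,
    headers={"Authorization": f"Bearer {TOKEN}", "Accept": "application/json;version=35.0"},
    verify=False,  # lab only
)
resp.raise_for_status()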

The feature comes with new specific rights, so it can be easily enabled or disabled:

  • Organization: Manage Quotas of Organization
  • Organization: Edit Quotas Policy
  • General: View Quota Policy Capabilities
  • General: Manage Quota Policy
  • General: View Quota Policy

VMware Cloud Director – Storage IOPS Management – Part II

This is a follow up to the article I posted about a year ago that describes new IOPS management functionality in VMware Cloud Director (VCD) 10.2.

Storage IOPS is, next to compute, networking and storage capacity, a limited resource service providers want to manage in order to fairly share underlying physical resources in a multitenant environment.

As was described in the original article, VCD already supported storage IOPS management; however, the feature was quite hidden and available only via the API. The recent release of VMware Cloud Director not only fully exposes the functionality in the UI but also adds some new capabilities. Let's dive into it.

There are now two main mechanisms for managing IOPS.

vCenter Server managed IOPS

This mechanism relies on setting IOPS limits at the storage policy level directly in vCenter Server. That is possible with host based and with vSAN based storage policies. This mechanism is quite simple – when a VM disk is provisioned to such an IOPS limited storage policy, it will inherit the IOPS limit – a constant number per policy. You will not be able to set proportional IOPS based on disk capacity.

vSAN Storage Policy with IOPS Limit

Host Based non vSAN Storage Policy with IOPS Limit

I would recommend using this mechanism only if you want to avoid noisy neighbors. The concept is not new; VCD has been able to use such vSAN policies for some time, and host based policies were already supported in VCD 10.1. The only difference is that now in 10.2 the tenant will see the limit/reservation set at the VM disk level but will not be able to change it.

Non-editable Disk IOPS

VCD Managed IOPS

This is a much more sophisticated mechanism where you can really manage IOPS as a pool of available capacity that you slice and allocate to tenant Org VDCs. This is the mechanism that was until now available only via the API.

You will start by tagging your datastores with their IOPS capacity – that has not changed and still must be done from within VC via custom properties.

At the Provider VDC level you can then create IOPS managed storage policies and define their service level in terms of disk IOPS defaults, maximums or IOPS allocation based on disk size (0 means unlimited).

This storage policy configuration can be inherited or overridden at the Org VDC level. This is a big improvement compared to the old approach where you had to create such storage policies always at the Org VDC level.

Another new thing is that you can disable the IOPS placement mechanism for such a storage policy. This is useful in case you want to use Datastore Clusters. VCD will no longer try to place each virtual disk based on a particular datastore's available IOPS. The placement decision is instead done by vCenter Server – you should therefore enable Storage DRS with I/O balancing automation. In such a case there is no need to tag individual datastores in VC with their IOPS capacity.

Some of the old caveats still apply:

  • Disk IOPS can be assigned only to regular VMs or named (independent) disks, not to VM templates.
  • The disk IOPS will be always allocated against the Org VDC storage profile even if the VM is powered-off. This means the cloud provider can oversubscribe IOPS at the provider VDC storage profile level.
  • System administrator can override IOPS limits when deploying/editing tenant VMs in the system context.

vCloud Director – Storage IOPS Management

Update 2020/10/22 Make sure to read part II for updates.

It is a little known fact that besides compute (capacity and performance), storage capacity and external network throughput rate, vCloud Director can also manage storage IOPS (input / output or read and write operations per second) performance at provisioned virtual disk granularity. This post summarizes the current capabilities.

Cloud providers usually offer different tiers of storage that is available to tenants for consumption. IOPS management helps them differentiate these tiers and enforce the virtual disk performance based on the IOPS metric. This eliminates the noisy neighbor problem, but also makes both consumption and capacity management more predictable.

vCloud Director relies on vSphere to control the maximum IOPS a VM has access to on a particular storage policy through the Storage I/O Control functionality, which is supported on VMFS (block) and NFS datastores (not vSAN). In vSphere this is defined at the virtual hard disk level, but is enforced at the VM level. vSphere however does not manage the available IOPS capacity of a datastore the same way it can do with compute. That's where vCloud Director comes in.

The cloud provider first needs to create a new vSphere custom field (iopsCapacity) and use it to define the IOPS capacity of each vCloud Director managed datastore. This is done via the vCenter Managed Object Browser (MOB) UI and is described in KB 2148300.

Definition of Custom Field iopsCapacity in vCenter MOB UI

Configuring datastore IOPS capacity in vCenter MOB UI
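If you prefer to script this instead of clicking through the MOB, a minimal pyVmomi sketch of the same two steps (creating the iopsCapacity custom field for datastores and tagging one datastore with its capacity) could look like the following; the vCenter host, credentials, datastore name and the 50000 IOPS value are placeholders:

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Connect to vCenter (lab only: certificate validation disabled).
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.example.com", user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
content = si.RetrieveContent()
cfm = content.customFieldsManager

# Create the iopsCapacity custom field for datastores if it does not exist yet.
field = next((f for f in cfm.field if f.name == "iopsCapacity"), None)
if field is None:
    field = cfm.AddCustomFieldDef(name="iopsCapacity", moType=vim.Datastore)

# Find the datastore and tag it with its IOPS capacity.
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.Datastore], True)
ds = next(d for d in view.view if d.name == "Datastore01")   # placeholder datastore name
cfm.SetField(entity=ds, key=field.key, value="50000")        # placeholder IOPS capacity

view.DestroyView()
Disconnect(si)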

vCloud Director consumes vSphere datastores through storage policies. In my case I have a tag based storage policy named 2_IOPS/GB, and as the name suggests, the intention is to provide two provisioned IOPS per each GB of capacity. A 40 GB hard disk thus should provide 80 IOPS.

Once the storage policy is synced with vCloud Director, we can add it to a Provider VDC and consume it in its Org VDCs. vCloud Director will keep track of the storage policy IOPS capacity and how much has been allocated. That information is available via the vCloud API when retrieving the Provider VDC storage profile representation:
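A minimal sketch of that retrieval with the vCloud API; the profile id, API version and the use of a bearer token are placeholders/assumptions (older versions use the x-vcloud-authorization header instead):

import requests

VCD = "https://vcd.example.com"   # hypothetical VCD endpoint
TOKEN = "..."                     # system administrator token (placeholder)
PROFILE_ID = "..."                # Provider VDC storage profile id (placeholder)

resp = requests.get(
    f"{VCD}/api/admin/pvdcStorageProfile/{PROFILE_ID}",
    headers={"Authorization": f"Bearer {TOKEN}", "Accept": "application/*+xml;version=33.0"},
    verify=False,  # lab only
)
resp.raise_for_status()
print(resp.text)   # the XML representation contains the IopsCapacity element among others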

Note that the pvdcStorageProfile IopsCapacity is the total IopsCapacity of all datastores belonging to the storage policy, as tagged in vCenter.

The actual definition of the storage policy parameters is done via a PUT call, again with the API, on the Org VDC storage profile representation at the Org VDC level. The cloud provider supplies an IopsSetting element that consists of the following parameters:

  • Enabled: True if this storage profile is IOPS-based placement enabled.
  • DiskIopsMax: the max IOPS that can be given to any disk (value 0 means unlimited)
  • DiskIopsDefault: the default IOPS given to any/all disks associated with this VdcStorageProfile if user doesn’t specify one
  • StorageProfileIopsLimit: the max IOPS that can be used by this VdcStorageProfile. In other words: maximum IOPS that can be assigned across all disks associated with this VdcStorageProfile (use 0 for unlimited).
  • DiskIopsPerGbMax: similar to DiskIopsMax but instead of a specific value, it’s the ratio of size (in GB) to IOPS. if set to 1, then a 1 GB disk is limited to 1 IOPS, if set to 10, then a 1 GB disk is limited to 10 IOPS, etc.

When a user deploys a VM utilizing an IOPS enabled storage policy, she can set a specific requested IOPS value for each disk through the API (0 is treated as unlimited), or set nothing and vCloud Director will set the default limit based on DiskIopsDefault or DiskIopsPerGbMax x DiskSizeInGb, whichever is lower. The requested value must always be smaller than DiskIopsMax and also smaller than DiskIopsPerGbMax x DiskSizeInGb. The DiskIopsMax and DiskIopsDefault values must also be lower than StorageProfileIopsLimit.

In my case I wanted to always set the IOPS limit to 2 IOPS per GB, so I configured the Org VDC storage policy in the following way:
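As a sketch of that configuration as an API call: the IopsSetting fragment below uses only the parameters listed above, and the concrete values chosen for the 2 IOPS/GB goal (unlimited per-disk max/default and no aggregate cap) are my assumption; in practice you GET the Org VDC storage profile representation first and PUT the whole modified document back:

import requests

VCD = "https://vcd.example.com"   # hypothetical VCD endpoint
TOKEN = "..."                     # system administrator token (placeholder)
PROFILE_ID = "..."                # Org VDC storage profile id (placeholder)

HEADERS = {"Authorization": f"Bearer {TOKEN}", "Accept": "application/*+xml;version=33.0"}
url = f"{VCD}/api/admin/vdcStorageProfile/{PROFILE_ID}"

# 1. Retrieve the current Org VDC storage profile representation.
profile_xml = requests.get(url, headers=HEADERS, verify=False).text

# 2. IopsSetting element expressing 2 IOPS per provisioned GB (other knobs unlimited).
iops_setting = (
    "<IopsSetting>"
    "<Enabled>true</Enabled>"
    "<DiskIopsMax>0</DiskIopsMax>"
    "<DiskIopsDefault>0</DiskIopsDefault>"
    "<StorageProfileIopsLimit>0</StorageProfileIopsLimit>"
    "<DiskIopsPerGbMax>2</DiskIopsPerGbMax>"
    "</IopsSetting>"
)

# 3. Insert/replace the IopsSetting element in profile_xml (not shown here) and PUT
#    the full document back; the Content-Type media type is an assumption.
body = profile_xml  # after merging in iops_setting
requests.put(
    url,
    data=body,
    headers={**HEADERS, "Content-Type": "application/vnd.vmware.admin.vdcStorageProfile+xml"},
    verify=False,
).raise_for_status()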

And this is the provisioned VM as seen in the vCloud Director UI

and in vCenter UI.

Additional observations:

  • Datastore clusters cannot be used together with IOPS storage policies. The reason is that when datastore clusters are used, it is vCenter that is responsible for placing the disk on a specific datastore, and as mentioned above, vCenter does not track IOPS capacity at the datastore level, whereas the vCloud Director placement engine takes into account both the datastore capacity (GB) and IOPS capacity when finding a suitable datastore for a disk.
  • vSAN is not supported as it does not support SIOC. vSAN advanced storage policies allow specifying IOPS limits per object and can be used instead.
  • Disk IOPS can be assigned only to regular VMs, not to VM templates.
  • The disk IOPS will be always allocated against the Org VDC storage profile even if the VM is powered-off. This means the cloud provider can oversubscribe IOPS at the provider VDC storage profile level.
  • System administrator can override IOPS limits when deploying/editing tenant VMs in the system context.
  • Some vCloud Director versions have a bug where the UI sends 0 (unlimited) IOPS for a disk instead of null (undefined), which might result in a provisioning error if it is not compliant with the policy limit.