Provider Networking in VMware Cloud Director

This is going to be a bit longer than usual and more of a summary / design option type blog post where I want to discuss provider networking in VMware Cloud Director (VCD). By provider networking I mean the part that must be set up by the service provider and that is then consumed by tenants through their Org VDC networking and Org VDC Edge Gateways.

With the introduction of NSX-T we also need to dive into the differences between NSX-V and NSX-T integration in VCD.

Note: The article is applicable to the VMware Cloud Director 10.2 release. Each VCD release adds new networking-related functionality.

Provider Virtual Datacenters

Provider Virtual Datacenter (PVDC) is the main object that provides compute, networking and storage resources for tenant Organization Virtual Datacenters (Org VDCs). When a PVDC is created, it is backed by vSphere clusters that should be prepared for NSX-V or NSX-T. Also during PVDC creation the service provider must select which Network Pool is going to be used – VXLAN backed (NSX-V) or Geneve backed (NSX-T). A PVDC can thus be backed by either NSX-V or NSX-T (not both at the same time, and not neither), and the backing cannot be changed after the fact.

Network Pool

Speaking of Network Pools – they are used by tenants to create on-demand routed or isolated networks. Network Pools are independent from PVDCs and can be shared across multiple PVDCs (of the same backing type). There is an option to automatically create a VXLAN network pool during PVDC creation, but I would recommend against using it as you lose the ability to manage the transport zone backing the pool on your own. A VLAN backed network pool can still be created but can be used only in a PVDC backed by NSX-V (the same goes for the very legacy port group backed network pool, now available only via the API). Individual Org VDCs can (optionally) override the Network Pool inherited from their parent PVDC.

External Networks

Deploying virtual machines without the ability to connect to them over the network is not that useful. External networks are VCD objects that Org VDC Edge Gateways connect to in order to reach the outside world – the internet, dedicated direct connections or the provider's service area. An external network has one or more associated subnets and IP pools that VCD manages and uses to allocate external IP addresses to connected Org VDC Edge Gateways.

There is a major difference in how external networks are created for NSX-V backed PVDCs and for NSX-T backed ones.

Port Group Backed External Network

As the name suggests, these networks are backed by an existing vCenter port group (or multiple port groups) that must be created upfront and is usually VLAN backed (but could be a VXLAN port group as well). These external networks are (currently) supported only in NSX-V backed PVDCs. An Org VDC Edge Gateway connected to this network is represented by an NSX-V Edge Service Gateway (ESG) with an uplink in this port group. The uplinks are assigned IP address(es) from the allocated external IPs.

A directly connected Org VDC network attached to the external network can also be created (only by the provider); VMs connected to such a network have their uplink in the port group.

Tier-0 Router Backed External Network

These networks are backed by an existing NSX-T Tier-0 Gateway or Tier-0 VRF (note that if you import a Tier-0 VRF into VCD you can no longer import its parent Tier-0, and vice versa). The Tier-0/VRF must be created upfront by the provider with the correct uplinks and routing configuration.

Only Org VDC Edge Gateways from an NSX-T backed PVDC can be connected to such an external network, and they are backed by a Tier-1 Gateway. The Tier-1 to Tier-0/VRF transit network is autoplumbed by NSX-T from the 100.64.0.0/16 subnet. The allocated external network IPs are not explicitly assigned to any Tier-1 interface. Instead, when a service (NAT, VPN, load balancer) on the Org VDC Edge Gateway starts using an assigned external address, that address is advertised by the Tier-1 GW to the linked Tier-0 GW.

There are two main design options for the Tier-0/VRF.

The recommended option is to configure BGP on the Tier-0/VRF uplinks with the upstream physical routers. The uplinks are just redundant point-to-point transits. IPs assigned from any external network subnet will be automatically advertised (when used) via BGP upstream. When the provider runs out of public IPs, an additional subnet is simply assigned to the external network. This makes the design very flexible, scalable and relatively simple.

Tier-0/VRF with BGP
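
For illustration, the upstream BGP peering lives on the NSX-T side and is configured before the Tier-0/VRF is imported into VCD as an external network. Below is a minimal, hedged sketch using the NSX-T Policy API; the Tier-0 ID (provider-t0), locale-services ID (default), neighbor address and ASNs are placeholders for your environment, so verify the exact paths against the NSX-T API reference.

(IDs, addresses and ASNs below are placeholders, not values from this article)
PATCH https://{{nsx-manager}}/policy/api/v1/infra/tier-0s/provider-t0/locale-services/default/bgp
{
    "enabled": true,
    "local_as_num": "65001"
}

PATCH https://{{nsx-manager}}/policy/api/v1/infra/tier-0s/provider-t0/locale-services/default/bgp/neighbors/upstream-router-1
{
    "neighbor_address": "192.0.2.1",
    "remote_as_num": "65000",
    "source_addresses": ["192.0.2.2"]
}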

An alternative is to use a design similar to the NSX-V port group approach, where the Tier-0 uplinks are directly connected to the external subnet port group. This can be useful when transitioning from NSX-V to NSX-T, where there is a need to retain routability between NSX-V ESGs and NSX-T Tier-1 GWs on the same external network.

The picture below shows that the Tier-0/VRF has uplinks directly connected to the external network and a static route towards the internet. The Tier-0 will proxy ARP requests for external IPs that are allocated to and used by connected Tier-1 GWs.

Tier-0 with Proxy ARP

The disadvantage of this option is that you waste public IP addresses on the Tier-0 uplinks and upstream router interfaces for each subnet you assign.

Note: Proxy ARP is supported only if the Tier-0/VRF is in Active/Standby mode.

Tenant Dedicated External Network

If the tenant requires a direct link via MPLS or a similar technology, this is accomplished by creating a tenant-dedicated external network. With an NSX-V backed Org VDC this is represented by a dedicated VLAN backed port group; with an NSX-T backed Org VDC it would be a dedicated Tier-0/VRF. Both provide connectivity to the MPLS router. With NSX-V the ESG would run BGP; with NSX-T the BGP has to be configured on the Tier-0. In VCD the NSX-T backed Org VDC Edge Gateway can be explicitly enabled in dedicated mode, which gives the tenant (and also the provider) the ability to configure Tier-0 BGP.

There are separate rights for BGP neighbor configuration and route advertisement, so the provider can keep the BGP neighbor configuration as a provider-managed setting.

Note that only one Org VDC Edge GW can be connected in the explicit dedicated mode. In case the tenant requires more Org VDC Edge GWs connected to the same (dedicated) Tier-0/VRF, the provider will not enable the dedicated mode and will instead manage BGP directly in NSX-T (as a managed service).
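
As an illustration of the tenant-facing side, when the Tier-0/VRF is dedicated the tenant can control which Org VDC network subnets are advertised upstream from the Edge Gateway. A hedged sketch of the kind of call involved follows; the endpoint mirrors the Edge Gateway OpenAPI pattern used later in this post, but treat the exact path and field names as assumptions to be verified against the VCD API reference for your version.

(endpoint and field names are assumptions; verify against the VCD OpenAPI reference)
PUT https://{{host}}/cloudapi/1.0.0/edgeGateways/{{gatewayUrn}}/routing/advertisement
{
    "enable": true,
    "subnetsToAdvertise": [
        "192.168.10.0/24"
    ]
}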

An often used scenario is when the provider directly connects an Org VDC network to such a dedicated external network without using an Org VDC Edge GW. This is however currently not possible in an NSX-T backed PVDC; instead, you will have to import an Org VDC network backed by an NSX-T logical segment (overlay or VLAN).

Internet with MPLS

The last case I want to describe is when the tenant wants to access both the Internet and MPLS via the same Org VDC Edge GW. In an NSX-V backed Org VDC this is accomplished by attaching the internet and dedicated external network port groups to the ESG uplinks and leveraging static or dynamic routing there. In an NSX-T backed Org VDC the provider will have to provision a Tier-0/VRF that has transit uplinks to both MPLS and the Internet. An external (Internet) subnet will be assigned to this Tier-0/VRF with a small IP pool for IP allocation that should not clash with any other IP pools.

If the tenant has the route advertisement right assigned, then route filters should be set on the Tier-0/VRF uplinks to allow only the correct prefixes to be advertised towards the Internet or MPLS. The route filters can be configured either directly in NSX-T or in VCD (if the Tier-0 is explicitly dedicated).

The diagram below shows an example of an Org VDC that has two Org VDC Edge GWs, each having access to the Internet and MPLS. Org VDC GW 1 is using a static route to MPLS VPN B and also has the MPLS transit network accessible as an imported Org VDC network, while Org VDC GW 2 is using BGP to MPLS VPN A. Connectivity to the internet is provided by another layer of NSX-T Tier-0 GW, which allows usage of overlay segments as VRF uplinks and does not waste physical VLANs.

One comment on the usage of NAT in such a design. Usually the tenant wants to source NAT only towards the Internet but not towards the MPLS. On an NSX-V backed Org VDC Edge GW this is easily set on a per-uplink-interface basis. However, that option is not available on a Tier-1 backed Org VDC Edge GW as it has only one transit towards the Tier-0/VRF. Instead, a NO SNAT rule with a destination must be used in conjunction with a SNAT rule.

An example:

NO SNAT: internal 10.1.1.0/22 destination 10.1.0.0/16
SNAT: internal 10.1.1.0/22 translated 80.80.80.134

The above example will source NAT the 10.1.1.0 network only towards the internet, because traffic destined to 10.1.0.0/16 (the MPLS side) is excluded by the NO SNAT rule.
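
For reference, on an NSX-T backed Org VDC Edge Gateway these rules are created in the UI or via the NAT rules OpenAPI endpoint. A minimal, hedged sketch of the two rules above follows; the field names reflect my recollection of the EdgeNatRule model, so double-check them against the API reference of your VCD version.

(field names are assumptions; verify against the VCD OpenAPI reference)
POST https://{{host}}/cloudapi/1.0.0/edgeGateways/{{gatewayUrn}}/nat/rules
{
    "name": "no-snat-to-mpls",
    "ruleType": "NO_SNAT",
    "enabled": true,
    "internalAddresses": "10.1.1.0/22",
    "snatDestinationAddresses": "10.1.0.0/16"
}

POST https://{{host}}/cloudapi/1.0.0/edgeGateways/{{gatewayUrn}}/nat/rules
{
    "name": "snat-to-internet",
    "ruleType": "SNAT",
    "enabled": true,
    "externalAddresses": "80.80.80.134",
    "internalAddresses": "10.1.1.0/22"
}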

Quotas and Quota Policies in VMware Cloud Director

In this article I want to highlight a new neat feature in VMware Cloud Director 10.2 – the ability to assign quotas and create quota policies.

This can be done at multiple levels both by service provider or organization administrator.

The following resources can currently be managed via quotas:

  • Memory
  • CPU
  • Storage
  • All VMs (includes vApp template VMs)
  • Running VMs
  • TKG Clusters

The list might expand in the future; you can always discover the currently available quota capabilities via the API.

The service provider can create quotas at the organization level in the Organization > Configure > Quotas section:

The org administrator can assign quotas to individual users or groups. This is done from the Administration > Access Control > User or Group > Set Quota section.
A quota assigned at the group level is inherited by each group user (so it is not enforced at the aggregate group level) but can be overridden at the individual user quota level. Also, if a user is a member of multiple groups, the least restrictive combination of the participating group quotas is applied.

In the same place the user or org admin can see the user's actual usage compared to the quota.

Org admins can use quotas to easily enforce good behavior of org users (not running too many VMs concurrently, not consuming too much storage, etc.), while system admins can set safety quotas at the org level when using Org VDC allocation models with unlimited consumption and pay-per-use billing.

One hidden feature, available only via the API, is the ability to create more generic quota policies that combine (pool) multiple quota elements and can be assigned to organizations, groups or individual users. Think of quota policies such as Power User vs. Regular User, where the former can power on more VMs.

When a specific quota is assigned at the user/group/org object, a quota policy is created in the backend anyway, but it is specific to just that one object; editing a shared Power User quota policy, on the other hand, would apply to every user that has that quota policy assigned.
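
To give an idea of the shape of such a call, here is a purely hypothetical sketch of creating a pooled quota policy via the API. The endpoint name (quotaPolicies), field names and quota element identifiers below are assumptions based on the feature description, not verified documentation; inspect the API (for example with the browser developer tools while using the UI) to get the exact schema for your VCD version.

(hypothetical payload; endpoint and field names are assumptions)
POST https://{{host}}/cloudapi/1.0.0/quotaPolicies
{
    "name": "Power User",
    "quotaPoolDefinitions": [
        {
            "quotaResourceName": "Running VMs",
            "quotaResourceUnit": "Count",
            "quota": 20
        },
        {
            "quotaResourceName": "Memory",
            "quotaResourceUnit": "MB",
            "quota": 65536
        }
    ]
}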

The feature comes with new specific rights, so it can be easily enabled or disabled:

  • Organization: Manage Quotas of Organization
  • Organization: Edit Quotas Policy
  • General: View Quota Policy Capabilities
  • General: Manage Quota Policy
  • General: View Quota Policy

New Networking Features in VMware Cloud Director 10.2

From a networking perspective, the 10.2 release of VMware Cloud Director was a massive one. The NSX-V vs. NSX-T gap has been closed, and in some cases NSX-T backed Org VDCs now provide more networking functionality than the NSX-V backed ones. The UI has been redesigned with new dedicated Networking sections; however, some new features are currently available only via the API.
Let me dive straight in so you do not miss any.

NSX-T Advanced Load Balancing (Avi) support

This is a big feature that requires its own blog post. Please read here. In short, NSX-T backed Org VDCs can now consume network load balancer services that are provided by the new NSX-T ALB / Avi.

Distributed Firewall and Data Center Groups

Another big feature combines Cross VDC networking, shared networks and distributed firewall (DFW) functionality. The service provider must first create a Compute Provider Scope. This is basically a tag, an abstraction of compute fault domains / availability zones, and is defined either at the vCenter Server level or at the Provider VDC level.

The same can be done for each NSX-T Manager where you would define Network Provider Scope.

Once that is done, the provider can create Data Center Group(s) for a particular tenant. This is done from the new networking UI in the Tenant portal by selecting one or multiple Org VDCs. The Data Center Group then becomes a routing domain with networks spanning all Org VDCs that are part of the group, a single egress point (Org VDC Edge Gateway) and the distributed firewall.

Routed networks are automatically added to a Data Center Group if they are connected to the group's Org VDC Edge Gateway. Isolated networks must be added explicitly. An Org VDC can be a member of multiple Data Center Groups.

If you want the tenant to use the DFW, it must be explicitly enabled and the tenant Organization has to have the correct rights granted. The DFW supports IP Sets and Security Groups containing network objects that apply rules to all connected VMs.

Note that only one Org VDC Edge Gateway can be added to a Data Center Group. This is due to the limitation that an NSX-T logical segment can be attached to and routed only via a single Tier-1 GW. The Tier-1 GW is in active/standby mode and can theoretically span multiple sites, but only a single instance is active at a time (no multi-egress).

VRF-Lite Support

VRF-Lite is an object that allows slicing a single NSX-T Tier-0 GW into up to 100 independent virtual routing instances. 'Lite' means that while these instances are very similar to a real Tier-0 GW, they support only a subset of its features: routing, firewalling and NATing.

In VCD, when a tenant requires direct connectivity to an on-prem WAN/MPLS with fully routed networks (instead of just NAT-routed ones), in the past the provider had to dedicate a whole external network backed by a Tier-0 GW to such a tenant. Now the same can be achieved with a VRF, which greatly enhances the scalability of the feature (a sketch of the NSX-T side VRF creation follows the list of limitations below).

There are some limitations:

  • a VRF inherits its parent Tier-0 deployment mode (HA A/A vs. A/S, Edge Cluster), BGP local ASN and graceful restart setting
  • all VRFs share their parent's physical uplink bandwidth
  • VRF uplinks and peering with upstream routers must be individually configured, utilizing VLANs from a VLAN trunk or unique Geneve segments (if the upstream router is another Tier-0)
  • as an alternative to the previous point, EVPN can be used, which allows a single MP-BGP session for all VRFs and upstream routers with VXLAN data plane encapsulation; upstream routers obviously must support EVPN
  • the provider can import into VCD as an external network either the parent Tier-0 GW or its child VRFs, but not both (no mixed mode)
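
For context, the VRF itself is created on the NSX-T side before the provider imports it into VCD. A minimal, hedged sketch via the NSX-T Policy API is below; the IDs are placeholders, and the uplink interfaces and BGP peering still have to be configured separately as described above.

(IDs and paths are placeholders; verify against the NSX-T Policy API reference)
PATCH https://{{nsx-manager}}/policy/api/v1/infra/tier-0s/tenant-a-vrf
{
    "display_name": "tenant-a-vrf",
    "vrf_config": {
        "tier0_path": "/infra/tier-0s/provider-t0"
    }
}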

IPv6

VMware Cloud Director now supports dual stack IPv4/IPv6 (for both NSX-V and NSX-T backed networks). This currently must be enabled via API version 35, either during network creation or via a PUT on the OpenAPI network object, by specifying:

"enableDualSubnetNetwork": true

In the same payload you also have to add the 2nd subnet definition.

 

PUT https://{{host}}/cloudapi/1.0.0/orgVdcNetworks/urn:vcloud:network:c02e0c68-104c-424b-ba20-e6e37c6e1f73

...
    "subnets": {
        "values": [
            {
                "gateway": "172.16.100.1",
                "prefixLength": 24,
                "dnsSuffix": "fojta.com",
                "dnsServer1": "10.0.2.210",
                "dnsServer2": "10.0.2.209",
                "ipRanges": {
                    "values": [
                        {
                            "startAddress": "172.16.100.2",
                            "endAddress": "172.16.100.99"
                        }
                    ]
                },
                "enabled": true,
                "totalIpCount": 98,
                "usedIpCount": 1
            },
            {
                "gateway": "fd13:5905:f858:e502::1",
                "prefixLength": 64,
                "dnsSuffix": "",
                "dnsServer1": "",
                "dnsServer2": "",
                "ipRanges": {
                    "values": [
                        {
                            "startAddress": "fd13:5905:f858:e502::2",
                            "endAddress": "fd13:5905:f858:e502::ff"
                        }
                    ]
                },
                "enabled": true,
                "totalIpCount": 255,
                "usedIpCount": 0
            }
        ]
    }
...
    "enableDualSubnetNetwork": true,
    "status": "REALIZED",
...

 

The UI will still show only the primary subnet and IP address. The allocation of the secondary IP to a VM must be done either from its guest OS or via automated network assignment (DHCP, DHCPv6 or SLAAC). DHCPv6 and SLAAC are available only for NSX-T backed Org VDC networks, but for NSX-V backed networks you could use IPv6 as the primary subnet (with an IPv6 pool) and IPv4 with DHCP addressing as the secondary.

To enable the IPv6 capability in NSX-T, the provider must enable it in the Global Networking Config.
VCD automatically creates ND (Neighbor Discovery) Profiles in NSX-T for each NSX-T backed Org VDC Edge GW, and via the /1.0.0/edgeGateways/{gatewayId}/slaacProfile API the tenant can set the Edge GW profile to either DHCPv6 or SLAAC. For example:
PUT https://{{host}}/cloudapi/1.0.0/edgeGateways/urn:vcloud:gateway:5234d305-72d4-490b-ab53-02f752c8df70/slaacProfile
{
    "enabled": true,
    "mode": "SLAAC",
    "dnsConfig": {
        "domainNames": [],
        "dnsServerIpv6Addresses": [
            "2001:4860:4860::8888",
            "2001:4860:4860::8844"
        ]
    }
}

And here is the corresponding view from NSX-T Manager:

And finally a view of the deployed VM's networking stack:

DHCP

Speaking of DHCP, NSX-T supports two modes: Network mode, where the DHCP service is attached directly to a network and needs an IP from that network, and Edge mode, where the DHCP service runs on a Tier-1 GW loopback address. VCD now supports both modes (via API only). The DHCP Network mode works for isolated networks and is portable with the network (meaning the network can be attached to or disconnected from the Org VDC Edge GW without DHCP service disruption). However, before you can deploy a DHCP service in Network mode you need to specify a Services Edge Cluster (for Edge mode this is not needed, as the service runs on the Tier-1 Edge GW). The cluster definition is done via a Network Profile at the Org VDC level.
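
As a hedged illustration of the API-only workflow, enabling DHCP on an Org VDC network might look roughly like the call below. The mode values (EDGE/NETWORK), the service IP field and the pool field names are assumptions based on the feature description, so verify them against the VCD OpenAPI reference; for Network mode, the Org VDC network profile with the Services Edge Cluster must be set first.

(field names are assumptions; check the orgVdcNetworks DHCP schema in your VCD version)
PUT https://{{host}}/cloudapi/1.0.0/orgVdcNetworks/{{networkUrn}}/dhcp
{
    "mode": "NETWORK",
    "ipAddress": "172.16.100.250",
    "leaseTime": 86400,
    "dhcpPools": [
        {
            "enabled": true,
            "ipRange": {
                "startAddress": "172.16.100.200",
                "endAddress": "172.16.100.220"
            }
        }
    ]
}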

In order to use DHCPv6, the network must be configured in Network mode and attached to an Org VDC Edge GW whose SLAAC profile is configured in DHCPv6 mode.

Other Features

  • vSphere Distributed Switch support for NSX-T segments (also known as Converged VDS), although this feature was already available in VCD 10.1.1+
  • NSX-T IPSec VPN support in UI
  • NSX-T L2VPN support, API only
  • port group backed external networks (used for NSX-V backed Org VDCs) can now have multiple port groups from the same vCenter Server instance (useful if you have vDS per cluster for example)
  • /31 external network subnets are supported
  • Org VDC Edge GW object now supports metadata

NSX-V vs NSX-T Feature Parity

Let me conclude with an updated chart showing a comparison of NSX-V vs. NSX-T features in VMware Cloud Director 10.2. I have highlighted the new additions in green.

VMware Cloud Director – Storage IOPS Management – Part II

This is a follow up to the article I posted about a year ago that describes new IOPS management functionality in VMware Cloud Director (VCD) 10.2.

Storage IOPS is, next to compute, networking and storage capacity, a limited resource that service providers want to manage in order to fairly share the underlying physical resources in a multitenant environment.

As described in the original article, VCD already supported storage IOPS management; however, the feature was quite hidden and available only via the API. The recent release of VMware Cloud Director not only fully exposes the functionality in the UI but also adds some new capabilities. Let's dive into it.

There are now two main mechanisms for managing IOPS.

vCenter Server managed IOPS

This mechanism relies on setting IOPS limits at the storage policy level directly in vCenter Server. That is possible with both host-based and vSAN-based storage policies. The mechanism is quite simple: when a VM disk is provisioned to such an IOPS-limited storage policy, it inherits the IOPS limit, which is a constant number per policy. You will not be able to set proportional IOPS based on disk capacity.

vSAN Storage Policy with IOPS Limit
Host Based non vSAN Storage Policy with IOPS Limit

I would recommend using this mechanism only if you want to avoid noisy neighbors. The concept is not new; VCD has been able to use such vSAN policies for some time, and host-based policies were already supported in VCD 10.1. The only difference is that now in 10.2 the tenant will see the limit/reservation set at the VM disk level but will not be able to change it.

Non-editable Disk IOPS

VCD Managed IOPS

This is a much more sophisticated mechanism where you can really manage IOPS as a pool of available capacity that you slice and allocate to tenant Org VDCs. This is the mechanism that was until now available only via the API.

You start by tagging your datastores with their IOPS capacity; that has not changed and still must be done from within vCenter Server via custom properties.

At the Provider VDC level you can then create IOPS-managed storage policies and define their service level in terms of disk IOPS defaults, maximums, or IOPS allocation based on disk size (0 means unlimited). For example, with an allocation of 10 IOPS per GB, a 100 GB disk would be entitled to 1,000 IOPS (up to the configured maximum).

This storage policy configuration can be inherited or overridden at the Org VDC level. This is a big improvement compared to the old approach, where you always had to create such storage policies at the Org VDC level.

Another new capability is that you can disable the IOPS placement mechanism for such a storage policy. This is useful if you want to use Datastore Clusters: VCD will no longer try to place each virtual disk based on a particular datastore's available IOPS. The placement decision is instead made by vCenter Server, so you should enable Storage DRS with I/O balancing automation. In such a case there is no need to tag individual datastores in vCenter with their IOPS capacity.

Some of the old caveats still apply:

  • Disk IOPS can be assigned only to regular VMs or named (independent) disks, not to VM templates.
  • The disk IOPS are always allocated against the Org VDC storage profile, even if the VM is powered off. This means the cloud provider can oversubscribe IOPS at the Provider VDC storage profile level.
  • System administrator can override IOPS limits when deploying/editing tenant VMs in the system context.

Load Balancing with Avi in VMware Cloud Director

VMware Cloud Director 10.2 is adding network load balancing (LB) functionality in NSX-T backed Organization VDCs. It is not using the native NSX-T load balancer capabilities but instead relies on Avi Networks technology that was acquired by VMware about a year ago and since then rebranded to VMware NSX Advanced Load Balancer. I will call it Avi for short in this article.

The way Avi works is quite different from the way load balancing worked in NSX-V or NSX-T. Understanding the differences and Avi architecture is essential to properly use it in multitenant VCD environments.

I will focus only on the comparison with the NSX-V LB, as this is what is relevant for VCD (the NSX-T legacy LB was never a viable option for VCD environments).

In VCD, in an NSX-V backed Org VDC, the LB runs on the Org VDC Edge Gateway (VM), which can have four different sizes (compact, large, quad large and extra large) and be deployed standalone or in active/standby configuration. That Edge VM also needs to perform routing, NATing, firewalling, VPN, DHCP and DNS relay. Load balancer on a stick is not an option with NSX-V in VCD. The LB VIP must be an IP assigned to one of the externally or internally attached network interfaces of the Org VDC Edge GW.

Enabling load balancing on an Org VDC Edge GW in such a case is easy, as the resource is already there.

In the case of the Avi LB, the load balancing is performed by external components dedicated to load balancing, which adds flexibility, scalability and features, but also means more complexity. Let's dive into it.

You can look at Avi as another separate platform solution similar to vSphere or NSX: where vSphere is responsible for compute and storage and NSX for routing, switching and security, Avi is now responsible for load balancing.

A picture is worth a thousand words, so let me put this diagram here first and then dig deeper (click for larger version).

 

Control Path

You start by deploying the Avi Controller cluster (three nodes, highly available), which integrates with vSphere (used for compute/storage) and NSX-T (for routing the LB data and control plane traffic). The controllers sit somewhere in your management infrastructure next to all the other management solutions.

The integration is done by setting up a so-called NSX-T Cloud in Avi, where you define the vCenter Server (only one is supported per NSX-T Cloud) and NSX-T Manager endpoints and the NSX-T overlay transport zone (with a 1:1 relationship between the TZ and the NSX-T Cloud definition). Those would be your tenant/workload vCenter Server and NSX-T.

You must also point to a pre-created management network segment that will be used to connect all load balancing engines (more on them later) so they can communicate with the controllers for management and control traffic. To do so, in NSX-T you would set up a dedicated Tier-1 (Avi Management) GW with the Avi management segment connected and DHCP enabled. The expectation is that the Tier-1 GW can reach the Avi Controllers through the Tier-0.

Data Path

Avi Service Engines (SE) are the VM resources that perform the load balancing. They are similar to NSX-T Edge Nodes in the sense that the load balancing virtual services can be placed on any SE node based on capacity or reservations (just as a Tier-1 GW can be placed on any Edge Node). Per se there is no strict relationship between a tenant's LB and an SE node; an SE node can be shared across Org VDC Edge GWs or even tenants. An SE node is a VM with up to 10 network interfaces (NICs). One NIC is always needed for the management/control traffic (blue network). The rest (9) are used to connect to the Org VDC Edge GW (Tier-1 GW) via a service network logical segment (yellow and orange). The service networks are created by VCD when you enable the load balancing service on the Org VDC Edge GW, together with a DHCP service to provide IP addresses to the attached SEs. By default it gets the 10.255.255.0/25 subnet, but the system admin can change it if it clashes with existing Org VDC networks. Service Engines run each service interface in a different VRF context, so there is no concern about IP conflicts or cross-tenant communication.

When a load balancing pool and virtual service are configured by the tenant, Avi automatically picks a Service Engine to instantiate the LB service on. It might even need to first deploy (automatically) an SE node if there is no existing capacity. When the SE is assigned, Avi configures a static route (/32) on the Org VDC Edge GW pointing the virtual service VIP (virtual IP) to the service engine IP address (from the tenant's LB service network).

Note: Contrary to the NSX-V LB, the VIP can be almost any arbitrary IP address. It can be a routable external IP address allocated to the Org VDC Edge GW or any non-externally routed address, but it cannot clash with any existing Org VDC networks or with the LB service network. If you use an external IP address allocated to the Org VDC Edge GW, you cannot use that address for anything else (e.g. SNAT or DNAT). That is the way NSX-T works (no NAT and static routing for the same address at the same time). So for example, if you want to use the public address 1.2.3.4 for LB on port 80 but at the same time use it for SNAT, use an internal IP for the LB (e.g. 172.31.255.100) and create a DNAT port-forwarding rule to it (1.2.3.4:80 to 172.31.255.100:80).
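
A hedged sketch of that DNAT workaround via the same NAT rules endpoint shown earlier follows; again, treat the field names as approximations of the EdgeNatRule model and verify them against your VCD API reference (without an application port profile the internal port handling may need to be specified differently).

(field names are assumptions; verify against the VCD OpenAPI reference)
POST https://{{host}}/cloudapi/1.0.0/edgeGateways/{{gatewayUrn}}/nat/rules
{
    "name": "dnat-lb-http",
    "ruleType": "DNAT",
    "enabled": true,
    "externalAddresses": "1.2.3.4",
    "internalAddresses": "172.31.255.100",
    "dnatExternalPort": "80"
}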

Service Engine Groups

With the basics out of the way, let's discuss how the service provider can manage the load balancing quality of service: performance, capacity and availability. This is done via Service Engine Groups (SEGs).

SEGs are (today) configured directly in the Avi Controller and imported into VCD. They specify SE node sizing (CPU, RAM, storage), bandwidth restrictions, virtual service maximums per node and the availability mode.

The availability mode needs more explanation. Avi supports four availability modes:

  • A/S … legacy: only two nodes are deployed, the service is active on one node at a time and standby on the other, no scale-out support (service across nodes), very fast failover
  • A/A … elastic: the service is active on at least two SEs, session info is proactively replicated, very fast failover
  • N+M … elastic: N is the number of SE nodes the service is scaled over, M is a buffer in the number of failures the group can sustain; slow failover (because the controller needs to re-assign services), but efficient SE utilization
  • N+0 … same as N+M but with no buffer; the controller deploys new SE nodes when a failure occurs. The most efficient use of resources but the slowest failover time

The basic Avi license supports only the legacy A/S high availability mode. For best availability and performance, the elastic A/A mode is recommended.

As mentioned, Service Engine Groups are imported into VCD, where the system administrator decides whether a SEG is going to be dedicated (SE nodes from that group will be attached to only one Org VDC Edge GW) or shared.

Then, when load balancing is enabled on a particular Org VDC Edge GW, the service provider assigns one or more SEGs to it, together with a capacity reservation and maximum in terms of virtual services for that particular Org VDC Edge GW.

Use case examples:

  • A/S dedicated SEG for each tenant / Org VDC Edge GW. Avi will create two SE nodes for each LB-enabled Org VDC Edge GW and will provide a service similar to what the LB on an NSX-V backed Org VDC Edge GW did. It does not require additional licensing, but a SEG must be pre-created for each tenant / Org VDC Edge GW.
  • A/A elastic SEG shared across all tenants. Avi will create a pool of SE nodes that are going to be shared. Only one SEG is created. Capacity allocation is managed in VCD; Avi elastically deploys and undeploys SE nodes based on actual usage (the usage is measured in the number of virtual services, not actual throughput or requests per second).

Service Engine Node Placement

The Service Engine nodes are deployed by Avi into the (single) vCenter Server associated with the NSX-T Cloud and they live outside of VMware Cloud Director management. The placement is defined in the Service Engine Group definition (you must use Avi 20.1.2 or newer). You can select the vCenter Server folder and limit the scope of deployment to a list of ESXi hosts and datastores. Avi has no understanding of vSphere host clusters, datastore clusters or resource pools. Avi will also not configure any DRS anti-affinity for the deployed nodes (but you can do so post-deployment).

Conclusion

The whole Avi deployment process for the system admin is described in detail here. The guide in the link refers to a general Avi deployment with NSX-T Cloud; however, for a VCD deployment you would just stop before the Creating Virtual Service step, as that is done from VCD by the tenant.

Avi licensing is either basic or enterprise and is set at the Avi Controller cluster level. It is therefore possible to mix both license types for a two-tier LB service by deploying two Avi Controller cluster instances and associating each with a different NSX-T transport zone (two vSphere clusters or Provider VDCs).

The feature differences between the basic and enterprise editions are quite extensive and complex. Besides the Service Engine high availability modes, the other important differences are access to metrics and the number of supported application types, health monitors and pool selection algorithms.

The Avi usage metering for licensing purposes is currently done via a Python script that is run on the Avi Controller to measure the total Service Engine high-water-mark vCPU usage during a given period, and it must be reported manually. The basic license is included for free with VCPP NSX usage and is capped at 1 vCPU per 640 GB of reported NSX base usage vRAM.

Update 2020/10/23: Make sure to check interoperability matrix. As of today only Avi 20.1.1 is supported with VCD 10.2.