vCloud Director 10: NSX-T Integration

Intro

vCloud Director relies on the NSX network virtualization platform to provide on-demand creation and management of networks and networking services. NSX for vSphere has been supported for a long time and vCloud Director allows most of its features to be used by its tenants. However, as VMware slowly shifts away from NSX for vSphere and pushes forward the modern, fully rewritten NSX-T networking platform, I want to focus in this article on its integration with vCloud Director.

History

Let me start by highlighting that NSX-T is evolving very quickly: each release (now at version 2.5) adds major new functionality. Contrast that with NSX-V, which is essentially feature complete in the sense that no major functionality changes are happening there. The fast pace of NSX-T development is a challenge for any cloud management platform, as it has to play a catch-up game.

The first release of vCloud Director that supported NSX-T was 9.5. It supported only NSX-T version 2.3 and the integration was very basic. All vCloud Director could do was import NSX-T overlay logical segments (virtual networks) created manually by the system administrator. These networks were imported into a specific tenant Org VDC as Org VDC networks.

The next version of vCloud Director, 9.7, supported only NSX-T 2.4, and from the feature perspective not much changed: you could still only import networks. Under the hood, however, the integration used the completely new set of NSX-T policy-based APIs, and there were some minor UI improvements in registering NSX-T Manager.

The current version, vCloud Director 10, brings on-demand creation of NSX-T based networks and network services for the first time. NSX-T version 2.5 is required.

NSX-T Primer

While I do not want to go too deep into the actual NSX-T architecture, I expect that not all readers of this blog are fully familiar with NSX-T and how it differs from NSX-V. Let me quickly highlight the major points that are relevant for the topic of this blog post.

  • NSX-T is vCenter Server independent, which means it scales independently of a vCenter domain. NSX-T essentially communicates with the ESXi hosts directly (they are called host transport nodes). The hosts must be prepared with NSX-T VIBs, which are incompatible with NSX-V; this means a particular host cannot be used by NSX-V and NSX-T at the same time.
  • Overlay virtual networks use the Geneve encapsulation protocol, which is incompatible with VXLAN. The concepts of a Controller cluster that keeps state and of transport zones are very similar to NSX-V. The independence from VC mentioned in the previous point means the vSphere Distributed Switch cannot be used; instead NSX-T brings its own N-VDS switch. It also means there is a concept of underlay (VLAN) networks managed by NSX-T. All overlay and underlay networks managed by NSX-T are called logical segments.
  • Networking services (such as routing, NAT, firewalling, DNS, DHCP, VPN, load balancing) are provided by Tier-0 or Tier-1 Gateways that are functionally similar to NSX-V ESGs but are not instantiated as dedicated VMs. Instead they are services running on a shared Edge Cluster. The meaning of Edge Cluster is very different from its usage in the NSX-V context: an Edge Cluster is not a vSphere cluster, it is a cluster of Edge Transport Nodes where each Edge Node is a VM or a bare metal host.
  • While T0 and T1 Gateways are similar, they are not identical; each has a specific purpose or set of services it can offer. Distributed routing is implicitly provided by the platform unless a stateful networking service requires routing through a single point. T1 GWs are usually connected to a single T0 GW and that connection is managed automatically by NSX-T (see the sketch after this list).
  • Typically you would have one or a small number of T0 GWs in ECMP mode providing north-south routing (the concept of a Provider Edge) and a large number of T1 GWs connected to a T0 GW, each for a different tenant, to provide tenant networking (the concept of a Tenant Edge).
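
To make the T0/T1 relationship a bit more concrete, here is a minimal sketch (not the exact objects vCloud Director creates) of an ECMP Tier-0 and a tenant Tier-1 linked to it, expressed against the NSX-T policy API with curl. The gateway names provider-t0 and tenant1-t1 and the manager FQDN are placeholders I made up for illustration.

    # ECMP Tier-0 (Provider Edge role) running active-active
    curl -k -u admin -X PATCH \
      https://nsx-manager/policy/api/v1/infra/tier-0s/provider-t0 \
      -H 'Content-Type: application/json' \
      -d '{ "ha_mode": "ACTIVE_ACTIVE" }'

    # Tenant Tier-1 (Tenant Edge role) linked to the Tier-0 above
    curl -k -u admin -X PATCH \
      https://nsx-manager/policy/api/v1/infra/tier-1s/tenant1-t1 \
      -H 'Content-Type: application/json' \
      -d '{ "tier0_path": "/infra/tier-0s/provider-t0" }'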

vCloud Director Integration

As mentioned above, since NSX-T is not vCenter Server dependent, it is attached to vCloud Director independently of VC.

Geneve network pool creation is the same as with VXLAN – you provide a mapping to an existing NSX-T overlay transport zone.

Now you can create a Provider VDC (PVDC), which is as usual mapped to a vSphere cluster or resource pool. A particular cluster used by a PVDC must be prepared for NSX-V or NSX-T, and all its clusters must share the same NSX flavor; this means you cannot mix NSX-V clusters with NSX-T clusters in the same PVDC. However, you can easily mix NSX-V and NSX-T in the same vCenter Server – you will then just have to create multiple PVDCs. Although NSX-T can span VCs, a PVDC cannot – that limitation still remains. When creating an NSX-T backed PVDC you will have to specify the Geneve network pool created in the previous step.

Within the PVDC you can start creating Org VDCs for your tenants – no difference there.

Org VDCs without routable networks are not very useful. To remedy this we must create external networks and Org VDC Edge Gateways. Here the concept differs quite a bit from NSX-V. Although you could deploy provider ECMP Edges with NSX-V as well (and I described here how to do so), it is mandatory with NSX-T. You will have to pre-create a T0 GW in NSX-T Manager (ECMP active-active is recommended). This T0 GW will provide external network access for your tenants and should be routable from the internet. Instead of just importing an external network port group, as you would do with NSX-V, you will import the whole T0 GW into vCloud Director.

During the import you will also have to specify IP subnets and pools that the T0 GW can use for IP sub-allocation to tenants.

Once the external network exists you can create tenant Org VDC Edge Gateways. These are instantiated as T1 GWs in the same NSX-T Edge Cluster as the T0 GW they connect to; currently you cannot choose a different NSX-T Edge Cluster for their placement. T1 GWs are always deployed in an active/standby configuration, and the placement of the active node is automated by NSX-T. The router interlink between the T0 and T1 GWs is also created automatically by NSX-T.

During the Org VDC Edge Gateway creation the service provider also sub-allocates a range of IPs from the external network. Whereas with NSX-V these would actually be assigned to the Org VDC Edge Gateway uplink, this is not the case with NSX-T. Once they are actually used in a specific T1 NAT rule, NSX-T will automatically create a static route on the T0 GW and start routing to the correct T1 GW.
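
For illustration only, this is a minimal sketch of what such a T1 NAT rule looks like when inspected directly through the NSX-T policy API, assuming a made-up Tier-1 named tenant1-t1 and a sub-allocated IP of 203.0.113.10 (in practice the tenant creates the rule through vCloud Director, not by hand):

    # SNAT rule on the tenant T1 using a sub-allocated external IP
    curl -k -u admin -X PATCH \
      https://nsx-manager/policy/api/v1/infra/tier-1s/tenant1-t1/nat/USER/nat-rules/org-snat \
      -H 'Content-Type: application/json' \
      -d '{ "action": "SNAT",
            "source_network": "192.168.1.0/24",
            "translated_network": "203.0.113.10" }'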

Tenant Networks

There are four types of NSX-T based Org VDC networks, and three of them can be created via the UI:

  • Isolated: a Layer 2 segment not connected to a T1 GW. DHCP service is not available on this network (contrary to the NSX-V implementation).
  • Routed: a network that is connected to a T1 GW. Note however that its subnet is not announced to the upstream T0 GW, which means the only way to route to it is to use NAT.
  • Imported: an existing NSX-T overlay logical segment can be imported (same as in VCD 9.7 or 9.5). Its routing/external connectivity must be managed outside of vCloud Director.
  • In the OpenAPI (POST /1.0.0/OrgVdcNetwork) you will find one more network type: DIRECT_UPLINK. This is for a specific NFV use case. Such a network is connected directly to the T0 GW via an external interface. Note this feature is not officially supported!

Note that only Isolated and NAT-routed networks can be created by tenants.
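
As a rough, API-only illustration, a routed Org VDC network could be created through the vCloud Director OpenAPI along these lines. The endpoint path, payload field names and the URN placeholders are my assumptions for illustration – verify them against the OpenAPI schema of your vCloud Director build before relying on them.

    # Hypothetical sketch – adjust host, URNs and field names to your environment
    curl -k -X POST https://vcd.example.com/cloudapi/1.0.0/orgVdcNetworks \
      -H "Authorization: Bearer $TOKEN" \
      -H 'Accept: application/json;version=33.0' \
      -H 'Content-Type: application/json' \
      -d '{ "name": "tenant1-routed-net",
            "networkType": "NAT_ROUTED",
            "orgVdc": { "id": "urn:vcloud:vdc:<vdc-uuid>" },
            "connection": { "routerRef": { "id": "urn:vcloud:gateway:<gateway-uuid>" } },
            "subnets": { "values": [ { "gateway": "192.168.1.1", "prefixLength": 24 } ] } }'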

As you can see, it is not possible today to create a routed, advertised Org VDC network (for example for a direct connect use case where the tenant wants to route from on-prem networks to the cloud without using NAT). Such routed networks would require a dedicated T0 GW for each tenant, which would not scale well, but might become possible in the future with VRF support on T0 GWs.

Tenant Networking Services

Currently the following T1 GW networking services are available to tenants:

  • Firewall
  • NAT
  • DHCP (without binding and relay)
  • DNS forwarding
  • IPSec VPN: no UI, OpenAPI only. Policy-based and route-based VPN with a pre-shared key are supported. (Thanks Abhi for the correction.)

All other services are currently not supported. This might be because NSX-T has not implemented them yet, or because vCloud Director has not caught up yet. Expect big progress here with each new vCloud Director and NSX-T release.

Networking API

All NSX-T related features are available in the vCloud Director OpenAPI (CloudAPI). The pass-through API approach that you might be familiar with from the Advanced Networking NSX-V implementation is not used!
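
A minimal sketch of getting started with the OpenAPI using curl – log in, capture the bearer token and list the NSX-T Org VDC Edge Gateways. The session endpoint, header names and API version (33.0 for vCloud Director 10.0) are stated to the best of my knowledge, and the credentials and FQDN are placeholders; double-check everything in the API reference for your build.

    # Log in as a provider administrator; the token is returned in the
    # X-VMWARE-VCLOUD-ACCESS-TOKEN response header
    curl -k -i -X POST -u 'administrator@system:password' \
      -H 'Accept: application/json;version=33.0' \
      https://vcd.example.com/cloudapi/1.0.0/sessions/provider

    # Reuse the token to list Org VDC Edge Gateways
    curl -k -H "Authorization: Bearer $TOKEN" \
      -H 'Accept: application/json;version=33.0' \
      https://vcd.example.com/cloudapi/1.0.0/edgeGateways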

Feature Comparison

I have summarized all vCloud Director networking features in the following table for quick comparison between NSX-V and NSX-T.

Update 2019/10/09: Added two entries related to external network metering and rate limiting.

Load Balancing vCloud Director with NSX-T

I have just had the chance for the first time to set up a vCloud Director installation fronted by an NSX-T based load balancer (version 2.4.1). In the past I have blogged about how to load balance vCloud Director cells with NSX-V:

Load Balancing vCloud Director Cells with NSX Edge Gateway

vCloud OpenAPI – Large Payload Issue with Load Balancer

NSX-T differs quite a lot from NSX-V, hence the need for this article. The load balancer instance is deployed into the NSX-T Edge Cluster, which is a set of virtual or physical NSX-T Edge Nodes. There are also strict sizing guidelines relating the size and number of load balancer instances to the size of the Edge Nodes – see the official docs.

Certificates

Import your VCD public certificate in the NSX Manager UI: System > Certificates > Import Certificate. You will need to provide a name, the full certificate chain and the private key, and set it as a Service Certificate. If it is signed by an enterprise CA, you will also have to import the CA certificate first.

Monitor

Create a new monitor in Networking > Load Balancing > Monitors > Add Active Monitor > HTTPS (a roughly equivalent policy API call is sketched after this list):

  • protocol HTTPS
  • monitoring port 443
  • default timers
  • HTTP Request Configuration: GET /cloud/server_status, HTTP Request Version: 1
  • HTTP Response Configuration: HTTP response body: Service is up.
  • SSL Configuration: Enabled, Client Certificate: None
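
The roughly equivalent monitor, defined through the NSX Policy API, could look like the sketch below. The profile name vcd-https-monitor and the exact field names are my assumptions – check the load balancer section of the policy API reference for your NSX-T version.

    # HTTPS active monitor probing the VCD health endpoint
    curl -k -u admin -X PATCH \
      https://nsx-manager/policy/api/v1/infra/lb-monitor-profiles/vcd-https-monitor \
      -H 'Content-Type: application/json' \
      -d '{ "resource_type": "LBHttpsMonitorProfile",
            "monitor_port": 443,
            "request_url": "/cloud/server_status",
            "response_body": "Service is up." }'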

Profiles

Application Profile

Networking > Load Balancing > Profiles > Select Profile Type: Application > Add Application Profile > HTTP

Here in the UI we can set only the Request Header Size and Request Body Size. Set the header size to the 65535 maximum and the body size to at least 52428800, as ISO/OVA uploads use 50 MB chunks. We will later use the API to also configure the Response Header Size.

Persistence and SSL Profiles

I will reuse the existing default-source-ip-lb-persistence-profile and default-balanced-client-ssl-profile.

Server Pools

Networking > Load Balancing > Server Pools > Add Server Pool

  • Algorithm: Least Connection
  • Active Monitor: the one created before
  • Select members: enter the individual members (do not enter a port, as we will reuse the pool for multiple ports)

Virtual Servers

We will add two virtual servers: one for UI/API and another for VM Remote Console connections. For both I have picked the same IP address from the cell logical segment; the ports will differ (443 vs 8443).

vCloud UI

  • Add virtual server: L7 HTTP
  • Ports: 443
  • Ignore Load Balancer placement for now
  • Server Pool: the one we created before
  • Application Profile: the one we created before
  • Persistence: default-source-ip-lb-persistence-profile
  • SSL Configuration: Client SSL: Enabled, Default Certificate: the one we imported before, Client SSL Profile: default-balanced-client-ssl-profile
    Server SSL: Enabled, Client Certificate: None, Server SSL Profile: default-balanced-client-ssl-profile

vCloud Console

  • Add virtual server: L4 TCP
  • Ports: 8443
  • Ignore Load Balancer placement for now
  • Server Pool: the one we created before
  • Application Profile: default-tcp-lb-app-profile
  • Persistence: disabled

Load Balancer

Now we can create the load balancer instance and associate the virtual servers with it. Create the LB instance on the Tier-1 Gateway that routes to your VCD cell network. Make sure the Tier-1 Gateway runs on an Edge Node of the proper size (see the doc link above).

Networking > Load Balancing > Load Balancers > Add Load Balancer

  • Size: small
  • Tier-1 Gateway: the one that routes to your VCD cell network
  • Add Virtual Servers: add the 2 virtual servers created in the previous step

Now that the load balancer is up and running you should see all green in the status column. We are not done yet though.

First we need to increase the response header size, as the vCloud Director OpenAPI sends huge headers with links. Without this you would get H5 UI errors (Nginx 502 Bad Gateway) and some API calls would fail. This can currently be done only with the NSX Policy API. Fire up Postman or curl and do a GET and then a PUT on the following URI:

NSX-manager/policy/api/v1/infra/lb-app-profiles/<profile-name>

In the payload, change the response_header_size to at least 50000 bytes.
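
A minimal curl sketch of that GET/PUT round trip, assuming the application profile created earlier was named vcd-http-profile (the GET response has to be saved, edited and sent back in full, including its _revision field):

    # Read the current application profile
    curl -k -u admin \
      https://nsx-manager/policy/api/v1/infra/lb-app-profiles/vcd-http-profile \
      -o profile.json

    # Edit profile.json, set "response_header_size": 50000, then write it back
    curl -k -u admin -X PUT \
      https://nsx-manager/policy/api/v1/infra/lb-app-profiles/vcd-http-profile \
      -H 'Content-Type: application/json' \
      -d @profile.json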

And finally we will need to set up NAT so our load-balanced virtual servers are reachable both from the outside world (on the Tier-0 Gateway) and from the internal networks. This is quite network-topology specific, but do not forget that the cells themselves must be able to reach the public (load balanced) URL configured in the vCloud Director public addresses.
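
As one possible (topology-dependent) illustration, a DNAT rule on the Tier-0 translating a made-up public address 198.51.100.10 to the load balancer VIP 10.0.0.100 could be sketched like this through the policy API (the Tier-0 name provider-t0 is again just a placeholder):

    # DNAT the public address to the internal load balancer VIP
    curl -k -u admin -X PATCH \
      https://nsx-manager/policy/api/v1/infra/tier-0s/provider-t0/nat/USER/nat-rules/vcd-lb-dnat \
      -H 'Content-Type: application/json' \
      -d '{ "action": "DNAT",
            "destination_network": "198.51.100.10",
            "translated_network": "10.0.0.100" }'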

NSX-T 2.4: Force Local Account Login

NSX-T supports Role Based Access Control by integrating with VMware Identity Manager, which provides access to third-party identity sources such as LDAP, AD, SAML2, etc.

When NSX-T version 2.3 is integrated with VIDM, you get a choice during login of which type of account you are going to provide (remote or local).

NSX-T version 2.4 no longer provides that option and always defaults to the SAML source (VIDM). To force login with a local account, use this specific URL:

https://<NSX-T_FQDN/IP>/login.jsp?local=true