New Networking Features in VMware Cloud Director 10.4.1

The recently released new version of VMware Cloud Director 10.4.1 brings quite a lot of new features. In this article I want to focus on those related to networking.

External Networks on Org VDC Gateway (NSX-T Tier-1 GW)

External networks that are NSX-T Segment backed (VLAN or overlay) can now be connected directly to Org VDC Edge Gateway and not routed through Tier-0 or VRF the Org VDC GW is connected to. This connection is done via the service interface (aka Centralize Service Port – CSP) on the service node of the Tier-1 GW that is backing the Org VDC Edge GW. The Org VDC Edge GW still needs a parent Tier-0/VRF (now called Provider Gateway) although it can be disconnected from it.

What are some of the use cases for such direct connection of the external network?

  • routed connectivity via dedicated VLAN to tenant’s co-location physical servers
  • transit connectivity across multiple Org VDC Edge Gateways to route between different Org VDCs
  • service networks
  • MPLS connectivity to direct connect while internet is accessible via shared provider gateway

The connection is configured by the system administrator. It can use only single subnet from which multiple IPs can be allocated to the Org VDC Edge GW. One is configured directly on the interface while the others are just routed through when used for NAT or other services.

If the external network is backed by VLAN segment, it can be connected to only one Org VDC Edge GW. This is due to a NSX-T limitation that a particular edge node can use VLAN segment only for single logical router instantiated on the edge node – and as VCD does not give you the ability to select particular edge nodes for instantiation of Org VDC Edge Tier-1 GWs it simply will not allow you to connect such external network to multiple Edge GWs. If you are sharing the Edge Cluster also with Tier-0 GWs make sure that they do not use the same VLAN for their uplinks too. Typically you would use VLAN for co-location or MPLS direct connect use case.

Overlay (Geneve) backed external network has no such limitations which makes it a great use case for connectivity across multiple GWs for transits or service networks.

NSX-T Tier-1 GW does not provide any dynamic routing capabilities, so routing to such network can be configured only via static routes. Also note that Tier-1 GW has default route (0.0.0.0/0) always pointing towards its parent Tier-0/VRF GW. Therefore if you want to set default route to the segment backed external network you need to use two more specific routes. For example:

0.0.0.0/1 next hop <MPLS router IP> scope <external network>
128.0.0.0/1 next hop <MPLS router IP> scope <external network>

Slightly related to this feature is the ability to scope gateway firewall rules to a particular interface. This is done via Applied To field:

You can select any CSP interface on the Tier-1 GW (used by the external network or non-distributed Org VDC networks) or nothing, which means the rule will be applied to uplink to Tier-0/VRF as well as to any CSP interface.

IP Spaces

IP Spaces is a big feature that will be delivered across multiple releases, where 10.4.1 is the first one. In short IP Spaces are new VCD object that allows managing individual IPs (floating IPs) and subnets (prefixes) independently across multiple networks/gateways. The main goal is to simplify the management of public IPs or prefixes for the provider as well as private subnets for the tenants, however additional benefits are being able to route on shared Provider Gateway (Tier-0/VRF), use same dedicated parent Tier-0/VRF for multiple Org VDC Edge GWs or the ability to re-connect Org VDC Edge GWs to a differeent parent Provider Gateway.

Use cases for IP Spaces:

  • Self-service for request of IPs/prefixes
  • routed public subnet for Org VDC network on shared Provider Gateway (DMZ use case, where VMs get public IPs with no NATing performed)
  • IPv6 routed networks on a shared Provider Gateway
  • tenant dedicated Provider Gateway used by multiple tenant Org VDC Edge Gateways
  • simplified management of public IP addresses across multiple Provider Gateways (shared or dedicated) all connected to the same network (internet)

In the legacy way the system administrator would create subnets at the external network / provider gateway level and then add static IP pools from those subnet for VCD to use. IPs from those IP pools would be allocated to tenant Org VDC Edge Gateways. The IP Spaces mechanism creates standalone IP Spaces (which are collections of IP ranges (e.g. 192.168.111.128-192.168.111.255) and blocks of IP Prefixes(2 blocks of 192.168.111.0/28 – 192.168.111.0/28, 192.168.111.16/28 and 1 block of 192.168.111.32/27).

A particular IP Space is then assigned to a Provider Gateway (NSX-T Tier-0 or VRF GW) as IP Space Uplink:

An IP Space can be used by multiple Provider Gateways.

The tenant Org VDC Edge Gateway connected to such IP Space enabled provider gateway can then request floating IPs (from IP ranges) 

or assign IP block to routable Org VDC network which results into route advertisement for such network.

In the above case such network should be also advertised to the internet, the parent Tier-0/VRF needs to have route advertisement manually configured (see below IP-SPACES-ROUTING policy) as VCD will not do so (contrary to NAT/LB/VPN case).

Note: The IP / IP Block assignments are done from the tenant UI. Tenant needs to have new rights added to see those features in the UI.

The number of IPs or prefixes the tenant can request is managed via quota system at the IP Space level. Note that the system administrator can always exceed the quota when doing the request on behalf of the tenant.

A provider gateway must be designated to be used for IP Spaces. So you cannot combine the legacy and IP Spaces method of managing IPs on the same provider gatway. There is currently no way of converting legacy provider gateway to the one IP Spaces enabled, but such functionality is planned for the future.

Tenant can create their own private IP Spaces which are used for their Org VDC Edge GWs (and routed Org VDC networks) implicitly enabled for IP Spaces. This simplifies the creation of new Org VDC networks where a unique prefix is automatically requested from the private IP Space. The uniqueness is important as it allows multiple Edge GWs to share same parent provider gateway.

Transparent Load Balancing

VMware Cloud Director 10.4.1 adds support for Avi (NSX ALB) Transparent Load Balancing. This allows the pool member to see the source IP of the client which might be needed for certain applications.

The actual implementation in NSX-T and Avi is fairly complicated and described in detail here: https://avinetworks.com/docs/latest/preserve-client-ip-nsxt-overlay/. This is due to the Avi service engine data path and the need to preserve the client source IP while routing the pool member return traffic back to the service engine node.

VCD will hide most of the complexity so that to actually enable the service three steps need to be taken:

  1. Transparent mode must be enabled at the Org VDC GW (Tier-1) level.

2. The pool members must be created via NSX-T Security Group (IP Set).

3. Preserve client IP must be configured on the virtual service.

Due to the data path implementation there are quite many restrictions when using the feature:

  • Avi version must be at least 21.1.4
  • the service engine group must be in A/S mode (legacy HA)
  • the LB service engine subnet must have at least 8 IPs available (/28 subnet – 2 for engines, 5 floating IPs and 1 GW) – see Floating IP explanation below
  • only in-line topology is supported. It means a client that is accessing the LB VIP and not going via T0 > T1 will not be able to access the VS
  • only IPv4 is supported
  • the pool members cannot be accessed via DNAT on the same port as the VS port
  • transparent LB should not be combined with non-transparent LB on the same GW as it will cause health monitor to fail for non-transparent LB
  • pool member NSX-T security group should not be reused for other VS on the same port

Floating IPs

When transparent load balancing is enabled the return traffic from the pool member cannot be sent directly to the client (source IP) but must go back to the service engine otherwise asymmetric routing happens and the traffic flows will be broken. This is implemented in NSX-T via N/S Service Insertion policy where the pool member (defined via security group) traffic is instead to its default GW redirected to the active engine with a floating IP. Floating IPs are from the service engine network subnet but are not from the DHCP range which assigns service engine nodes their primary IP. VCD will dedicate 5 IP from the LB Service network range for floating IPs. Note that multiple transparent VIPs on the same SEG/service network will share floating IP.

VMware Cloud Director 10.4.1 Appliance Upgrade Notes

VMware Cloud Director 10.4.1 (VCD) has just been released and with it comes also the change of the appliance internal PostgreSQL database from version 10 to 14 (as well as replication manager upgrade). This means that the standard upgrade process has changed and is now a bit more complicated and could be significantly longer than in the past. So pay attention to this change. This article goal is to explain what is going on in the background to understand better why certain steps are necessary.

VCD 10.4.1 is still interoperable with PostgreSQL 10-14 if you use the Linux deployment format, so this blog applies only if you use the appliance deployment. PostgreSQL version 10 is no longer maintained so that is the main reason for the switch besides the fact that newer is alway better :-).

  • The database upgrade will happen during the vamicli update process.
  • All appliance nodes must be up but the vmware-vcd service should be shut down.
  • Always recommended is to take cold snaphots of all nodes before the upgrade (make sure the snapshots are done while all DB nodes are off at the same time as you do not want to restore snapshot of primary to older state while secondary nodes are ahead)
  • Primary database appliance node is where you have to start the upgrade process, it will most likely take the longest time as new database version 14 will be installed side-by-side and all the data converted. That means you will also need enough free space on the database partition. You can check by running on the primary DB node:

    df -h|grep postgres
    /dev/mapper/database_vg-vpostgres 79G 17G 58G 23% /var/vmware/vpostgres


    The above shows that the partition size is 79 GB, the DB is currently using 17 GB and I have 53 GB free. So I am good to go. The actual additional needed space is less than 17 GB as the database logs and write ahead logs are not copied over. Those can be deducted. You can quickly get their size by running:

    du -B G --summarize /var/vmware/vpostgres/current/pgdata/pg_wal/
    du -B G --summarize /var/vmware/vpostgres/current/pgdata/log/


    Or just use this one-liner:

    du -sh /var/vmware/vpostgres/current/pgdata/ --exclude=log --exclude=pg_wal

    If needed, the DB partition can be easily expanded.
  • The secondary DB nodes will during the vami upgrade process clone the upgraded DB from the primary via replication. So these nodes just drop the current DB and will not need any additional space.
  • DB upgrade process can be monitored by tailing in another ssh session update-postgres-db.log file:

    tail -f /opt/vmware/var/log/vcd/update-postgres-db.log
  • After all nodes (DB and regular ones) are upgraded, the database schema is upgraded via the /opt/vmware/vcloud-director/bin/upgrade command. Then you must reboot all nodes (cells).
  • The vcloud database password will be changed to a new autogenerated 14 character string and will be replicated to all nodes of VCD cluster. If you use your own tooling to access the DB directly you might want to change the password as there is no way of retrieve the autogenerated one. This must be done by running psql in the elevated postgres account context.

    root@vcloud1 [ /tmp ]# su postgres
    postgres@vcloud1 [ /root ]$ psql -c "ALTER ROLE vcloud WITH PASSWORD 'VMware12345678'"



    and then you must update vcd-service on each cell via the CMT reconfigure-database command. This can be done in a fan-out mode from a single cell live by running:

    /opt/vmware/vcloud-director/bin/cell-management-tool reconfigure-database -dbpassword 'VMware12345678' --private-key-path=/opt/vmware/vcloud-director/id_rsa --remote-sudo-user=postgres -i `cat /var/run/vmware-vcd-cell.pid`

    The command above will change DB configuration properties on the local and all remote cells. It will also refresh the running service to use the new password.

    Note the DB password must have at least 14 characters.
  • Any advanced PostgreSQL configuration options will not be retained. In fact they may be incompatible with PostgreSQL 14 (they are backed up in /var/vmware/vpostgres/current/pgdata/postgresql.auto.old)

Enhanced Security for Cross vCenter Operations in VMware Cloud Director

Starting with VMware Cloud Director 10.4.1 VM and vApp operations that happen across Provider VDCs backed by different vCenter Server instances require that both vCenter Server trust each other certificates.

Which operations are affected?

Any VM move, clone or vApp move, clone and template deployments that happen via UI or API where source and destinations are managed by different vCenter Server.

If involved vCenter Server do not trust each other certificates, those operations will fail with an error:

Underlying system error: com.vmware.vim.binding.vim.fault.SSLVerifyFault

In the VCD cell logs you will see something similar to:

2022-11-18 17:15:46,023 | ERROR    | vim-proxy-activity-pool-482 | RelocateVmActivity             | Underlying system error: com.vmware.vim.binding.vim.fault.SSLVerifyFault | requestId=9554d239-a457-48bd-988d-4aa7856cdba2,request=POST https://10.196.235.31/api/vdc/ceb9bd94-171e-4d3f-a390-248791cbb218/action/moveVApp,requestTime=1668791732601,remoteAddress=10.30.65.110:53169,userAgent=PostmanRuntime/7.29.2,accept=application/*+xml;version 37.0 vcd=b4856f34-94bb-4462-8d1b-469483f98f69,task=46d10139-134e-4823-a674-dd0e10a5565a activity=(com.vmware.vcloud.backendbase.management.system.TaskActivity,urn:uuid:46d10139-134e-4823-a674-dd0e10a5565a) activity=(com.vmware.vcloud.vdc.impl.MoveVAppActivity,urn:uuid:a8c5d945-9ffc-44b7-aafb-54a634e0dd08) activity=(com.vmware.vcloud.vdc.impl.LinkVMsToTargetVAppActivity,urn:uuid:ed3c8d9f-69db-466e-bbbd-9046d996d3e6) activity=(com.vmware.vcloud.vdc.impl.MoveVmUsingVmotionActivity,urn:uuid:6839cde5-cff4-424f-9d24-940b0b78ef45) activity=(com.vmware.ssdc.backend.services.impl.RelocateVmActivity,urn:uuid:dd9d8304-c6e8-45eb-8ef0-65a17b765785) activity=(com.vmware.vcloud.fabric.storage.storedVm.impl.RelocateStoredVmByStorageClassActivity,urn:uuid:ff420cab-5635-45a0-bdf0-fee6fe33eb7e) activity=(com.vmware.vcloud.fabric.storage.storedVm.impl.RelocateStoredVmByDatastoreActivity,urn:uuid:6cde303d-ae82-4a3c-9ecc-cceb0efcdbe9) activity=(com.vmware.vcloud.val.internal.impl.RelocateVmActivity,urn:uuid:73b44746-52c3-4143-bb29-1d22b532cc55)
com.vmware.ssdc.library.exceptions.GenericVimFaultException: Underlying system error: com.vmware.vim.binding.vim.fault.SSLVerifyFault
        at com.vmware.ssdc.library.vim.LmVim.createGenericVimFaultException(LmVim.java:329)
        at com.vmware.ssdc.library.vim.LmVim.Convert(LmVim.java:445)
        at com.vmware.ssdc.library.vim.LmVim.Convert(LmVim.java:499)
        at com.vmware.vcloud.val.taskmanagement.AsynchronousTaskWaitActivity.getResultIfTaskAlreadyCompleted(AsynchronousTaskWaitActivity.java:449)
        at com.vmware.vcloud.val.taskmanagement.AsynchronousTaskWaitActivity$InitialPhase.invoke(AsynchronousTaskWaitActivity.java:123)
        at com.vmware.vcloud.activity.executors.ActivityRunner.runPhase(ActivityRunner.java:175)
        at com.vmware.vcloud.activity.executors.ActivityRunner.run(ActivityRunner.java:112)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: (vim.fault.SSLVerifyFault) {
   faultCause = null,
   faultMessage = null,
   selfSigned = false,
   thumbprint = B3:E8:35:77:E3:65:20:1A:BD:F2:DB:41:08:34:58:51:07:E2:AA:53
}

The system administrator thus must make sure that when there is more than one resource vCenter Server managed by VMware Cloud Director that there is mutual trust across all such vCenter Server instances. To reiterate, this must be ensured at vCenter Server level – no change is needed in VMware Cloud Director.

Typically, vCenter Servers use their own VMCA to issue their (VC and ESXi) certificates. Each such VMCA certificate must be imported to the other vCenter Server. The alternative is to use publicly signed CA instead of the self-signed VMCA. Then no action would be needed. And finally if you use your own Enterprise CA to issue VC certificates, then each VC just must trust the root Enterprise CA.

vCenter Server provides UI for certificate management in vSphere Client (Administration > Certificates > Certificate Management).

The source VMCA certificate can be retrieved via the download.zip from http://<source-VC>/certs/download.zip, extracted and uploaded via the ablove ADD UI.

Alternatively, the same can be accomplished via API from Developer Center > API Explorer by GET chain from source VC and upload it by POST trusted-root-chains to the target VC.

The third alternative is to use PowerCLI which also provides a way to manage vSphere certificates. An example script would look like this:

$sourceVC="vc01.fojta.com"
$targetVC="vc02.fojta.com"
connect-viserver $sourceVC
connect-viserver $targetVC
$trustedCerts = Get-VITrustedCertificate -server $sourceVC -VCenterOnly
foreach ($Cert in $trustedCerts)
  {Add-VITrustedCertificate -PemCertificateOrChain $cert.CertificatePEM -server $sourceVC -VCenterOnly -Confirm:$false}

As the trust must be mutual, if you have five resource vCenter Server each should have at least five trusted root certificates (one its own VMCA and four for the other VCs).

VMware Cloud Provider Lifecycle Manager 1.4

VMware has just released VMware Cloud Provider Lifecycle Manager version 1.4. I have blogged about the product in the past. I have been testing the version 1.4 in my lab and must say it is now significantly easier to use to deploy, manage new or brown field deployments of the VMware Cloud Director stack.

The UI has now almost full parity with the API. So it is now very easy to use the product from the UI for single environment management (at scale you would probably still want to use API). Additionally, existing VCD environments (either deployed buy VCPLCM or any other way) can be registered under management of VCPLCM. You will just need to provide a few basic information (solution endpoint address, passwords, etc.) which allows VCPLCM to run the discovery process and collect additional details (nodes, certificates, integrations, etc.). Once that is done you simply finish the registration by supplying missing details such as integration passwords or node names and their network port groups in VC inventory that could not be autodiscovered.

With VCD under management of VCPLCM the most annoying operations such as upgrades, certificate replacements or node additions/redeployments are very simple to perform and take just a few minutes.

Give it a try!

Metering External Network Traffic with NSX-T

It is common for service providers to meter internet traffic consumption of their tenants. In NSX-V environment you could get any interface statistics of an Edge Service Gateway via NSX API or vCloud Director Networking API:
GET /api/4.0/edges/{edgeId}/statistics/interfaces

Let us explore the ways you can do similar metering in NSX-T environment.

We have three points where we can meter external traffic:

  1. On Tier-0 or VRF uplink (if such object is dedicated to the tenant). The API to use is:

    GET https://{{host}}/policy/api/v1/infra/tier-0s/{tier-0 name}/locale-services/default/interfaces/{uplink-name}/statistics

    If the Tier-0/VRF is stretched across multiple Edge nodes you will need to aggregate the data across multiple uplinks.
  2. On Tier-0/VRF router downlink to a specific Tier-1 GW. This is useful if the Tier-0 is shared across multiple tenants and their Tier-1 GWs. The API is not documented in the API docs.

    GET https://{{host}}/policy/api/v1/infra/tier-0s/{tier-0 name}/tier-1-interface/statistics/summary?tier1_path=/infra/tier-1s/{tier-1 name}
  3. On CSP (Service Interface) of a Tier-1 (or Tier-0) GW for use cases where a particular VLAN is connected directly to tenant GW. The API to use is quite long and must include the enforcement point (edge node backing active Tier-1 GW)

    GET https://{{host}}/policy/api/v1/infra/tier-1s/{tier-1 name}/locale-services/{locale id}/interfaces/{CSP interface id}/statistics?enforcement_point_path=/infra/sites/default/enforcement-points/default&edge_path=/infra/sites/default/enforcement-points/default/edge-clusters/{edge-cluster-id}/edge-nodes/{edge-node-id}