How to Migrate VMware Cloud Director from NSX-V to NSX-T (part 2)

This is an update of the original article of the same title published last year, How to Migrate VMware Cloud Director from NSX-V to NSX-T, and covers the enhancements in VMware NSX Migration for VMware Cloud Director version 1.2.1, which was released yesterday.

The tool’s main purpose is to automate the migration of VMware Cloud Director Organization Virtual Data Centers that are NSX-V backed to an NSX-T backed Provider Virtual Data Center. The original article describes exactly how this is accomplished and what the impact on migrated workloads is from the networking and compute perspectives.

The migration tool is continually developed, and additional features are added to either enhance its usability (improved rollback, simplified L2 bridging setup) or to support more use cases based on new features in VMware Cloud Director (VCD). And then there is a new assessment mode! Let me go into more detail.

Directly Connected Networks

The VCD release 10.2.2 added support for directly connected Organization VDC networks in NSX-T backed Org VDCs. Such networks are not connected to a VDC Gateway; instead they are connected directly to a port group backed external network. The typical usage is for service networks, backup networks or colocation/MPLS networks where routing via the VDC Gateway is not desired.

The migration tool now supports migration of these networks. Let me describe how it is done.

The VCD external network in an NSX-V backed PVDC is port group backed. It can be backed by one or more port groups that are typically manually created VLAN port groups in vCenter Server, or it can also be VXLAN backed (the system admin would create an NSX-V logical switch directly in NSX-V and then use its DVS port groups for the external network). The system administrator can then create in the Org VDC a directly connected network that is connected to this external network. It inherits its parent’s IPAM (subnet, IP pools), and when a tenant connects a VM to it, the VM is simply wired to the backing port group.

The migration tool first detects whether the migrated Org VDC direct network is connected to an external network that is also used by other VDCs, and behaves differently based on the result.

Colocation / MPLS use case

If the external network is not used by any other Org VDC and the backing port group(s) are of VLAN type (if multiple port groups are used they must share the same VLAN), the tool will create a logical segment in the NSX-T VLAN transport zone (specified in the YAML input spec) and import it into the target Org VDC as an imported network. Direct connection to an external network is not used in order to limit external network sprawl; the imported network feature perfectly matches the original use case intent. After the migration the source external network is not removed automatically, and the system administrator should clean it up, including the vCenter backing port groups, at their convenience.

Note that no bridging is performed between the source and target networks, as it is expected that the VLAN is trunked across the source and target environments.

The diagram below shows the source Org VDC on the left and the target one on the right.

Service Network Use Case

If the external network is used by other Org VDCs, the imported VLAN segment method cannot be used, as each imported Org VDC network must be backed by its own logical segment and have its own IPAM (subnet, pool). In this case the tool will simply create a directly connected Org VDC network in the target VDC, connected to the same external network as the source. This requires that the external network is scoped to the target PVDC – if the target PVDC uses a different virtual switch you will first need to create a regular VLAN backed port group there and then add it to the external network (API only currently). Also, only VLAN backed port groups can be used, as no bridging is performed for such networks.

Assessment Mode

The other big feature is the assessment mode. The main driver for this feature is to enable service providers to see how ready their environment is for the NSX-V to NSX-T migration and how much redesign will be needed. The assessment can be triggered against VCD 10.0, 10.1 or 10.2 environments and only requires VCD API access (the environment does not yet need to be prepared for NSX-T).

During the assessment the tool will check all (or a specified subset of) NSX-V backed Org VDCs and assess every feature used there that impacts migration viability. It then provides a detailed and a summarized report where you can see what ratio of the environment *could* be migrated (once upgraded to the latest VCD 10.2.2). The ratio is given in Org VDC, VM and used RAM units.

The picture below shows example of the summary report:

Note that if there is a single vApp in a particular Org VDC that cannot be migrated, the whole Org VDC is counted as not possible to migrate (in all metrics: Org VDC, VM and RAM). Some features are categorized as blocking – they are simply not supported by either NSX-T backed Org VDCs or the migration tool (yet) – but some issues can be mitigated/fixed (see the remediation recommendations in the user guide).
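The aggregation rule above can be sketched in a few lines of Python. The data model here is hypothetical – the real tool collects this information via the VCD API and its report is richer – but the "one blocked vApp blocks the whole Org VDC in all metrics" logic is the one described:

```python
def summarize(org_vdcs):
    """Aggregate migration readiness in Org VDC, VM and RAM units.

    Each Org VDC is a dict: {"name": str, "vapps": [{"vms": int,
    "ram_gb": int, "blocking_features": [str]}]}.  A single vApp with
    a blocking feature makes the whole Org VDC non-migratable.
    Returns {metric: (migratable, total)}.
    """
    total = {"org_vdcs": 0, "vms": 0, "ram_gb": 0}
    migratable = {"org_vdcs": 0, "vms": 0, "ram_gb": 0}
    for vdc in org_vdcs:
        vms = sum(v["vms"] for v in vdc["vapps"])
        ram = sum(v["ram_gb"] for v in vdc["vapps"])
        blocked = any(v["blocking_features"] for v in vdc["vapps"])
        total["org_vdcs"] += 1
        total["vms"] += vms
        total["ram_gb"] += ram
        if not blocked:
            migratable["org_vdcs"] += 1
            migratable["vms"] += vms
            migratable["ram_gb"] += ram
    return {k: (migratable[k], total[k]) for k in total}

report = summarize([
    {"name": "vdc1", "vapps": [
        {"vms": 4, "ram_gb": 32, "blocking_features": []}]},
    {"name": "vdc2", "vapps": [
        {"vms": 2, "ram_gb": 16, "blocking_features": ["unsupported-feature"]},
        {"vms": 6, "ram_gb": 64, "blocking_features": []}]},
])
# vdc2 has one blocked vApp, so all of its VMs and RAM count as non-migratable
```

Note how vdc2's second vApp has no blocking features of its own, yet its 6 VMs still count as non-migratable because the Org VDC is the unit of migration.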


As mentioned, the migration tool is continuously developed and improved. Together with the next VMware Cloud Director version we can expect additional coverage of currently unsupported features. Shared network support in particular is high on the radar.

VMware Cloud Provider Lifecycle Manager

VMware Cloud Provider Lifecycle Manager is a new product, just released in version 1.1. Version 1.0 was not generally available and is thus not widely known. Let me therefore briefly describe what it is and what it can do.

As the name indicates its main goal is to simplify deployment and lifecycle of VMware’s Cloud Provider solutions. Currently in scope are:

  • VMware Cloud Director (10.1.x or 10.2.x)
  • Usage Meter (4.3 and 4.4)
  • vRealize Operations Tenant App (2.4 and 2.5)
  • RabbitMQ (Bitnami based)

The product itself ships as a stateless Docker image that can be deployed as a container, for example in a Photon OS VM. It has no GUI, but provides a REST API. The API calls support the following actions:

  • Deployment of an environment that can consist of one or more products (VCD, UM, …)
  • Upgrade of an environment and product
  • Certificate management
  • Node management (adding, removing, redeploying nodes)
  • Integration management (integration of a specific product with others)

The image below shows most of the Postman Collection API calls available:

The whole environment (or a product subset of it) is described in JSON format that is supplied in the API payload. The example below shows a payload to deploy VCD with two cells; it includes the necessary certificates, the target vSphere environment and the integration with vSphere, NSX-T and RabbitMQ, including creation of a Provider VDC.

    {
        "environmentName": "{{vcd_env_id}}",
        "products": [
            {
                "properties": {
                    "installationId": 1,
                    "systemName": "vcd-1-vms",
                    "dbPassword": "{{password}}",
                    "keystorePassword": "{{password}}",
                    "clusterFailoverMode": "MANUAL",
                    "publicAddress": {
                        "consoleProxyExternalAddress": "{{vcd_lb_ip}}:8443",
                        "restApiBaseHttpUri": "http://{{vcd_lb_ip}}",
                        "restApiBaseUri": "https://{{vcd_lb_ip}}",
                        "tenantPortalExternalHttpAddress": "http://{{vcd_lb_ip}}",
                        "tenantPortalExternalAddress": "https://{{vcd_lb_ip}}"
                    },
                    "adminEmail": "",
                    "adminFullName": "admin",
                    "nfsMount": "{{vcd_nfs_mount}}"
                },
                "certificate": {
                    "product": {
                        "certificate": "{{vcd_cert}}",
                        "privateKey": "{{vcd_cert_key}}"
                    },
                    "restApi": {
                        "certificate": "{{vcd_cert}}"
                    },
                    "tenantPortal": {
                        "certificate": "{{vcd_cert}}"
                    }
                },
                "integrations": [
                    {
                        "integrationId": "vcd-01-to-vc-01",
                        "datacenterComponentType": "VCENTER",
                        "hostname": "{{vcenter_hostname}}.{{domainName}}",
                        "integrationUsername": "administrator@vsphere.local",
                        "integrationPassword": "{{vc_password}}",
                        "properties": {
                            "providerVdcs": {
                                "PVDC-1": {
                                    "description": "m01vc01-comp-rp",
                                    "highestSupportedHardwareVersion": "vmx-14",
                                    "isEnabled": true,
                                    "clusterName": "{{vc_cluster}}",
                                    "resourcePoolname": "{{pvdc_resource_pool}}",
                                    "nsxIntegration": "vcd-01-to-nsx-01"
                                }
                            }
                        }
                    },
                    {
                        "integrationId": "vcd-01-to-nsx-01",
                        "datacenterComponentType": "NSXT",
                        "hostname": "{{nsxt_hostname}}.{{domainName}}",
                        "integrationUsername": "admin",
                        "integrationPassword": "{{nsx_password}}",
                        "properties": {
                            "networkPools": {
                                "NP-1": "{{pvdc_np_transport_zone}}"
                            },
                            "vcdExternalNetworks": {
                                "EN-1": {
                                    "subnets": [
                                        {
                                            "gateway": "",
                                            "prefixLength": 24,
                                            "dnsServer1": "",
                                            "ipRanges": [
                                                {
                                                    "startAddress": "",
                                                    "endAddress": ""
                                                }
                                            ]
                                        }
                                    ],
                                    "description": "ExternalNetworkCreatedViaVCDBringup",
                                    "tier0Name": "{{pvdc_ext_nw_tier0_gw}}"
                                }
                            }
                        }
                    },
                    {
                        "integrationId": "vcd-01-to-rmq-01",
                        "productType": "RMQ",
                        "hostname": "{{rmq_lb_name}}.{{domainName}}",
                        "port": "{{rmq_port_amqp_ssl}}",
                        "integrationUsername": "svc_vcd",
                        "integrationPassword": "{{password}}",
                        "properties": {
                            "amqpExchange": "systemExchange",
                            "amqpVHost": "/",
                            "amqpUseSSL": true,
                            "amqpSslAcceptAll": true,
                            "amqpPrefix": "vcd"
                        }
                    }
                ],
                "productType": "VCD",
                "productId": "{{vcd_product_id}}",
                "version": "10.1.2",
                "license": "{{vcd_license}}",
                "adminPassword": "{{password}}",
                "nodes": [
                    {
                        "hostName": "{{vcd_cell_1_name}}.{{domainName}}",
                        "vmName": "{{vcd_cell_1_name}}",
                        "rootPassword": "{{password}}",
                        "gateway": "{{vcd_mgmt_nw_gateway}}",
                        "nics": [
                            {
                                "ipAddress": "{{vcd_cell_1_ip}}",
                                "networkName": "vcd-dmz-nw",
                                "staticRoutes": []
                            },
                            {
                                "ipAddress": "{{vcd_cell_1_mgmt_ip}}",
                                "networkName": "vcd-mgmt-nw",
                                "staticRoutes": []
                            }
                        ]
                    },
                    {
                        "hostName": "{{vcd_cell_2_name}}.{{domainName}}",
                        "vmName": "{{vcd_cell_2_name}}",
                        "rootPassword": "{{password}}",
                        "gateway": "{{vcd_mgmt_nw_gateway}}",
                        "nics": [
                            {
                                "ipAddress": "{{vcd_cell_2_ip}}",
                                "networkName": "vcd-dmz-nw",
                                "staticRoutes": []
                            },
                            {
                                "ipAddress": "{{vcd_cell_2_mgmt_ip}}",
                                "networkName": "vcd-mgmt-nw",
                                "staticRoutes": []
                            }
                        ]
                    }
                ]
            }
        ],
        "deploymentInfrastructures": {
            "infra1": {
                "vcenter": {
                    "vcenterName": "mgmt-vc",
                    "vcenterHost": "{{vcenter_hostname}}.{{domainName}}",
                    "vcenterUsername": "administrator@vsphere.local",
                    "vcenterPassword": "{{vc_password}}",
                    "datacenterName": "{{vc_datacenter}}",
                    "clusterName": "{{vc_cluster}}",
                    "resourcePool": "{{vc_res_pool}}",
                    "datastores": [],
                    "networks": {
                        "vcd-dmz-nw": {
                            "portGroupName": "{{vcd_dmz_portgroup}}",
                            "gateway": "{{vcd_dmz_gateway}}",
                            "subnetMask": "{{vcd_dmz_subnet}}",
                            "domainName": "{{domainName}}",
                            "searchPath": [],
                            "useDhcp": false,
                            "dns": [],
                            "ntp": []
                        },
                        "vcd-mgmt-nw": {
                            "portGroupName": "{{vcd_mgmt_nw_portgroup}}",
                            "gateway": "{{vcd_mgmt_nw_gateway}}",
                            "subnetMask": "{{vcd_mgmt_nw_subnet}}",
                            "useDhcp": false
                        }
                    }
                }
            }
        }
    }

The JSON payload structure is similar for other products. It starts with the environment definition and then follows with a specific product and its product type (VCD, RMQ, TenantApp, Usage Meter), each with its own set of properties. The integrations section defines, for example, which tenant vCenter Server and NSX instance should be registered, the RabbitMQ connection, etc. Then follows the description of each node to be deployed, which refers to the deployment infrastructure section at the end of the JSON describing the vSphere environment where the nodes are deployed.

During the bring up, the Lifecycle Manager performs a set of tests and validations to check that the payload is correct and the referenced environments are accessible. Then it proceeds with the actual deployment process. For that it needs access to a file repository of OVA images (for the bring up) or patch/upgrade files (for lifecycle operations). These must be manually downloaded to the Docker VM or mounted via NFS.
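To illustrate the kind of referential check such validation implies, the sketch below verifies that every node NIC in a payload references a network defined in the deploymentInfrastructures section. The network and VM names mirror the example payload above; CPLM's actual validation is of course far more extensive (credentials, reachability, versions, certificates, and so on):

```python
def check_network_references(payload):
    """Return (vmName, networkName) pairs where a node NIC references a
    network not defined in any deploymentInfrastructures entry."""
    defined = set()
    for infra in payload.get("deploymentInfrastructures", {}).values():
        defined.update(infra.get("vcenter", {}).get("networks", {}))
    problems = []
    for product in payload.get("products", []):
        for node in product.get("nodes", []):
            for nic in node.get("nics", []):
                if nic["networkName"] not in defined:
                    problems.append((node.get("vmName"), nic["networkName"]))
    return problems

# Trimmed payload: cell-1 uses two networks, but infra1 only defines one.
payload = {
    "products": [{"nodes": [{"vmName": "cell-1",
                             "nics": [{"networkName": "vcd-dmz-nw"},
                                      {"networkName": "vcd-mgmt-nw"}]}]}],
    "deploymentInfrastructures": {
        "infra1": {"vcenter": {"networks": {"vcd-dmz-nw": {}}}}},
}
problems = check_network_references(payload)
# "vcd-mgmt-nw" is referenced by cell-1 but not defined anywhere
```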

For day 2 operations (certificate changes, node manipulations, etc.) an environment must first be imported (as mentioned before, the Lifecycle Manager is stateless and forgets everything when rebooted). During the import the same payload as for deployment is provided, and checks are performed to verify that the actual environment matches the imported one. Once the state is in the container memory, day 2 commands can be run. A six cell VMware Cloud Director deployment can then be upgraded with a single API call!

The actual architecture of the deployment is quite flexible. The Lifecycle Manager itself does not prescribe or deploy any networks, load balancers or NFS shares. All of those must be prepared up front. I have tested deployment on top of VMware Cloud Foundation 4 (see here), but that is not a hard requirement. Brownfield environments are not supported, but nothing is really stopping you from trying to describe your existing environment in the JSON and importing it.

If you plan to deploy and manage VMware Cloud Director at scale, give it a try. And as this is the first public release, there is a lot to look forward to in the future.

Recovering NSX-T Manager from File System Corruption

One of our labs had a temporary storage issue which left two NSX-T Managers (separate instances of NSX-T installations) in a corrupted state. Here are some steps you can take to attempt to bring the NSX-T Manager appliance back to life. BTW these steps might work for Edge Nodes as well.

The issue starts with the appliance's file system being in read-only mode. After a reboot you will see a message:

The first step is to go into the appliance GRUB menu that appears briefly after start up, hit the e key, enter the root/VMware1 GRUB credentials (these are different from the regular appliance credentials) and edit the line starting with linux: replace ro with rw and delete the rest of the line.
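Illustratively, the edit looks like this (the exact kernel path and parameters on the appliance will differ):

```
linux /vmlinuz root=/dev/sda2 ro quiet ...   <- original line
linux /vmlinuz root=/dev/sda2 rw             <- after the edit
```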

Continue with the boot process by pressing Ctrl+x. Hopefully you are now able to get into the BusyBox shell and run fsck /dev/sda2 (or similar) to fix the corrupted partition. Reboot.

What can happen now is that the appliance boots but again finds LVM corruption and goes into emergency mode, where you see repeated Login incorrect messages.

Repeat the process with the GRUB edit. This time you will be asked to enter root password to go into maintenance mode.

Type the root password and follow this KB article by typing fsck /dev/mapper/nsx-tmp command. Reboot again.

Hopefully now the appliance starts properly.

What can also happen is that your root password has expired and you will not be able to enter maintenance mode. Although the official documentation has a process for resetting it, that process will not work in this case. The workaround is again to edit the linux line in the GRUB menu and replace ro with rw, but this time append init=/bin/bash. You should be able to get to a shell and reset the password with the passwd command.
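The edited GRUB line for the password reset would look roughly like this (kernel path illustrative):

```
linux /vmlinuz root=/dev/sda2 rw init=/bin/bash
```

After booting with this line, run passwd at the resulting shell to set a new root password, then reboot normally.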

Good luck with the recovery and do not forget to set up backup and disable password expiration.

Google Authentication with VMware Cloud Director (OAuth)

Several authentication mechanisms can be used for VMware Cloud Director users. Basic authentication is used for local (users stored in the VCD database) and LDAP users. SAML authentication can be used for integration with SAML compatible Identity Providers such as Microsoft AD FS, IBM Cloud Identity and VMware Workspace ONE Access (VIDM). OAuth authentication is supported as well, but because you (currently, as of VCD 10.2) have to use the API to configure it, it is not that widely known.

In this article I will show an example of such configuration with VMware Identity Manager (VIDM) and with Google Identity IdP. Yes, with VIDM you have the option to use SAML or OAuth.

By default, OAuth authentication can be enabled by the tenant at the Organization level and can co-exist with local, LDAP and SAML identity sources. The OAuth authentication endpoint must be reachable from the VCD cells. This is a big difference compared to SAML authentication, which is performed via an assertion token exchange in the browser (only the client browser needs to reach the SAML IdP). OAuth is therefore more suitable when public IdPs are used (e.g. Google) or provider managed ones (VCD cells can reach the IdP internally).

VMware Identity Manager OAuth Configuration

Note I am using VIDM version 3.3.

  1. In VIDM as admin go to Catalog, Settings, Remote App Access and create a new Client
  2. Create the client. Pick a unique Client ID, the redirect URL is<org name> or Generate the shared secret and select the Email, Profile, User and OpenID scopes.
  3. Now we need to find the OAuth endpoints and public key. In my VIDM configuration this can be found at This URL can differ based on the VIDM / Workspace ONE Access version.
    The address returns a JSON response from which we need: issuer, authorization_endpoint, token_endpoint, userinfo_endpoint, and the scopes and claims supported.
    The link to the public key is provided in jwks_uri ( We will need the key in PEM format, so you can either convert it (e.g. or specify PEM format in the link (&format=pem at the end of the URI). We will also need the KeyID (kid value) and key algorithm (kty).
  4. Now we have all the necessary information to configure OAuth in VCD. We will use the PUT /admin/org/{id}/settings/oauth API call. In the payload we provide all the data collected in steps #2 and #3. Here is an example I used:
    Note the OIDCAttributeMapping section. Here we must specify the claims providing more information about the user. VIDM currently does not support groups and roles, so those are hardcoded. You can see what user information is sent by accessing the UserInfoEndpoint. This can be done easily with Postman OAuth2 authentication, where you first obtain the Access Token (orange button) and then do a GET against the UserInfoEndpoint.
  5. Lastly we need to import some users. This is done with the POST /admin/org/{id}/users API call with ProviderType set to OAUTH.
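Step 3 above amounts to pulling a handful of fields out of the IdP's discovery document. The sketch below does exactly that; the embedded JSON is a trimmed, hypothetical example of what a VIDM discovery URL might return (all hostnames and paths are placeholders – in practice you would fetch the document over HTTPS from your IdP):

```python
import json

# Hypothetical, trimmed OIDC discovery document (hostnames are placeholders).
discovery_doc = json.loads("""
{
  "issuer": "https://vidm.example.com/SAAS/auth",
  "authorization_endpoint": "https://vidm.example.com/SAAS/auth/oauth2/authorize",
  "token_endpoint": "https://vidm.example.com/SAAS/auth/oauthtoken",
  "userinfo_endpoint": "https://vidm.example.com/SAAS/auth/userinfo",
  "jwks_uri": "https://vidm.example.com/SAAS/auth/token?attribute=publicKey",
  "scopes_supported": ["openid", "profile", "email", "user"],
  "claims_supported": ["sub", "email", "given_name", "family_name"]
}
""")

# The fields VCD's OAuth settings payload needs from the discovery document.
needed = ["issuer", "authorization_endpoint", "token_endpoint",
          "userinfo_endpoint", "jwks_uri"]
oauth_settings = {k: discovery_doc[k] for k in needed}
```

The key referenced by jwks_uri still has to be converted to PEM separately, as described in step 3.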

Now we can log in as the VIDM user.

Google Identity OAuth Configuration

  1. Head over to the Credentials section of Google APIs & Services:
  2. Create a Project, configure the Consent Screen, Scopes and test users
  3. Create an OAuth Client ID. Use the redirect URI<org name> or Note the generated Client ID and secret.
  4. Google OAuth endpoints and public keys can be retrieved from:
    You will need to get both public keys and convert them to PEM. Now we can configure OAuth in VCD.
PUT https://{{host}}/api/admin/org/b813a16e-6821-4dc5-994f-955b10155107/settings/oauth

<OrgOAuthSettings xmlns=""                     type="application/vnd.vmware.admin.organizationOAuthSettings+xml">
            <Key>-----BEGIN PUBLIC KEY-----
-----END PUBLIC KEY-----</Key>
            <Key>-----BEGIN PUBLIC KEY-----
-----END PUBLIC KEY-----</Key>
    <Scope>email profile openid</Scope>
  • With the same API as described in step 5 of the VIDM configuration, import your OAuth users.

Provider Networking in VMware Cloud Director

This is going to be a bit longer than usual and more of a summary / design option type blog post where I want to discuss provider networking in VMware Cloud Director (VCD). By provider networking I mean the part that must be set up by the service provider and that is then consumed by tenants through their Org VDC networking and Org VDC Edge Gateways.

With the introduction of NSX-T we also need to dive into the differences between NSX-V and NSX-T integration in VCD.

Note: The article is applicable to VMware Cloud Director 10.2 release. Each VCD release is adding new network related functionality.

Provider Virtual Datacenters

Provider Virtual Datacenter (PVDC) is the main object that provides compute, networking and storage resources for tenant Organization Virtual Datacenters (Org VDCs). When a PVDC is created it is backed by vSphere clusters that should be prepared for NSX-V or NSX-T. Also during PVDC creation the service provider must select which Network Pool is going to be used – VXLAN backed (NSX-V) or Geneve backed (NSX-T). A PVDC can thus be backed by either NSX-V or NSX-T (never both at the same time, and never neither), and the backing cannot be changed after the fact.

Network Pool

Speaking of Network Pools – they are used by tenants to create on-demand routed/isolated networks. Network Pools are independent of PVDCs and can be shared across multiple PVDCs (of the same backing type). There is an option to automatically create a VXLAN network pool during PVDC creation, but I would recommend against using it, as you lose the ability to manage the transport zone backing the pool on your own. A VLAN backed network pool can still be created but can be used only in a PVDC backed by NSX-V (the same goes for the very legacy port group backed network pool, now available only via API). Individual Org VDCs can (optionally) override the Network Pool assigned to their parent PVDC.

External Networks

Deploying virtual machines without the ability to connect to them via network is not that useful. External networks are VCD objects that Org VDC Edge Gateways connect to in order to reach the outside world – the internet, dedicated direct connections or the provider's service area. An external network has one or more associated subnets and IP pools that VCD manages and uses to allocate external IP addresses to connected Org VDC Edge Gateways.

There is a major difference in how external networks are created for NSX-V backed PVDCs and for NSX-T ones.

Port Group Backed External Network

As the name suggests, these networks are backed by one or more existing vCenter port groups that must be created upfront and are usually VLAN backed (but could be VXLAN port groups as well). These external networks are (currently) supported only in NSX-V backed PVDCs. An Org VDC Edge Gateway connected to such a network is represented by an NSX-V Edge Service Gateway (ESG) with an uplink in this port group. The uplinks are assigned IP address(es) from the allocated external IPs.

A directly connected Org VDC network connected to the external network can also be created (only by the provider), and VMs connected to such a network are wired directly to the port group.

Tier-0 Router Backed External Network

These networks are backed by an existing NSX-T Tier-0 Gateway or Tier-0 VRF (note that if you import a Tier-0 VRF into VCD you can no longer import its parent Tier-0, and vice versa). The Tier-0/VRF must be created upfront by the provider with the correct uplinks and routing configuration.

Only Org VDC Edge Gateways from an NSX-T backed PVDC can be connected to such an external network, and they are backed by Tier-1 Gateways. The Tier-1 – Tier-0/VRF transit network is auto-plumbed by NSX-T using the subnet. The allocated external network IPs are not explicitly assigned to any Tier-1 interface. Instead, when a service (NAT, VPN, load balancer) on the Org VDC Edge Gateway starts using an assigned external address, that address is advertised by the Tier-1 GW to the linked Tier-0 GW.

There are two main design options for the Tier-0/VRF.

The recommended option is to configure BGP on the Tier-0/VRF uplinks with the upstream physical routers. The uplinks are just redundant point-to-point transits. IPs assigned from any external network subnet are automatically advertised (when used) via BGP upstream. When the provider runs out of public IPs, you just assign an additional subnet. This makes the design very flexible, scalable and relatively simple.

Tier-0/VRF with BGP

An alternative is a design similar to the NSX-V port group approach, where the Tier-0 uplinks are directly connected to the external subnet port group. This can be useful when transitioning from NSX-V to NSX-T, where there is a need to retain routability between NSX-V ESGs and NSX-T Tier-1 GWs on the same external network.

The picture below shows that the Tier-0/VRF has uplinks directly connected to the external network and a static route towards the internet. The Tier-0 will proxy ARP requests for external IPs that are allocated and used by connected Tier-1 GWs.

Tier-0 with Proxy ARP

The disadvantage of this option is that you waste public IP addresses on the Tier-0 uplink and router interfaces for each subnet you assign.

Note: Proxy ARP is supported only if the Tier-0/VRF is in Active/Standby mode.

Tenant Dedicated External Network

If the tenant requires a direct link via MPLS or a similar technology, this is accomplished by creating a tenant dedicated external network. With an NSX-V backed Org VDC this is represented by a dedicated VLAN backed port group; with an NSX-T backed Org VDC it would be a dedicated Tier-0/VRF. Both provide connectivity to the MPLS router. With NSX-V the ESG would run BGP; with NSX-T the BGP would have to be configured on the Tier-0. In VCD the NSX-T backed Org VDC Gateway can be explicitly enabled in the dedicated mode, which gives the tenant (and also the provider) the ability to configure Tier-0 BGP.

There are separate rights for BGP neighbor configuration and route advertisement, so the provider can keep the BGP neighbor configuration as a provider managed setting.

Note that you can connect only one Org VDC Edge GW in the explicit dedicated mode. In case the tenant requires more Org VDC Edge GWs connected to the same (dedicated) Tier-0/VRF, the provider will not enable the dedicated mode and will instead manage BGP directly in NSX-T (as a managed service).

A frequently used scenario is when the provider directly connects an Org VDC network to such a dedicated external network without using an Org VDC Edge GW. This is, however, currently not possible in an NSX-T backed PVDC. Instead, you will have to import an Org VDC network backed by an NSX-T logical segment (overlay or VLAN).

Internet with MPLS

The last case I want to describe is when the tenant wants to access both the Internet and MPLS via the same Org VDC Edge GW. In an NSX-V backed Org VDC this is accomplished by attaching the internet and the dedicated external network port groups to the ESG uplinks and leveraging static or dynamic routing there. In an NSX-T backed Org VDC the provider will have to provision a Tier-0/VRF that has transit uplinks both to MPLS and the Internet. An external (Internet) subnet will be assigned to this Tier-0/VRF with a small IP Pool for IP allocation that should not clash with any other IP Pools.

If the tenant has the route advertisement right assigned, then route filters should be set on the Tier-0/VRF uplinks to allow only the correct prefixes to be advertised towards the Internet or the MPLS. The route filters can be configured either directly in NSX-T or in VCD (if the Tier-0 is explicitly dedicated).

The diagram below shows an example of an Org VDC that has two Org VDC Edge GWs, each having access to the Internet and MPLS. Org VDC GW 1 uses a static route to MPLS VPN B and also has the MPLS transit network accessible as an imported Org VDC network, while Org VDC GW 2 uses BGP to MPLS VPN A. Connectivity to the internet is provided by another layer of NSX-T Tier-0 GW, which allows usage of overlay segments as VRF uplinks and does not waste physical VLANs.

One comment on the usage of NAT in such a design. Usually the tenant wants to source NAT only towards the Internet, not towards the MPLS. On an NSX-V backed Org VDC Edge GW this is easily set on a per uplink interface basis. However, that option is not available on a Tier-1 backed Org VDC Edge GW, as it has only one transit towards the Tier-0/VRF. Instead, a NO SNAT rule with a destination must be used in conjunction with a SNAT rule.

An example:

NO SNAT: internal destination
SNAT: internal translated

The above example will source NAT the internal network only towards the internet.