I feel like it is time for another update on VMware Cloud Director (VCD) capabilities for establishing an L2 VPN between an on-prem location and an Org VDC. The previous blog posts were written in 2015 and 2018 and do not reflect the changes related to the use of NSX-T as the underlying cloud network platform.
The primary use case for L2 VPN to the cloud is workload migration, where the L2 VPN tunnel is established only temporarily until all VMs on a single network have been migrated. The secondary use case is Disaster Recovery, but I feel that running L2 VPN permanently is not the right approach.
But that is not the topic of today’s post. VCD has supported setting up L2 VPN on the tenant’s Org VDC Gateway (Tier-1 GW) since version 10.2, however it is still a hidden, API-only feature (the GUI is finally coming soon … in VCD 10.3.1). The actual setup is not trivial, as the underlying NSX-T technology first requires an IPSec VPN tunnel to be established to secure the communication between the L2 VPN client and server. VMware Cloud Director Availability (VCDA) version 4.2 is an add-on disaster recovery and migration solution for tenant workloads on top of VCD, and it simplifies the setup of both the server (cloud) and client (on-prem) L2 VPN endpoints from its own UI. To reiterate, VCDA is not needed to set up L2 VPN, but it makes it much easier.
The screenshot above shows the VCDA UI plugin embedded in the VCD portal. You can see that three L2 VPN sessions have been created on VDC Gateway GW1 (NSX-T Tier-1 backed) in the ACME-PAYG Org VDC. Each session uses a different L2 VPN client endpoint type.
The on-prem client can be an existing NSX-T Tier-0 or Tier-1 GW, an NSX-T autonomous edge or a standalone Edge client. Each requires a different type of configuration, so let me discuss each separately.
NSX-T Tier-0 or Tier-1 Gateway
This is mostly suitable for tenants who are already running an NSX-T environment on-prem. They will need to set up both the IPSec and L2 VPN tunnels directly in NSX-T Manager, which makes this the most complicated process of the three options. On either the Tier-0 or Tier-1 GW they will first need to set up the IPSec VPN and L2 VPN client services. Then the L2 VPN session must be created with the local and remote endpoint IPs and the Peer Code, which must be retrieved beforehand via the VCD API (it is not available in the VCDA UI, but will be available in the VCD UI in 10.3.1 or newer). The peer code contains all the necessary configuration for the parent IPSec session in Base64 encoding.
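Because the peer code is Base64 encoded, it can be inspected before it is pasted into NSX-T Manager. The snippet below is a minimal sketch that assumes the decoded content is UTF-8 text. The helper name `decode_peer_code` and the sample value with its keys are made up for illustration and do not represent the actual NSX-T peer code format.

```python
import base64
import json


def decode_peer_code(peer_code: str) -> str:
    """Decode a Base64 peer code into text.

    Assumption: the decoded bytes are UTF-8 text. Whitespace and line breaks
    that may be introduced by copy and paste are stripped first.
    """
    compact = "".join(peer_code.split())
    return base64.b64decode(compact, validate=True).decode("utf-8")


# Hypothetical sample only; the real peer code structure is defined by NSX-T.
sample_config = {"peer_address": "203.0.113.10", "note": "illustrative"}
sample_peer_code = base64.b64encode(
    json.dumps(sample_config).encode("utf-8")
).decode("ascii")

decoded = decode_peer_code(sample_peer_code)
print(decoded)
```

With `validate=True` the decoder rejects characters outside the Base64 alphabet, which helps to catch a truncated or mangled copy of the code early.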
Standalone Edge Client
This option leverages the very light (150 MB) OVA appliance that can be downloaded from the NSX-T download website and actually works with both NSX-V and NSX-T L2 VPN server endpoints. It does not require any NSX installation. It provides no UI and its configuration must be done at the time of deployment via OVF parameters. Again, the peer code must be provided.
Autonomous Edge
This is the preferred option for non-NSX environments. The autonomous edge is a regular NSX-T edge node that is deployed from an OVA, but is not connected to NSX-T Manager. During the OVA deployment the Is Autonomous Edge checkbox must be checked. It provides its own UI and much better performance and configurability. Additionally, the client tunnel configuration can be done via the VCDA on-premises appliance UI. You just need to deploy the autonomous edge appliance; VCDA will discover it and let you manage it from then on via its UI.
With this option there is no need to retrieve the Peer Code, as the VCDA plugin will retrieve all necessary information from the cloud site.
After a successful migration from NSX-V to NSX-T in VMware Cloud Director you might wish to unregister NSX-V Manager and completely delete it from the vCenter. This is not so easy, as the whole VCD model was built on the assumption that vCenter Server and NSX-V Manager are tied together and thus retired together as well. This is obviously no longer the case, as you can now use NSX-T backed PVDCs or not use NSX at all (new feature as of VCD 10.3).
VMware Cloud Director adds API support to unregister NSX-V Manager without removing the vCenter Server. To do so you need to use the OpenAPI PUT VirtualCenter call. You will first have to run a GET call with the VC URN to retrieve its current configuration payload, remove the nsxVManager element and then PUT it back.
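The payload manipulation between the GET and the PUT can be sketched as follows. This is only an illustration of the "remove the nsxVManager element" step: the HTTP calls themselves are left out, the helper name is mine, and every field in the sample payload other than `nsxVManager` is a placeholder rather than the real VirtualCenter schema.

```python
import copy


def without_nsxv_manager(vc_payload: dict) -> dict:
    """Return a copy of the vCenter payload with the nsxVManager element removed."""
    updated = copy.deepcopy(vc_payload)
    updated.pop("nsxVManager", None)  # no error if the element is already gone
    return updated


# Hypothetical payload as it might come back from the GET call.
current = {
    "vcId": "urn:vcloud:vimserver:00000000-0000-0000-0000-000000000000",
    "name": "vc01",
    "nsxVManager": {"url": "https://nsxv.example.com", "username": "admin"},
}

to_put = without_nsxv_manager(current)
print(sorted(to_put))
```

Working on a copy keeps the original GET response around, which is handy for comparing it with the payload that is sent back in the PUT.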
In order for the NSX-V Manager removal to succeed you must make sure that:
Org VDCs using the vCenter Server do not have any NSX-V objects (VXLAN networks, Edge Gateways, vApp or DHCP Edges)
Org VDCs using the vCenter Server do not use VXLAN network pool
There is no VXLAN network pool managed by the to-be-removed NSX-V Manager
If all of the above is satisfied, you will not need to remove existing Provider VDCs (if they were using NSX-V in the past). They will become NSX-less (so you will not be able to use NSX-T objects in them). NSX-T backed PVDCs will not be affected at all.
The previous VMware Cloud Director 10.2 release brought many new networking features, and the current one, 10.3, continues in the same fashion. Let me give you a brief rundown.
The UI has been enhanced to surface formerly API only features such as the ability to configure dual stack IPv4/IPv6 networks:
or configure DHCP in gateway or network mode:
The service provider can now assign/change primary IP address of Org VDC Edge Gateway in the UI:
It is also possible to configure (extend) an external network port group backing without using API.
New NSX-T Backed Provider VDC Features
As NSX-T backed PVDCs now support both Tier-0/VRF and port group backing for external networks, the Tier-0/VRF GWs were separated into their own tab to avoid confusion.
The port group backed external networks can be either traditional VDS port groups, or NSX-T segments. The latter option gives the ability to use NSX-T distributed firewall on such external network (provider managed directly in NSX-T).
Distributed Firewall now supports dynamic groups that can be defined utilizing VM Tag or VM name.
vApps support routed vApp networks including DHCP service on vApp isolated networks. This is achieved by deploying standalone Tier-1 GWs that are connected to Org VDC networks via service interface. The Org VDC network must be overlay backed (not VLAN). vApp fencing is still not supported as NSX-T does not provide this functionality.
There are also a few additional small enhancements, ranging from support for Guest VLAN tagging and reflexive NAT to DHCP pool management.
Provider VDC with no NSX
The creation of a Provider VDC does not require a network pool specification anymore. Such a PVDC will thus not provide any NSX-V or NSX-T features (routing, DHCP, firewalling, load balancing). The Org VDC network can then be backed by a VLAN network pool or use VDS backed imported direct networks.
NSX-V vs NSX-T Feature Parity
Let me conclude with traditional NSX-V / NSX-T VCD feature comparison chart (new updates highlighted in green).
The tool’s main purpose is to automate the migration of VMware Cloud Director Organization Virtual Data Centers that are NSX-V backed to an NSX-T backed Provider Virtual Data Center. The original article describes exactly how this is accomplished and what the impact on migrated workloads is from the networking and compute perspective.
The migration tool is continually developed and additional features are added to either enhance its usability (improved roll back, simplified L2 bridging setup) or to support more use cases based on new features in VMware Cloud Director (VCD). And then there is a new assessment mode! Let me go into more details.
Directly Connected Networks
VCD release 10.2.2 added support for directly connected Organization VDC networks in NSX-T backed Org VDCs. Such networks are not connected to a VDC Gateway and instead are connected directly to a port group backed external network. The typical usage is for service networks, backup networks or colocation/MPLS networks where routing via the VDC Gateway is not desired.
The migration tool now supports migration of these networks. Let me describe how it is done.
The VCD external network in an NSX-V backed PVDC is port group backed. It can be backed by one or more port groups, which are typically manually created VLAN port groups in vCenter Server, or they can also be VXLAN backed (the system admin would create an NSX-V logical switch directly in NSX-V and then use its DVS port groups for the external network). The system administrator can then create a directly connected network in the Org VDC that is connected to this external network. It inherits its parent’s IPAM (subnet, IP pools), and when the tenant connects a VM to it, the VM is simply wired to the backing port group.
The migration tool first detects whether the migrated Org VDC direct network is connected to an external network that is also used by other VDCs, and behaves differently based on that.
Colocation / MPLS use case
If the external network is not used by any other Org VDC and the backing port group(s) are of VLAN type (if more port groups are used they must have the same VLAN), then the tool will create a logical segment in NSX-T in the VLAN transport zone (specified in the YAML input spec) and import it to the target Org VDC as an imported network. The reason why a direct connection to the external network is not used is to limit external network sprawl, as the import network feature perfectly matches the original use case intent. After the migration the source external network is not removed automatically, and the system administrator should clean it up, including the vCenter backing port groups, at their convenience.
Note that no bridging is performed between the source and target network as it is expected the VLAN is trunked across source and target environments.
The diagram below shows the source Org VDC on the left and the target one on the right.
Service Network Use Case
If the external network is used by other Org VDCs, the import VLAN segment method cannot be used, as each imported Org VDC network must be backed by its own logical segment and have its own IPAM (subnet, pool). In this case the tool will just create a directly connected Org VDC network in the target VDC, connected to the same external network as the source. This requires that the external network is scoped to the target PVDC. If the target PVDC is using a different virtual switch, you will first need to create a regular VLAN backed port group there and then add it to the external network (currently API only). Also, only VLAN backed port groups can be used, as no bridging is performed for such networks.
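The decision between the two use cases can be summarized in a few lines of code. This is my own simplified model of the behavior described above, not the migration tool's actual code. The class, field and return value names are hypothetical, and combinations that the text does not cover are reported as "not-described" instead of guessing what the tool does.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExternalNetwork:
    name: str
    # Number of Org VDCs that have direct networks on this external network.
    used_by_org_vdcs: int
    # VLAN ID of each backing port group; None means not VLAN backed (e.g. VXLAN).
    port_group_vlans: List[Optional[int]]


def direct_network_strategy(ext_net: ExternalNetwork) -> str:
    """Pick the target network type for a migrated direct Org VDC network."""
    vlans = ext_net.port_group_vlans
    all_vlan_backed = bool(vlans) and all(v is not None for v in vlans)

    if ext_net.used_by_org_vdcs <= 1:
        # Colocation / MPLS use case: dedicated external network, single VLAN.
        if all_vlan_backed and len(set(vlans)) == 1:
            return "import-vlan-segment"
        return "not-described"

    # Service network use case: external network shared with other Org VDCs.
    if all_vlan_backed:
        return "direct-to-same-external-network"
    return "not-described"


print(direct_network_strategy(ExternalNetwork("mpls-acme", 1, [120])))
print(direct_network_strategy(ExternalNetwork("backup", 4, [300, 300])))
```

In both supported branches no bridging is involved, which is why only VLAN backed port groups qualify.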
The other big feature is the assessment mode. The main driver for this feature is to enable service providers to see how ready their environment is for the NSX-V to NSX-T migration and how much redesign will be needed. The assessment can be triggered against VCD 10.0, 10.1 or 10.2 environments and only requires VCD API access (the environment does not yet need to be prepared for NSX-T).
During the assessment the tool will check all, or a specified subset of, NSX-V backed Org VDCs and assess every feature used there that impacts migration viability. It will then provide a detailed and a summarized report where you can see what ratio of the environment *could* be migrated (once upgraded to the latest VCD 10.2.2). This is provided in Org VDC, VM and used RAM units.
The picture below shows example of the summary report:
Note that if there is a single vApp in a particular Org VDC that cannot be migrated, the whole Org VDC is counted as not possible to migrate (in all metrics, including VM and RAM). Some features are categorized as blocking: they are simply not supported by either the NSX-T backed Org VDC or the migration tool (yet). Other issues can be mitigated or fixed (see the remediation recommendations in the user guide).
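To illustrate how this all-or-nothing counting affects the summary, here is a small sketch that computes the three ratios from made-up inventory data. It is not the tool's report code; the class names, fields and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class VApp:
    vm_count: int
    ram_gb: int
    migratable: bool


@dataclass
class OrgVdc:
    name: str
    vapps: List[VApp]


def migratable_ratios(org_vdcs: List[OrgVdc]) -> Dict[str, float]:
    """Return the migratable share of Org VDCs, VMs and RAM.

    One non-migratable vApp makes the whole Org VDC (all of its VMs and RAM)
    count as not migratable.
    """
    totals = {"org_vdcs": 0, "vms": 0, "ram_gb": 0}
    ok = dict(totals)
    for vdc in org_vdcs:
        size = {
            "org_vdcs": 1,
            "vms": sum(v.vm_count for v in vdc.vapps),
            "ram_gb": sum(v.ram_gb for v in vdc.vapps),
        }
        vdc_ok = all(v.migratable for v in vdc.vapps)
        for key, value in size.items():
            totals[key] += value
            if vdc_ok:
                ok[key] += value
    return {k: (ok[k] / totals[k] if totals[k] else 0.0) for k in totals}


inventory = [
    OrgVdc("vdc-a", [VApp(2, 16, True), VApp(3, 32, True)]),
    OrgVdc("vdc-b", [VApp(4, 64, True), VApp(1, 8, False)]),
]
print(migratable_ratios(inventory))
# → {'org_vdcs': 0.5, 'vms': 0.5, 'ram_gb': 0.4}
```

In this example a single one-VM vApp in vdc-b removes five VMs and 72 GB of RAM from the migratable totals, so the ratios can look more pessimistic than the number of actual blockers suggests.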
As mentioned, the migration tool is continuously developed and improved. Together with the next VMware Cloud Director version we can expect additional coverage of currently unsupported features. Shared network support in particular is high on the radar.
VMware Cloud Provider Lifecycle Manager is a new product just released in version 1.1. Version 1.0 was not generally available and thus not widely known. Let me therefore briefly describe what it is and what it can do.
As the name indicates, its main goal is to simplify the deployment and lifecycle of VMware’s Cloud Provider solutions. Currently in scope are:
VMware Cloud Director (10.1.x or 10.2.x)
Usage Meter (4.3 and 4.4)
vRealize Operations Tenant App (2.4 and 2.5)
RabbitMQ (Bitnami based)
The product itself ships as a stateless Docker image that can be deployed as a container, for example in a Photon OS VM. It has no GUI, but provides a REST API. The API calls support the following actions:
Deployment of an environment that can consist of one or more products (VCD, UM, …)
The whole environment (or its product subset) is described in JSON format that is supplied in the API payload. The example below shows a payload to deploy VCD with three cells; it includes the necessary certificates, the target vSphere environment and integration with vSphere, NSX-T and RabbitMQ, including creation of a Provider VDC.
The JSON payload structure is similar for other products. It starts with the environment definition and then follows with a specific product and its product type (VCD, RMQ, TenantApp, Usage Meter). Each has its own set of properties. The Integrations section defines, for example, which tenant VC and NSX should be registered, RabbitMQ etc. Then follows the description of each node to be deployed, referring to the Deployment Infrastructure section that is at the end of the JSON and describes the vSphere environment where the nodes can be deployed.
During the bring-up the Lifecycle Manager will perform various tests and validations to see if the payload is correct and if the referenced environments are accessible. Then it will go on with the actual deployment process. For that it needs to have access to a file repository of OVA images (for the bring-up) or patch/upgrade files (for lifecycle). These must be manually downloaded to the Docker VM or mounted via NFS.
For day 2 operations (certificate changes, node manipulations, etc.) an environment must first be imported (as mentioned before, the Lifecycle Manager is stateless and forgets everything when rebooted). During the import the same payload as for the deployment is provided, and checks are performed that the actual environment matches the imported one. Once the state is in the container memory, day 2 commands can be run. And a six cell VMware Cloud Director deployment can be upgraded with a single API call!
The actual architecture of the deployment is quite flexible. The Lifecycle Manager itself does not prescribe or deploy any networks, load balancers or NFS shares. All of those must be prepared up front. I have tested deployment on top of VMware Cloud Foundation 4 (see here) but that is not a hard requirement. Brownfield environments are not supported, but nothing is really stopping you from trying to describe your existing environment in the JSON and importing it.
If you plan to deploy and manage VMware Cloud Director at scale, give it a try. And as this is the first public release, we have a lot to look forward to in the future.