NSX L2 Bridging Options

I have recently had multiple discussions about NSX and its Layer 2 bridging capabilities with various service providers. Let me summarize some important points and considerations about when to use which option.

Why?

Let’s start with a simple question – why would you need Layer 2 bridging? Here are some use cases:

  • The end user wants to burst their application to the cloud but needs to keep certain components on-site, and because it is a legacy application it cannot be re-IP’d or requires single subnet communication.
  • The service provider is building a new cloud offering next to its legacy business (colocation, managed services) and wants to enable existing customers to migrate or extend their workloads to the cloud seamlessly (no IP address changes).

How?

NSX offers three ways to bridge Layer 2 networks.

Layer 2 VPN

This is a proprietary VPN solution which creates an encrypted tunnel across IP networks between Edge Services Gateways and stitches together one or more L2 networks. These Edge Gateways can be deployed in different management domains and there is also the option of deploying a standalone Edge which does not require an NSX license. This is great for the cloud bursting use case. I have blogged about L2VPN in the past here.

While this option is very flexible, it is also quite CPU intensive for both the L2 VPN client and server Edge VMs. This option provides up to 2 Gbps of throughput.

NSX Native L2 Bridging

The L2 bridge is created in the ESXi VMkernel by deploying a Logical Router Control VM. The Control VM is used only for the bridge configuration and for pinning it to a particular ESXi host. As the bridging happens in the VMkernel, it is possible to achieve impressive line-rate (10 Gbps) throughput.

It is possible to bridge only a VXLAN-based logical switch with a VLAN-based port group. The same physical uplink must be utilized, which means that the VLAN port group must be on the same vSphere Distributed Switch (vDS) that is prepared with the VXLAN VTEP and where the VXLAN logical switch port groups are created.

VLAN and VXLAN portgroups are on the same vDS

This fact prohibits a scenario where you would have an Edge Cluster with multiple pairs of uplinks connected to separate vDS switches – one for VLAN-based traffic and the other for VXLAN traffic. You cannot create an NSX native L2 bridge instance between two vDS switches.

This is especially important for the colocation use case mentioned at the beginning. In order to use the L2 bridge, the customer VLAN must be connected to the Edge Cluster top-of-rack pair of switches.

If this is not possible, as a workaround the service provider can use the L2 VPN option – it is even possible to run both the L2 VPN server and client Edges on the same host connected through a transit VXLAN network, where one Edge is connected to a trunk with VLAN networks from one vDS and the other to a trunk with VXLAN networks on another vDS. Unfortunately this has a performance impact (even if NULL-MD5 encryption is used) and should be used only for temporary migration use cases.

L2VPN interfaces

Hardware VTEP

The last bridging option discussed is a new feature of NSX 6.2. It is possible to extend the VXLAN logical switch all the way to a compatible hardware device (a switch from a VMware partner) that acts as a Layer 2 gateway and bridges the logical switch with a VLAN network. The device performing the function of the hardware VTEP is managed from NSX via the OVSDB protocol, while the control plane is still managed by the NSX Controllers. More details are in the following white paper.

As this option requires new, dedicated and NSX-compatible switching hardware, it is more useful for permanent use cases.

vCloud Networking and Security Upgrade to NSX in vCloud Director Environments

Just a short post to link to a new whitepaper I wrote about the upgrade of vCloud Networking and Security to NSX in vCloud Director environments.

It discusses:

  • interoperability and upgrade path
  • impact of network virtualization technologies (Nexus 1000V, VCDNI)
  • migration considerations
  • migration scenarios with minimal production impact

VMware vCloud Director® relies on VMware vCloud® Networking and Security or VMware NSX® for vSphere® to provide abstraction of the networking services. Until now, both platforms could be used interchangeably because they both provide the same APIs that vCloud Director uses to provide networks and networking services.
The vCloud Networking and Security platform end-of-support (EOS) date is 19 September 2016. Only NSX for vSphere will be supported with vCloud Director after the vCloud Networking and Security end-of-support date.
To secure the highest level of support and compatibility going forward, all service providers should migrate from vCloud Networking and Security to NSX for vSphere. This document provides guidance and considerations to simplify the process and to understand the impact of changes to the environment.
NSX for vSphere provides a smooth, in-place upgrade from vCloud Networking and Security. The upgrade process is documented in the corresponding VMware NSX Upgrade Guides (versions 6.0, 6.1, 6.2). This document is not meant to replace these guides. Instead, it augments them with specific information that applies to the usage of vCloud Director in service provider environments.

read more

 

Reboot All Hosts in vCloud Director

vCloud Director based clouds support non-disruptive maintenance of the underlying physical hosts. They can be patched, upgraded or completely exchanged without any impact on the customer workloads – all thanks to vMotion and DRS Maintenance Mode, which can evacuate all running, suspended or powered-off workloads from an ESXi host.

Many service providers are going to be upgrading their networking platform from vCloud Networking and Security (vCNS) to NSX. Besides upgrading the Manager and deploying new NSX Controllers, this upgrade requires installing new NSX VIBs on all hosts, which results in the need to reboot every host in the service provider environment.

Depending on the number of hosts, their size and the vMotion network throughput, evacuating each host can take 5-10 minutes and the reboot can add another 5 minutes. A sequential reboot of 200 hosts could therefore take roughly 200 × 15 minutes, i.e. about 50 hours – a full weekend-long maintenance window. However, as I mentioned, these reboots can be done non-disruptively without any impact on customers – so no maintenance window is necessary and no SLA is breached.

So how do you properly reboot all hosts in a vCloud Director environment?

While vSphere maintenance mode helps, it is important to properly coordinate it with vCloud Director.

  • Before a host is put into vSphere maintenance mode it should be disabled in vCloud Director, which makes sure vCloud Director does not try to communicate with the host, for example for image uploads.
  • All workloads (not just running VMs) must be evacuated during the maintenance mode. Otherwise a customer who decides to power on or clone a VM that is registered to a rebooting (and temporarily unavailable) host would be impacted.

So here is the correct process (omitting the parts that actually lead to the need to reboot the hosts):

  1. Make sure that the cluster has enough capacity to temporarily run without one host (it is very common to have at least N+1 HA redundancy)
  2. Disable host in vCloud Director
  3. Put host into vSphere maintenance mode while evacuating all running, suspended and powered-off VMs
  4. Reboot host
  5. When the host comes up, exit the maintenance mode
  6. Enable host
  7. Repeat with other hosts

As a quick proof of concept I am attaching a PowerCLI script that automates this. It needs to talk to both vCloud Director and vCenter Server, therefore replace the Connect strings at the beginning to match your environment.

## Connect to vCloud Director and all vCenter Servers it manages
Connect-CIServer -Server vcloud.gcp.local -User Administrator -Password VMware1!
Connect-VIServer -Server vcenter.gcp.local -User Administrator -Password VMware1!

## Get all ESXi hosts known to vCloud Director
$ESXiHosts = Search-Cloud -QueryType Host
foreach ($ESXiHost in $ESXiHosts) {
    $CloudHost = Get-CIView -SearchResult $ESXiHost
    Write-Host
    Write-Host "Working on host" $CloudHost.Name
    Write-Host "Disabling host in vCloud Director"
    $CloudHost.Disable()
    Write-Host "Evacuating host"
    Set-VMHost $CloudHost.Name -State Maintenance -Evacuate | Out-Null
    Write-Host "Rebooting host"
    Restart-VMHost $CloudHost.Name -Confirm:$false | Out-Null
    Write-Host -NoNewline "Waiting for host to come online "
    ## Wait until the host goes down after the reboot command ...
    do {
        Start-Sleep -Seconds 15
        $HostState = (Get-VMHost $CloudHost.Name).ConnectionState
        Write-Host -NoNewline "."
    } while ($HostState -ne "NotResponding")
    ## ... and then until it comes back up, still in maintenance mode
    do {
        Start-Sleep -Seconds 15
        $HostState = (Get-VMHost $CloudHost.Name).ConnectionState
        Write-Host -NoNewline "."
    } while ($HostState -ne "Maintenance")
    Write-Host
    Write-Host "Host rebooted"
    Set-VMHost $CloudHost.Name -State Connected | Out-Null
    Write-Host "Enabling host in vCloud Director"
    $CloudHost.Enable()
}

PowerCLI output

Unattended Installation of vCloud Director

In vCloud Director 8.0 many enhancements were made to enable unattended installation. This is useful to eliminate manual steps, speed up the installation process and ensure identical configuration across multiple vCloud Director instances.

Let’s say the provider needs to deploy multiple vCloud Director instances, each consisting of multiple cells. Here is the process in high-level steps.

Preparation of base template

  • Create a Linux VM with a supported RHEL/CentOS distribution.
  • Upload the vCloud Director binaries to the VM (e.g. vmware-vcloud-director-8.0.0-3017494.bin)
  • Execute the installation file without running the configure script
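
For illustration, this last step could look like the following sketch – the assumption being that the installer finishes by asking whether to run the configuration script and accepts the answer on standard input (the same technique as the piped answers to the configure script further below):

    # Make the uploaded binary executable and run the installation,
    # answering "n" so the configure script is NOT run in the base template
    chmod u+x vmware-vcloud-director-8.0.0-3017494.bin
    echo "n" | ./vmware-vcloud-director-8.0.0-3017494.bin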

Prerequisites for each vCloud Director Instance

The following must be prepared for each vCloud Director instance <N>:

  • Create database:
    • DB name: vcloudN
    • DB user: vcloudN
    • DB password: VMware1!
  • Prepare NFS transfer share
  • Create DNS entries, the load balancer configuration and the corresponding signed certificates for http and consoleproxy, and save them to a keystore file certificates.ks. In my example I am using the keystore password passwd.
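
As an illustration only, here is a minimal sketch of preparing such a keystore with self-signed certificates using the Java keytool (in production you would import the CA-signed certificates instead; the http and consoleproxy aliases and the JCEKS keystore type are what the configure script expects):

    # Create a JCEKS keystore holding self-signed certificates for both endpoints
    keytool -keystore certificates.ks -storetype JCEKS -storepass passwd -genkey -keyalg RSA -alias http
    keytool -keystore certificates.ks -storetype JCEKS -storepass passwd -genkey -keyalg RSA -alias consoleproxy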

Unattended Installation of the First Cell

  • Deploy the base template and assign two front-end IP addresses. These must match the load balancer configuration, e.g. 10.0.2.98, 10.0.2.99
  • Mount NFS transfer share to /opt/vmware/vcloud-director/data/transfer
  • Upload certificates to /opt/vmware/vcloud-director/etc/certificates.ks
  • Run the configure script – notice the piping of the “Y” answer to start the vCloud Director service after the configuration:
    echo "Y" | /opt/vmware/vcloud-director/bin/configure -cons 10.0.2.98 -ip 10.0.2.99 -dbhost 10.0.4.195 -dbport 1433 -dbtype sqlserver -dbinstance MSSQLSERVER -dbname vcloudN -dbuser vcloudN -dbpassword 'VMware1!' -k /opt/vmware/vcloud-director/etc/certificates.ks -w passwd -loghost 10.0.4.211 -logport 514 -g -unattended

    where 10.0.4.195 is the IP address of my MS SQL database server and 10.0.4.211 of my syslog server.

  • Store the /opt/vmware/vcloud-director/etc/responses.properties file created by the initial configuration in a safe place.
  • Run the initial configuration to create the instance ID and system administrator credentials:
    /opt/vmware/vcloud-director/bin/cell-management-tool initial-config --email vcloudN@vmware.com --fullname Administrator --installationid N --password VMware1! --systemname vCloudN --unattended --user administrator
    where N is the installation ID.
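
To quickly verify that the first cell came up correctly, you can watch the cell log and check the service status – a simple sketch, assuming the default installation paths:

    # Watch the cell log until application initialization reports complete
    tail -f /opt/vmware/vcloud-director/logs/cell.log
    # Check that the vCloud Director service is running
    service vmware-vcd status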

Unattended Installation of Additional Cells

vCloud Director cells are stateless; all necessary information is in the vCloud Director database. All we need is the responses.properties file from the first cell, which contains the encrypted information needed to connect to the database.

  • Deploy the base template and assign two front-end IP addresses. These must match the load balancer configuration, e.g. 10.0.2.96, 10.0.2.97
  • Mount NFS transfer share to /opt/vmware/vcloud-director/data/transfer
  • Upload certificates to /opt/vmware/vcloud-director/etc/certificates.ks
  • Upload responses.properties file to /opt/vmware/vcloud-director/etc/responses.properties
  • Run the configure script – notice the piping of the “Y” answer to start the vCloud Director service after the configuration:
    echo "Y" | /opt/vmware/vcloud-director/bin/configure -r /opt/vmware/vcloud-director/etc/responses.properties -cons 10.0.2.96 -ip 10.0.2.97 -k /opt/vmware/vcloud-director/etc/certificates.ks -w passwd -unattended

Any additional configuration from now on can be done via the vCloud API.
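
For example, logging in to the vCloud API as the system administrator could look like the following sketch (the host name and credentials come from the examples above; the API version in the Accept header is an assumption and must match what your vCloud Director release supports):

    # Log in to the vCloud API; the x-vcloud-authorization header returned
    # in the response is then used to authenticate all subsequent calls
    curl -k -i -X POST -u 'administrator@System:VMware1!' \
        -H 'Accept: application/*+xml;version=9.0' \
        https://vcloud.gcp.local/api/sessions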

vCloud Director IPv6 Support

With the depletion of public IPv4 addresses, service providers are starting to consider offering IPv6 addresses for their tenants’ workloads. Let me describe the options related to IPv6 support for service providers that use vCloud Director.

In the new vCloud Architecture Toolkit (vCAT) document for Service Providers I have proposed a design for providing IPv6 services to tenants. So let me summarize the constraints:

  • Currently, vCloud Director tenants cannot assign IPv6 subnets to Org VDC or vApp networks.
  • As a consequence, tenants cannot use vCloud Director IP Address Management (IPAM) to assign IPv6 addresses to their VMs. However, IPv6 addresses can still be assigned from within the guest operating system.
  • vCloud Director deployed Edge Gateways do not support IPv6. This means the internal and external interfaces of the Edges need to have IPv4 addresses.
  • vCloud Director relies on vCloud Networking and Security (vCNS) or NSX components for network services. vCNS does not support IPv6, however NSX does. vCNS will soon go out of support anyway.

The proposed design that works around the above limitations is as follows. Let me paste the picture from the linked vCAT document:

Provider managed tenant Edge GW

The provider deploys an NSX Edge Services Gateway outside of vCloud Director (directly from the NSX GUI/API) and connects it to a VXLAN or VLAN based network which is then imported to vCloud Director as an external network. Both the Edge Gateway and the external network are dedicated to a particular tenant and managed by the provider.

The tenant can attach their workloads to an Org VDC network which is directly connected to the external network. As this tenant NSX Edge is managed externally, outside of the vCloud Director scope, it can offer the full set of services NSX provides – and among them are IPv6 services.

There is one undocumented cool feature that I recently discovered which enables even more IPv6 functionality for the tenant.

In fact, the service provider can assign an IPv6 subnet to the external network, and thus the tenant can use vCloud Director IPAM in a limited way. The tenant can manually assign an IPv6 address (IP Mode Static – Manual) to a VM network interface from the vCloud Director UI/API and let vCloud Director configure the VM networking through guest customization. vCloud Director even makes sure the IP address is unique.

Note: IP Mode Static – IP Pool is not supported as it is not possible to define an IPv6 IP pool.

Here is how to configure an IPv6 subnet on an external network:

  1. Create a vCloud Director external network (with an IPv4 subnet)
  2. Find the vCloud UUID of the external network, for example with the following API call (see the sketch after this list): GET /api/admin/extension/externalNetworkReferences
  3. Insert the gateway, prefix length, name servers and DNS suffix information into the vCloud Director database. You must create new entries in the config table with the following values:

    cat = network
    name = ipv6.<ext network UUID>.gateway | subnetprefixlength | nameserver1 | nameserver2 | dnssuffix
    value = <value of the network property>

    The following example is valid for a MS SQL database:

    external network UUID: 85f22674-7419-4e44-b48d-9210723a8e64
    subnet: fd6a:32b6:ab90::/64
    gateway IPv6 address: fd6a:32b6:ab90::1
    DNS 1: fd13:5905:f858:e502::208
    DNS 2: fd13:5905:f858:e502::209
    dns suffix: acme.fojta.com


    INSERT into config values ('network', 'ipv6.85f22674-7419-4e44-b48d-9210723a8e64.dnssuffix', 'acme.fojta.com', 0);
    INSERT into config values ('network', 'ipv6.85f22674-7419-4e44-b48d-9210723a8e64.nameserver1', 'fd13:5905:f858:e502::208', 0);
    INSERT into config values ('network', 'ipv6.85f22674-7419-4e44-b48d-9210723a8e64.nameserver2', 'fd13:5905:f858:e502::209', 0);
    INSERT into config values ('network', 'ipv6.85f22674-7419-4e44-b48d-9210723a8e64.subnetprefixlength', '64', 0);
    INSERT into config values ('network', 'ipv6.85f22674-7419-4e44-b48d-9210723a8e64.gateway', 'fd6a:32b6:ab90::1', 0);
  4. In the tenant Org VDC create Org VDC network directly connected to the external network.
  5. The tenant can now connect VMs to the Org VDC network and assign IPv6 addresses directly from UI (or API).
    Deploy template with IPv6

    VMs with IPv6 Addresses
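
As a sketch of the API call from step 2, the external network references (their UUIDs are part of the returned hrefs) can be listed with an authenticated session, similar to the API example in the unattended installation post above – the host name and the token value are placeholders:

    # List external networks; the UUID appears in each returned href
    curl -k -H 'Accept: application/*+xml;version=9.0' \
        -H 'x-vcloud-authorization: <token>' \
        https://vcloud.example.com/api/admin/extension/externalNetworkReferences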

Note that when using this provider-managed Edge Gateway concept, the external network is dedicated to a particular tenant. For scalability reasons it is recommended to use VXLAN-based external networks created directly in NSX. vCloud Director supports a maximum of 750 external networks.

The tenant cannot directly manage Edge Gateway services and must rely on the provider to configure them.