vCloud Director 9: NSX Distributed Logical Router

vCloud Director version 9 introduces support for the last major missing NSX feature: the distributed logical router (DLR). The DLR provides optimized routing that is performed in a distributed fashion directly in the hypervisor, between different logical switches. The routing always happens in the hypervisor running the source VM, which means traffic traverses at most two ESXi hosts (source and destination) and no tromboning through a third host running a router VM is necessary. Read here for a technical deep dive into how this works. This not only provides much better performance than traditional Edge GW routing, but also scales up to 1000 routed logical networks (as opposed to 10 on an Edge GW, or up to 209 if a trunk port is enabled).

Generally, the DLR should be used for routing only between VXLAN-based logical switches, although NSX supports VLAN networks with certain caveats as well. Dynamic routing protocols are also supported and are managed by the Control VM of the DLR.

Now let’s look at how vCloud Director implements the DLR. The main focus was to make the DLR very simple to use and to integrate it seamlessly with the existing Org VDC networking concepts.

  • DLR is enabled on the Org VDC Edge Gateway, which must already be converted to advanced networking. You cannot use DLR without an Org VDC Edge Gateway! There must be one free interface on the Edge (you will see later on why).
  • Once DLR is enabled, a logical DLR instance is created in NSX in headless mode without a DLR Control VM (the instance is named in NSX vse-dlr-<GW name> (<UUID>)). vCloud Director can get away without the Control VM as dynamic routing is not necessary (see below).
  • The DLR instance uplink interface is connected to the Org VDC Edge GW with P2P connection using 10.255.255.248/30 subnet. The DLR uses .250 IP address and the Org VDC Edge GW uses .249. This subnet is hardcoded and cannot overlap with existing Org VDC Edge GW subnets. Obviously the Org VDC Edge GW needs at least one free interface.
  • The DLR has its default gateway set to the Org VDC Edge GW interface (10.255.255.249).
  • New Org VDC networks can now be created in the Org VDC with the choice to attach them to the Edge Gateway (as a regular interface or a subinterface in a trunk) or to attach them to the DLR instance.
    For each distributed Org VDC network a static route is created on the Org VDC Edge Gateway pointing to the DLR uplink interface. This means there is no need for dynamic routing protocols on the DLR instance.

    Static Routes on NSX Edge GW

The diagram below shows the networking topology of such a setup.

In the example you can see three Org VDC networks: one (blue) traditional network (10.10.10.0/24) attached directly to the Org VDC Edge GW and two (purple and orange) distributed networks (192.168.0.0/24 and 192.168.1.0/24) connected through the DLR instance. The P2P connection between the Org VDC Edge GW and the DLR instance is green.
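If you want to double check what vCloud Director configured, the static routes can be read back from the Org VDC Edge Gateway through the NSX API. A minimal sketch only: nsxmgr.corp.local, edge-57 and the credentials are placeholders for your NSX Manager and the NSX id of the Org VDC Edge GW.

# List the static routing configuration of the Org VDC Edge GW; for the topology above
# it should contain one route per distributed network (192.168.0.0/24 and 192.168.1.0/24),
# both with the DLR uplink 10.255.255.250 as the next hop
curl -k -u admin:<password> https://nsxmgr.corp.local/api/4.0/edges/edge-57/routing/config/static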

  • DHCP relay agents are automatically configured on the DLR instance for each distributed Org VDC network and point to the DHCP relay server, which is the Org VDC Edge GW interface (10.255.255.249). To enable the DHCP service for a particular distributed Org VDC network, a DHCP pool with the proper IP range just needs to be created manually on the Org VDC Edge Gateway (see the sketch below the screenshot). If Auto Configure DNS is enabled, DHCP will provide the IP address of the Org VDC Edge P2P interface to the DLR instance.

    DHCP Configuration of DLR pools on the Edge GW
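Similarly, the relay configuration that vCloud Director pushed to the DLR instance can be read back via the NSX API. A minimal sketch: edge-58 stands in for the NSX id of the vse-dlr instance and nsxmgr.corp.local for your NSX Manager.

# Show the DHCP relay configuration of the DLR instance; each distributed Org VDC network
# should be listed as a relay agent with the relay server 10.255.255.249
curl -k -u admin:<password> https://nsxmgr.corp.local/api/4.0/edges/edge-58/dhcp/config/relay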

Considerations

  • Up to 1000 distributed Org VDC networks can be connected to one Org VDC Edge GW (one DLR instance per Org VDC Edge GW).
  • Some networking features (such as L2 VPN) are not supported on the distributed Org VDC networks.
  • VLAN based Org VDC networks cannot be distributed. The Org VDC must use VXLAN network pool.
  • IPv6 is not supported by the DLR.
  • vApp routed networks cannot be distributed.
  • The tenant can override the automatic DHCP and static route configurations done by vCloud Director for distributed networks on the Org VDC Edge GW. The tenant cannot modify the P2P connection between the Edge and the DLR instance.
  • Disabling DLR on an Org VDC Edge Gateway is possible, but all distributed networks must be removed first.
  • Both enabling and disabling DLR on an Org VDC Edge Gateway are by default system administrator only operations. It is possible to grant these rights to a tenant with the granular RBAC introduced in vCloud Director 8.20.
  • The DLR feature is included in the base NSX license in the VMware Cloud Provider Program.

Edit 02/10/2017: Engineering (Abhinav Mishra) provided a way to change the P2P subnet between the Edge and the DLR. Set the following property value with the cell management tool (CMT):

$VCLOUD_HOME/bin/cell-management-tool manage-config -n gateway.dlr.default.subnet.cidr -v <subnet CIDR>

Example: $VCLOUD_HOME/bin/cell-management-tool manage-config -n gateway.dlr.default.subnet.cidr -v 169.254.255.248/30

No need for cell reboot.

Edit 03/10/2017: Existing Org VDC networks can be migrated between traditional, DLR and sub-interface based connections in any direction in a non-disruptive way, with running VMs attached.

 


vRealize Operations Management Pack for NSX-V and Log Insight Integration

Quick post about an issue I discovered in my lab during an upgrade to NSX 6.3.3. This particular NSX version has a silent new feature that verifies whether the syslog configuration on Edges is correct. If the syslog entry is incorrect (it is not an IP address or an FQDN with at least one dot character, or it does not have the TCP/UDP protocol specified), NSX will not let you save it. This also means that older Edges (version 6.3.2 or older) with an incorrect syslog setting will fail to be upgraded, as the incorrect config will not be accepted.
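For reference, a compliant syslog configuration (IP address plus protocol) can also be pushed to an Edge directly via the NSX API. Treat this as a sketch only: edge-12, the IP address and the XML element names are as I recall them from the NSX-v API guide, so verify them against your NSX version.

# Set a valid syslog configuration (IP address and protocol) on an Edge
curl -k -u admin:<password> -X PUT -H 'Content-Type: application/xml' \
  -d '<syslog><enabled>true</enabled><protocol>udp</protocol><serverAddresses><ipAddress>192.168.110.10</ipAddress></serverAddresses></syslog>' \
  https://nsxmgr.corp.local/api/4.0/edges/edge-12/syslog/config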

So how does this relate to the title of the article? If you have vROps in your environment with the NSX-V management pack and you have enabled the Log Insight integration, the Management Pack will configure syslog on all NSX components. Unfortunately, in my case it configures them incorrectly, with only a hostname and no protocol. This reconfiguration happens roughly every hour. It can be especially annoying in a vCloud Director environment, where all the Edges are initially deployed with the syslog setting specified by VCD, but are then changed within an hour by vROps to something different.

Anyway, the remediation is simple: disable the Log Insight integration of the vROps NSX Management Pack as shown in the picture below.

Org VDC Edge Gateway CPU/RAM Reservations

vCloud Director 8.20 allows deployment of Org VDC Edge Gateways in four different form factors, from Compact to X-Large, where each provides a different level of performance and consumes a different amount of resources.

As these Edge Gateways are deployed by NSX Manager, which allows setting custom reservations for CPU and RAM via the API call PUT https://<NSXManager>/api/4.0/edgePublish/tuningConfiguration, it is also possible to set custom reservations in vCloud Director.
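A minimal sketch of that NSX API workflow, assuming the same URI also supports GET for reading the current configuration (credentials and file name are placeholders; the exact XML elements depend on your NSX version):

# Retrieve the current Edge tuning configuration (contains the reservation related settings)
curl -k -u admin:<password> https://<NSXManager>/api/4.0/edgePublish/tuningConfiguration > tuning.xml

# Edit tuning.xml as needed and push it back
curl -k -u admin:<password> -X PUT -H 'Content-Type: application/xml' -d @tuning.xml https://<NSXManager>/api/4.0/edgePublish/tuningConfiguration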

Why would you change the default reservations? Reservations at the VM (Edge) level set resources aside for that VM, which means no other VM can utilize them even when they are unused. They basically guarantee a certain level of performance that the VM (Edge) will always deliver. In service provider environments oversubscription provides ROI benefits, and if the service provider can guarantee enough resources at the cluster level, then the VM level reservations can be set lower, if at all.

This can be accomplished by tuning the networking.gatewayMemoryReservationMultiplier and networking.gatewayCpuReservationMultiplier settings via the cell-management-tool on a vCloud Director cell. By default the CPU multiplier is set to 64 MHz per vCPU and the memory multiplier to 0.5.

By default Edge Gateways will be deployed with the following reservation settings:

Org VDC Edge GW Default Resource Reservations

The following command will change the memory multiplier to 10%:

/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n networking.gatewayMemoryReservationMultiplier -v 0.1
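The CPU multiplier follows the same pattern; for example, to lower the CPU reservation to 32 MHz per vCPU (an arbitrary illustrative value):

/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n networking.gatewayCpuReservationMultiplier -v 32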

Note: The new reservation settings apply only to newly deployed Org VDC Edge Gateways. Redeploying existing Edges will not change their reservation settings. You must either use the NSX API to do so, or change the Org VDC Edge Gateway form factor (e.g. change Large to Compact and then back to Large), which is not so elegant as it will basically redeploy the Edge twice.

Also note that NSX 6.2 and NSX 6.3 use different sizing for the Quad Large Edge. vCloud Director 8.20 by default assumes the NSX 6.3 size, which is 2 GB RAM (as opposed to the NSX 6.2 value of 1 GB RAM). It is possible to change the default used for the reservation calculation by setting networking.full4GatewayMemoryMb to the value ‘1024’.
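Using the same cell-management-tool pattern as above, that would look like this:

/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n networking.full4GatewayMemoryMb -v 1024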

Setup Site-to-Site VPN between Azure and vCloud Director

My previous blog post was about setting up IPSec VPN tunnel between AWS VPC and vCloud Director Org VDC. This time I will describe how to achieve the same with Microsoft Azure.

vCloud Director is not on Azure’s list of supported IPSec VPN endpoints; however, it is possible to set up such a VPN, although it is not straightforward.

I will describe the setup of both the Azure and VCD endpoints only briefly, as it is very similar to the one I described in my previous article.

Azure Configuration

  • Resource Group (logical container object) – in my example RG UK
  • Virtual network (a large address space, similar to an AWS VPC) – 172.30.0.0/16
  • Subnets – at least one for VMs (172.30.0.0/24) and one for Gateway (172.30.255.0/29)
  • Virtual Network Gateway – Azure VPN endpoint with a public IP address, associated with the Gateway subnet above. Gateway type is VPN, VPN type is Policy-based (this is because the Route-based type uses IKEv2, which is not supported by the NSX platform used by vCloud Director).
  • Local Network Gateway – vCloud VPN endpoint definition with its public IP address and subnets that should be reachable behind the vCloud VPN endpoint (81.x.x.x, 192.168.100.0/24)
  • Connection – definition of the tunnel:
    • Connection type: Site-to-site (IPSec)
    • Virtual network gateway and local network gateway are straightforward (those created previously)
    • Connection name: whatever
    • Shared Key (PSK): create your own 32+ character key using upper and lower case characters and numbers
  • Test VM connected to the VM subnet (IP 172.30.0.4)

azure-resources
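For those who prefer the command line, the vCloud-facing part of the Azure setup can also be scripted with the Azure CLI. Treat this as a sketch only: the names (RG-UK, Azure-VPN-GW, vcloud-endpoint) and the 81.x.x.x address are placeholders, and the parameters should be checked against your az CLI version.

# Define the vCloud Director side (Local Network Gateway): its public IP and the subnets behind it
az network local-gateway create --resource-group RG-UK --name vcloud-endpoint \
  --gateway-ip-address 81.x.x.x --local-address-prefixes 192.168.100.0/24

# Create the site-to-site connection between the Virtual Network Gateway and the Local Network Gateway
az network vpn-connection create --resource-group RG-UK --name azure-to-vcloud \
  --vnet-gateway1 Azure-VPN-GW --local-gateway2 vcloud-endpoint --shared-key '<32+ character PSK>'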

vCloud Configuration

As explained above, we created a Policy-based VPN endpoint in Azure. A Policy-based VPN uses IKE version 1, Diffie-Hellman Group 2 and no Perfect Forward Secrecy.

However, the selection of DH group and PFS is not available to the tenant in vCloud Director on the legacy Org VDC Edge Gateway. Therefore the following workaround is proposed:

The tenant configures VPN on the Org VDC Edge Gateway with the following settings:

  • Name: Azure
  • Enable this VPN configuration
  • Establish VPN to: a remote network
  • Local Networks: 192.168.100.0/24 (Org VDC network(s))
  • Peer Networks: 172.30.0.0/24
  • Local Endpoint: Internet (interface facing internet)
  • Local ID: 10.0.2.121 (Org VDC Edge GW internet interface)
  • Peer ID: 51.x.x.x (public IP of the Azure Virtual network gateway)
  • Peer IP: 51.x.x.x (same as previous)
  • Encryption protocol: AES256
  • Shared Key: the same as in Azure Connection definition

Now we need to ask the service provider to disable PFS and change the DH Group to DH2 directly in the Edge VPN configuration in NSX (a possible API approach is sketched below the screenshot).

nsx-vpn
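If the provider prefers to script it, the same change can be done via the NSX API. A minimal sketch: edge-34 stands in for the NSX id of the tenant’s Org VDC Edge Gateway, and the exact XML elements for PFS and the DH group should be taken from the NSX API guide for your version.

# Read the current IPsec VPN configuration of the Edge and save it
curl -k -u admin:<password> https://<NSXManager>/api/4.0/edges/edge-34/ipsec/config > ipsec.xml

# Edit ipsec.xml for the Azure site (disable PFS, set the DH group to DH2) and push it back
curl -k -u admin:<password> -X PUT -H 'Content-Type: application/xml' -d @ipsec.xml https://<NSXManager>/api/4.0/edges/edge-34/ipsec/config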

Note that this workaround is not necessary on an Org VDC Edge Gateway that has been enabled with Advanced Networking services. This feature is at the moment available only in vCloud Air, however it will soon be available to all vCloud Air Network service providers.

If all firewall rules are properly set up we should be able to ping between Azure and vCloud VMs.

ping

Setup Site-to-Site VPN between AWS and vCloud Director

In today’s multi-cloud world, customers are asking how to set up connectivity between clouds. In this article I am going to demonstrate how to set up an IPsec VPN tunnel between an AWS VPC and a vCloud Director Org VDC.

IPSec is a standard protocol suite which works at OSI Layer 3 and allows encrypting IP packet communication. It is supported by many software, hardware and cloud vendor implementations; however, it is also quite complex to set up due to the large set of settings which both tunnel endpoints must agree on. Additionally, as it does not rely on the TCP L4 protocol, NAT traversal can be a challenge.

In my example I am using my home lab vCloud Director instance running behind NATed internet connection. So what could go wrong 🙂

The diagram below shows the setup.

 

The AWS Virtual Private Cloud on the left is created with a large subnet 172.31.0.0/16, a few instances, and Internet and VPN gateways.

On the right is a vCloud Director Org VDC with a network 192.168.100.0/24 behind an Org VDC Edge Gateway, which is connected to the Internet via my home ADSL router.

    1. We start by taking care of IPSec NAT traversal over the ADSL router. As I have the dd-wrt OS on the router, I am showing how I enabled port forwarding of UDP ports 500 and 4500 to the Edge GW IP 10.0.2.121 and added DNAT for protocols 50 (ESP) and 51 (AH) to the router startup script.
      udp-port-forwarding
      iptables -t nat -A PREROUTING -p 50 -j DNAT --to 10.0.2.121
      iptables -t nat -A PREROUTING -p 51 -j DNAT --to 10.0.2.121
    2. Now we can proceed with the AWS VPN configuration. In the AWS console, we go to VPC, VPN Connections – Customer Gateways and create a Customer Gateway, which is the definition of the vCloud Director Org VDC Edge Gateway endpoint. We give it a name, set it to static routing and provide its public IP address (in my case the public address of the ADSL router).
      customer-gateway
    3. Next we define the other end of the tunnel, the Virtual Private Gateway, in the menu below. We give it a name and, right after it is created, associate it with the VPC by right clicking on it.
      virtual-private-gateway
    4. Now we can create the VPN Connection in the next menu below (VPN Connections). We give it a descriptive name and associate the Virtual Private Gateway from step #3 with the Customer Gateway from step #2. We select static routing and provide the subnet at the other end of the tunnel, which in our case is 192.168.100.0/24. This step might take some time to finish.
    5. When the VPN Connection is created we need to download its configuration. AWS provides the configuration in various formats customized for the appliance on the other side of the tunnel. The Generic format will do for our purposes. Needless to say, AWS does not allow custom setting of any of the given parameters: it is take it or leave it.
      download-configuration
    6. Before leaving the AWS console we need to make sure that the subnet on the other side of the tunnel is propagated to the VPC routing table. This can be done in the Route Table menu: select the existing Route Table, in the Route Propagation tab find the Virtual Private Gateway from step #3 and check the Propagate check box.
      route-table
    7. To configure the other side of the tunnel (the Org VDC Edge Gateway) we need to collect the following information from the configuration file obtained in step #5.
      Virtual Private Gateway IP: 52.x.y.z
      Encryption Algorithm: AES-128
      Perfect Forward Secrecy: Diffie-Hellman Group 2
      Pre-Shared Key (PSK): 32 random characters
      MTU: 1436.
      Note: As was said before, none of these parameters can be changed on the AWS side, so the router on the other side must support all of them. And here we hit a little issue. The AWS pre-shared key is generated with number and letter (upper and lower case) characters and a special character, like a dot, underscore, etc. Unfortunately vShield Edge does not support a PSK with special characters. NSX Edge does, but the legacy vCloud Director UI/API will not allow us to create an IPsec VPN configuration with a PSK containing a special character. There are various ways to solve it: one is not to use the native AWS VPN Gateway and instead use the software VPN option, another is to create/edit the VPN configuration directly in NSX Manager (only the Service Provider can do this), and lastly to convert the Edge Gateway to an Advanced Gateway and take advantage of the new networking UI and API that does not have this limitation (this functionality is currently available only in vCloud Air, but will soon be available to all vCloud Air Network providers).
    8. In the vCloud Director UI go to Administration, select your Virtual Datacenter, open the Edge Gateways tab and right click on the correct Edge GW to select its Edge Gateway Services.
      edge-gw-services
    9. In the VPN tab, enable VPN by clicking the checkbox. In my NATed example I also had to configure the public IP for the Edge GW (which is the address of the ADSL router).
      enable-vpn
    10. Finally we can create the VPN tunnel by clicking the Add button and selecting the Establish VPN to a remote network pulldown option. Select the local network(s) (192.168.100.0/24), in peer networks enter the AWS VPC subnet (172.31.0.0/24), select the internet interface of the Edge in the Local Endpoint, and enter its IP address (10.0.2.121). For Peer ID and Peer IP use the public address of the Virtual Private Gateway from step #7. Change the encryption algorithm to AES and paste the Shared Key (see the note in #7). Finally modify the MTU size (1436).

If everything was set up correctly, then back in the AWS console, under VPN Connections, Tunnel details, we should see the tunnel status change to UP.

AWS offers two tunnel endpoints for redundancy, however in our case we are using only Tunnel 1.
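The same information is available from the command line via the AWS CLI; a quick sketch, assuming the CLI is already configured for the right account and region:

# Show the state of both VPN tunnel endpoints (only Tunnel 1 is used in this setup)
aws ec2 describe-vpn-connections \
  --query 'VpnConnections[].VgwTelemetry[].[OutsideIpAddress,Status,StatusMessage]' --output table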

tunnel-status-in-aws

If the firewall in the Org VDC and the Security Groups in AWS are properly set up, we should be able to prove tunnel communication with pings from an AWS instance to the Org VDC VM.

ping-test