Org VDC Edge Gateway CPU/RAM Reservations

vCloud Director 8.20 allows deployment of Org VDC Edge Gateways in four different form factors, from Compact to X-Large, where each provides a different level of performance and consumes a different amount of resources.

These Edge Gateways are deployed by NSX Manager, which allows setting custom reservations for CPU and RAM via the API call PUT https://<NSXManager>/api/4.0/edgePublish/tuningConfiguration. It is therefore also possible to set custom reservations in vCloud Director.
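For reference, a minimal sketch of working with that NSX endpoint via curl follows. The credentials are placeholders and the exact XML elements returned depend on your NSX version, so inspect the payload from the GET before pushing anything back.

# Retrieve the current Edge tuning configuration and inspect the XML
curl -k -u admin:password -X GET https://<NSXManager>/api/4.0/edgePublish/tuningConfiguration

# Push a modified configuration back (tuning.xml is the edited XML from the GET above)
curl -k -u admin:password -X PUT -H "Content-Type: application/xml" -d @tuning.xml https://<NSXManager>/api/4.0/edgePublish/tuningConfiguration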

Why would you change the default reservations? Reservations at the VM (Edge) level reserve the resources for that VM only, which means no other VM can utilize them even when they are unused. They basically guarantee a certain level of performance that the VM (Edge) will always deliver. In service provider environments oversubscription provides ROI benefits, and if the service provider can guarantee enough resources at the cluster scale, then the VM level reservations can be set lower, if at all.

This can be accomplished by tuning the networking.gatewayMemoryReservationMultiplier and networking.gatewayCpuReservationMultiplier settings via cell-management-tool on a vCloud Director cell. By default the CPU multiplier is set to 64 MHz per vCPU and the memory multiplier to 0.5.

By default Edge Gateways will be deployed with the following reservation settings:

Org VDC Edge GW Default Resource Reservations

The following command will change the memory multiplier to 10% (0.1):

/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n networking.gatewayMemoryReservationMultiplier -v 0.1
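The CPU reservation can be tuned the same way. A hedged companion example, assuming the same cell-management-tool syntax, that would lower the reservation from the default 64 MHz to 32 MHz per vCPU:

/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n networking.gatewayCpuReservationMultiplier -v 32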

Note: The new reservation settings are applicable only to newly deployed Org VDC Edge Gateways. Redeploying existing Edges will not change their reservation settings. You must either use the NSX API to do so, or modify the Org VDC Edge Gateway form factor (e.g. change Large to Compact and then back to Large), which is not so elegant as it will basically redeploy the Edge twice.

Also note that NSX 6.2 and NSX 6.3 have different sizing of the Quad Large Edge. vCloud Director 8.20 is by default set for the NSX 6.3 size, which is 2 GB RAM (as opposed to the NSX 6.2 value of 1 GB RAM). It is possible to change the default used for the reservation calculation by setting networking.full4GatewayMemoryMb to the value ‘1024’.
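A hedged example of that change, assuming the same cell-management-tool syntax as above:

/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n networking.full4GatewayMemoryMb -v 1024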

Setup Site-to-Site VPN between Azure and vCloud Director

My previous blog post was about setting up IPSec VPN tunnel between AWS VPC and vCloud Director Org VDC. This time I will describe how to achieve the same with Microsoft Azure.

vCloud Director is not on Azure's list of supported IPSec VPN endpoints; however, it is possible to set up such a VPN, although it is not straightforward.

I will describe the setup of both the Azure and VCD endpoints only briefly as it is very similar to the one I described in my previous article.

Azure Configuration

  • Resource Group (logical container object) – in my example RG UK
  • Virtual network (large address space, similar to the AWS VPC in the previous article) – 172.30.0.0/16
  • Subnets – at least one for VMs (172.30.0.0/24) and one for Gateway (172.30.255.0/29)
  • Virtual Network Gateway – Azure VPN endpoint with a public IP address, associated with the Gateway subnet above. Gateway type is VPN, VPN type is Policy-based (this is because the Route-based type uses IKEv2, which is not supported by the NSX platform used by vCloud Director).
  • Local Network Gateway – vCloud VPN endpoint definition with its public IP address and subnets that should be reachable behind the vCloud VPN endpoint (81.x.x.x, 192.168.100.0/24)
  • Connection – definition of the tunnel:
    • Connection type: Site-to-site (IPSec)
    • Virtual network gateway and local network gateway are straightforward (those created previously)
    • Connection name: whatever
    • Shared Key (PSK): create your own 32+ character key using upper and lower case characters and numbers
  • Test VM connected to the VM subnet (IP 172.30.0.4)

azure-resources
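If you prefer to script the Azure side, a rough Azure CLI sketch of the same resources follows. The names, region and the Basic SKU (required for a policy-based gateway) are my assumptions, and exact flag names vary a little between Azure CLI versions, so treat it as a starting point. Note that the gateway creation typically takes a long time to complete.

# Resource group and virtual network with a VM subnet and a GatewaySubnet
az group create --name RG-UK --location uksouth
az network vnet create --resource-group RG-UK --name VNet-UK --address-prefix 172.30.0.0/16 --subnet-name VMs --subnet-prefix 172.30.0.0/24
az network vnet subnet create --resource-group RG-UK --vnet-name VNet-UK --name GatewaySubnet --address-prefix 172.30.255.0/29

# Policy-based VPN gateway (IKEv1) with a public IP address
az network public-ip create --resource-group RG-UK --name VPNGW-IP
az network vnet-gateway create --resource-group RG-UK --name VPNGW --vnet VNet-UK --public-ip-address VPNGW-IP --gateway-type Vpn --vpn-type PolicyBased --sku Basic

# vCloud Director endpoint definition and the site-to-site connection
az network local-gateway create --resource-group RG-UK --name VCD-GW --gateway-ip-address 81.x.x.x --local-address-prefixes 192.168.100.0/24
az network vpn-connection create --resource-group RG-UK --name Azure-to-VCD --vnet-gateway1 VPNGW --local-gateway2 VCD-GW --shared-key <32+CharacterKey>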

vCloud Configuration

As explained above, we created a Policy-based VPN endpoint in Azure. Policy-based VPN uses IKE version 1, Diffie-Hellman Group 2 and no Perfect Forward Secrecy.

However, selection of the DH group and PFS is not available to the tenant in vCloud Director on the legacy Org VDC Edge Gateway. Therefore the following workaround is proposed:

The tenant configures VPN on his Org VDC Edge Gateway with the following settings:

  • Name: Azure
  • Enable this VPN configuration
  • Establish VPN to: a remote network
  • Local Networks: 192.168.100.0/24 (Org VDC network(s))
  • Peer Networks: 172.30.0.0/24
  • Local Endpoint: Internet (interface facing internet)
  • Local ID: 10.0.2.121 (Org VDC Edge GW internet interface)
  • Peer ID: 51.x.x.x (public IP of the Azure Virtual network gateway)
  • Peer IP: 51.x.x.x (same as previous)
  • Encryption protocol: AES256
  • Shared Key: the same as in Azure Connection definition

Now we need to ask the service provider to disable PFS and change the DH Group to DH2 directly in the Edge VPN configuration in NSX.
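For the provider, a hedged sketch of how this could be scripted against the NSX API follows; the IPsec config endpoint is real, but verify the exact XML element names (the PFS and DH group settings of the site) against the NSX API guide for your version.

# Fetch the Edge IPsec VPN configuration created by vCloud Director
curl -k -u admin:password -X GET https://<NSXManager>/api/4.0/edges/<edgeId>/ipsec/config > ipsec.xml

# Edit ipsec.xml for the Azure site (disable PFS, set the Diffie-Hellman group to dh2), then push it back
curl -k -u admin:password -X PUT -H "Content-Type: application/xml" -d @ipsec.xml https://<NSXManager>/api/4.0/edges/<edgeId>/ipsec/config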

nsx-vpn

Note that this workaround is not necessary on an Org VDC Edge Gateway that has been enabled with Advanced Networking services. This feature is at the moment available only in vCloud Air, however it will soon be available to all vCloud Air Network service providers.

If all firewall rules are properly set up, we should be able to ping between the Azure and vCloud VMs.

ping

Setup Site-to-Site VPN between AWS and vCloud Director

In today’s reality of a multi-cloud world, customers are asking how to set up connectivity between clouds. In this article I am going to demonstrate how to set up an IPsec VPN tunnel between an AWS VPC and a vCloud Director Org VDC.

IPSec is a standard protocol suite which works at OSI Layer 3 and allows encrypting IP packet communication. It is supported by many software, hardware and cloud vendor implementations; however, it is also quite complex to set up due to the large set of different settings which both tunnel endpoints must support. Additionally, as it does not rely on the TCP L4 protocol, NAT traversal can be a challenge.

In my example I am using my home lab vCloud Director instance running behind NATed internet connection. So what could go wrong 🙂

The diagram below shows the setup.

 

The AWS Virtual Private Cloud on the left is created with a large subnet 172.31.0.0/16, a few instances, and Internet and VPN gateways.

On the right is a vCloud Director Org VDC with a network 192.168.100.0/24 behind an Org VDC Edge Gateway, which is connected to the Internet via my home ADSL router.

    1. We start by taking care of IPSec NAT traversal over the ADSL router. As I have dd-wrt OS on the router, I am showing how I enabled port forwarding of UDP ports 500 and 4500 to the Edge GW IP 10.0.2.121 and added DNAT for protocols 50 (ESP) and 51 (AH) to the router startup script.
      udp-port-forwarding
      iptables -t nat -A PREROUTING -p 50 -j DNAT --to-destination 10.0.2.121
      iptables -t nat -A PREROUTING -p 51 -j DNAT --to-destination 10.0.2.121
    2. Now we can proceed with the AWS VPN configuration. In the AWS console, we go to VPC, VPN Connections – Customer Gateways and create a Customer Gateway – the definition of the vCloud Director Org VDC Edge Gateway endpoint. We give it a name, set it to static routing and provide its public IP address (in my case the public address of the ADSL router).
      customer-gateway
    3. Next we define the other end of the tunnel – the Virtual Private Gateway – in the menu below. We give it a name and, right after it is created, associate it with the VPC by right clicking on it.
      virtual-private-gateway
    4. Now we can create VPN Connection in the next menu below (VPN Connections). We give it a descriptive name and associate Virtual Private Gateway from step #3 with Customer Gateway from step #2. We select static routing and provide the subnet at the other end of the tunnel, which is in our case 192.168.100.0/24. This step might take some time to finish.
    5. When the VPN Connection is created we need to download its configuration. AWS will provide the configuration in various formats customized for the appliance on the other side of the tunnel. The Generic format will do for our purposes. Needless to say, AWS does not allow custom setting of any of the given parameters – it is take it or leave it.
      download-configuration
    6. Before leaving the AWS console we need to make sure that the subnet on the other side of the tunnel is propagated to the VPC routing table. This can be done in the Route Tables menu: select the existing Route Table, in the Route Propagation tab find the Virtual Private Gateway from step #3 and check the Propagate check box.
      route-table
    7. To configure the other side of the VPN endpoint – the Org VDC Edge Gateway we need to collect the following information from the configuration file obtained in the step #5.
      Virtual Private Gateway IP: 52.x.y.z
      Encryption Algorithm: AES-128
      Perfect Forward Secrecy: Diffie-Hellman Group 2
      Pre-Shared Key (PSK): 32 random characters
      MTU: 1436.
      Note: As was said before, none of these parameters can be changed on the AWS side, so the router on the other side must support all of them. And here we hit a little issue. The AWS pre-shared key is generated with number and letter (upper and lower case) characters and a special character – like dot, underscore, etc. Unfortunately vShield Edge does not support a PSK with a special character. NSX Edge does, but the legacy vCloud Director UI/API will not allow us to create an IPsec VPN configuration with a PSK containing a special character. There are various ways to solve it: one is not to use the native AWS VPN Gateway and instead use the software VPN option, another is to create/edit the VPN configuration directly in NSX Manager (only the Service Provider can do this), and lastly to convert the Edge Gateway to an Advanced Gateway and take advantage of the new networking UI and API that does not have this limitation (this functionality is currently available only in vCloud Air, but will soon be available to all vCloud Air Network providers).
    8. In the vCloud Director UI go to Administration, select your Virtual Datacenter, open the Edge Gateways tab and right click the correct Edge GW to select its Edge Gateway Services.
      edge-gw-services
    9. In the VPN tab enable VPN by clicking the checkbox. In my NATed example I also had to configure the public IP for the Edge GW (which is the address of the ADSL router).
      enable-vpn
    10. Finally we can create the VPN tunnel by clicking the Add button and selecting the Establish VPN to a remote network pulldown option. Select the local network(s) (192.168.100.0/24), in peer networks enter the AWS VPC subnet (172.31.0.0/16), select the internet interface of the Edge in Local Endpoint and enter its IP address (10.0.2.121). For Peer ID and Peer IP use the public address of the Virtual Private Gateway from step #7. Change the encryption algorithm to AES and paste the Shared Key (see the note in step #7). Finally modify the MTU size (1436).
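For reference, the AWS side of steps 2–6 can also be scripted. A hedged AWS CLI sketch follows; the resource IDs are placeholders and the BGP ASN is a dummy value that the API requires even for static routing.

# Customer gateway = the vCloud Director side (public IP of the ADSL router)
aws ec2 create-customer-gateway --type ipsec.1 --public-ip <router-public-ip> --bgp-asn 65000

# Virtual private gateway attached to the VPC
aws ec2 create-vpn-gateway --type ipsec.1
aws ec2 attach-vpn-gateway --vpn-gateway-id vgw-xxxxxxxx --vpc-id vpc-xxxxxxxx

# Static VPN connection and the route to the Org VDC subnet
aws ec2 create-vpn-connection --type ipsec.1 --customer-gateway-id cgw-xxxxxxxx --vpn-gateway-id vgw-xxxxxxxx --options '{"StaticRoutesOnly":true}'
aws ec2 create-vpn-connection-route --vpn-connection-id vpn-xxxxxxxx --destination-cidr-block 192.168.100.0/24

# Propagate routes learned from the virtual private gateway to the VPC route table
aws ec2 enable-vgw-route-propagation --route-table-id rtb-xxxxxxxx --gateway-id vgw-xxxxxxxx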

If everything was set correctly then back in the AWS console, under VPN Connections – Tunnel details, we should see the tunnel status change to UP.

AWS offers two tunnel endpoints for redundancy; however, in our case we are using only Tunnel 1.

tunnel-status-in-aws
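The same status can also be checked from the CLI; a small hedged example (the connection ID is a placeholder):

aws ec2 describe-vpn-connections --vpn-connection-ids vpn-xxxxxxxx --query 'VpnConnections[0].VgwTelemetry[*].{Tunnel:OutsideIpAddress,Status:Status}'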

If the firewall in the Org VDC and the Security Groups in AWS are properly set, we should be able to prove tunnel communication with pings from an AWS instance to the Org VDC VM.

ping-test

Edge Gateway Deployment Speed in vCloud Director 8.10

Edge Gateway

In vCloud Director 8.10 there is a massive improvement in the deployment (and configuration) speed of Edge Gateways. This is especially noticeable in use cases where a large number of routed vApps is provisioned in as short a time as possible – for example nightly builds for testing, or labs for training purposes. But it is also important for customer onboarding – the SLA for how quickly a customer can log in to a cloud VM after the swipe of the credit card.

Theory

How is the speed improvement achieved? It is actually not really a vCloud Director accomplishment. The deployment and configuration of Edge Gateways was always done by vShield Manager or NSX Manager. However, there is a big difference in how vShield Manager and NSX Manager communicate with the Edge Gateway to push its configuration (IP addresses, NAT, firewall and other network services).

As the Edge Gateway can be deployed to any network, which can be completely isolated from any external traffic, its configuration cannot be done over the network; an out-of-band communication channel must be used instead. vShield Manager always used the VIX API (Guest Operations API), which involves communication with vCenter Server, the hostd process on the ESXi host running the Edge Gateway VM, and finally VMware Tools running inside the Edge Gateway VM (see this older post for more detail).

NSX Manager uses a different mechanism. As long as the ESXi host is properly prepared for NSX, message bus communication is established between the NSX Manager and the vsfwd user space process on the ESXi host. The configuration is then pushed to the Edge Gateway VM via a VMCI channel.

Prerequisites

There are necessary prerequisites for using the faster message bus communication as opposed to the VIX API. If any of these is not fulfilled, the communication mechanism falls back to the VIX API.

  • The host running the Edge Gateway must be prepared for NSX. So if in vCloud Director you are using solely VLAN (or even VCDNI) backed network pools and you skipped the NSX preparation of the underlying clusters, message bus communication cannot be used as the host is missing the NSX VIBs and the vsfwd process (see the verification sketch after this list).
  • The Edge Gateway must be version 6.x. It cannot be the legacy Edge version 5.5 deployed by older vCloud Director releases (8.0, 5.6, etc.). vCloud Director 8.10 deploys Edge Gateway version 6.x; however, existing Edges deployed before the upgrade to 8.10 must be redeployed in vCloud Director or upgraded in NSX (read this whitepaper for a script that does it in one go).
  • Obviously NSX Manager must be used (as opposed to vShield Manager) – anyway, vCloud Networking and Security is no longer supported with vCloud Director 8.10.
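A hedged way to verify on an ESXi host that it is prepared for NSX and that the message bus agent is up (standard NSX troubleshooting commands; the exact VIB names differ between NSX versions):

# Check that the NSX host VIBs are installed
esxcli software vib list | grep -E 'esx-vsip|esx-vxlan'

# Check that the vsfwd (message bus) agent is running
/etc/init.d/vShield-Stateful-Firewall status

# Check the message bus connection from the host to NSX Manager (AMQP port 5671)
esxcli network ip connection list | grep 5671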

Performance Testing

I have done a quick proof of concept test to see the relative improvement between the older and newer deployment mechanisms.

I used 3 different combinations of the same environment (I was upgrading from one combination to the next):

  • vCloud Director 5.6.5 + vCloud Networking and Security 5.5.4
  • vCloud Director 8.0.1 + NSX 6.2.3 (uses legacy Edges)
  • vCloud Director 8.10 + NSX 6.2.3 (uses NSX Edges)

All 3 combinations used the same hardware and the same vSphere environment (5.5) with nested ESXi hosts, so the point is to look at the relative differences as opposed to absolute deployment times.

Using PowerCLI, I measured the sequential deployment speed of 10 vApps with one isolated network and of 10 vApps with one routed network, with multiple runs to calculate the average per vApp. The first scenario measures differences in the provisioning speed of VXLAN logical switches to see the impact of the controller-based control plane mode. The second adds the provisioning of an Edge Gateway on top of the logical switch. The vApps were otherwise empty (no VMs).

Note: If you want to do a similar test in your environment, I captured the two empty vApps with only the routed or isolated networks to a catalog with the vCloud API (PowerCLI), as it cannot be done from the vCloud UI.

Here are the average deployment times of each vApp.

vCloud Director 5.6.5 + vCloud Networking and Security 5.5.4

  • Isolated 5-5.5 seconds
  • Routed 2:17 min

vCloud Director 8.0.1 + NSX 6.2.3

  • Isolated approx. 6.8 seconds (Multicast), 7.5 seconds (Unicast)
  • Routed 2:20 min

vCloud Director 8.10 + NSX 6.2.3

  • Isolated 7.7 s (Multicast), 8.1 s (Unicast)
  • Routed 1:35 min

While the speed of logical switch provisioning goes down a little bit with NSX and with the Unicast control plane mode, Edge Gateway deployment gets a massive boost with NSX and VCD 8.10. While the OVF deployment of the NSX Edge takes a little bit longer (30 s versus 20 s), the configuration phase more than makes up for it (from well over a minute down to about 30 s).

Just for comparison, here are the tasks performed during the deployment of each routed vApp as reported by the vSphere Client Recent Tasks window.

vCloud Director 5.6.5 + vCloud Networking and Security
vCloud Director 8.10 + NSX 6.2.3

Layer 2 VPN to the Cloud

When VMware NSX 6.0 came out a little over a year ago, one of the great new features it had on top of its predecessor, VMware vCloud Networking and Security (vCNS), was the L2VPN service on the Edge Services Gateway, which allows stretching layer 2 network segments between distant sites in different management domains. NSX 6.1 further enhanced the functionality by introducing the Standalone Edge, which can be deployed on top of vSphere without an NSX license and acts as the L2VPN client.

Many vCloud Service Providers are now deploying their public clouds with vCloud Director and NSX instead of vCNS, so I am often asked how they could leverage NSX in order to provide an L2VPN service to their tenants.

As of today neither vCloud Director 5.6 nor 8.0 (in beta at the moment) can deploy NSX Edges and manage the L2VPN service. However, it is still possible for the SP to provide L2VPN as a managed service for its tenants.

Let me demonstrate how it would work on the following artificial example.

The customer has an application that resides on 3 different VLAN based networks (subnets A, B and C) routed by an existing physical router. He would like to extend subnets A and B into the cloud and deploy VMs there. The VMs in the cloud should access the internet through the provider connection (egress optimization) in order to connect to external SaaS services, but should still be able to reach the database running on subnet C, which is hosted on premises.

The diagram below shows the whole architecture (click for larger version):

L2VPN to the Cloud

On the left is the customer's on-premises datacenter with the physical router and three VLAN based networks. On the right is the public cloud with the NSX design I proposed in one of my previous blog articles. While the unimportant parts are grayed out, what is important is how the customer Org VDC and the NSX Edge Gateway are deployed.

  • The provider manually deploys a tenant-dedicated NSX Edge Services Gateway outside of vCloud Director and configures it based on the customer requirements. The provider creates two logical switches (VXLAN 5005 and 5006) which will be used for extending customer subnets A and B. The switches are trunked to the NSX Edge and the Edge interface IP addresses are set identical to the IP addresses of the physical router on premises (a.a.a.1 and b.b.b.1).
  • The two logical switches are configured in vCloud Director as External Networks with the proper subnets A and B and pool of unused IPs.
  • Two Org VDC networks are created inside tenant’s Org VDC as directly connected to the two External Networks.
  • The L2VPN server is configured on the NSX Edge (with encryption algorithm, secret, certificate and stretched interfaces). The Egress Optimization Gateway Address is also configured (both physical gateway IPs are entered – a.a.a.1 and b.b.b.1). This will filter ARP replies of the two gateways sharing the same IPs over the tunnel and allow the NSX Edge to act as the gateway to the internet (an API sketch follows this list).
  • The tenant installs the Standalone Edge, which is distributed as an OVF, inside his datacenter and sets it up: he must map VLANs to the tunnel IDs supplied by the provider and configure the Edge server public IP, port and encryption details.
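Since the L2VPN server part is handled by the provider outside of vCloud Director, it can also be configured via the NSX API. A hedged sketch follows; the l2vpn endpoint exists on NSX 6.x Edges, but verify the XML schema (server mode, listener IP/port, cipher, stretched sub-interfaces, egress optimization gateway addresses) against the API guide for your version.

# Fetch the current L2VPN configuration of the tenant-dedicated Edge
curl -k -u admin:password -X GET https://<NSXManager>/api/4.0/edges/<edgeId>/l2vpn/config > l2vpn.xml

# Edit l2vpn.xml (server settings, stretched interfaces, egress optimization gateways a.a.a.1 and b.b.b.1), then push it back
curl -k -u admin:password -X PUT -H "Content-Type: application/xml" -d @l2vpn.xml https://<NSXManager>/api/4.0/edges/<edgeId>/l2vpn/config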

Now what about subnet C? How can VMs deployed in the cloud get to it if the physical router is unreachable due to the enabled egress optimization? The following trick is used (see the sketch after the list):

  • Another subnet z.z.z.0/30 is used for P2P connection between the NSX Edge in the cloud and the physical router.
  • IP address z.z.z.1/30 is configured on the physical router on one of the stretched subnets (e.g. A).
  • The second IP z.z.z.2/30 is configured on the NSX Edge on the same subnet.
  • Finally, a static route is created on the NSX Edge pointing subnet C to the next hop address z.z.z.1.
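A hedged example of adding that static route through the NSX Edge routing API (again a provider task; c.c.c.0/24 stands in for subnet C, and the XML schema should be verified for your NSX version):

# Fetch the Edge static routing configuration
curl -k -u admin:password -X GET https://<NSXManager>/api/4.0/edges/<edgeId>/routing/config/static > static.xml

# Add a route entry for subnet C (network c.c.c.0/24, next hop z.z.z.1), then push it back
curl -k -u admin:password -X PUT -H "Content-Type: application/xml" -d @static.xml https://<NSXManager>/api/4.0/edges/<edgeId>/routing/config/static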

Some additional considerations:

  • In case the tenant has licensed NSX on premises, he can obviously use a ‘full’ NSX Edge Services Gateway instead. The advantages are that it is much easier to deploy and configure, and it can also stretch VXLAN based networks as opposed to only VLANs, which is all the Standalone Edge supports.
  • The Standalone Edge can be connected either to a Standard Switch or a Distributed Switch. When a Standard Switch is used, promiscuous mode and forged transmits must be enabled on the trunk port group, and VLAN ID 4095 (all) must be configured to pass multiple VLANs.
  • When a Distributed Switch is used, it is recommended to use a Sink Port instead of promiscuous mode. A Sink Port receives traffic with MAC addresses unknown to the vDS.
  • Sink Port creation is described here. It requires editing the vDS via the vCenter Managed Object Browser. While the Sink Port can also be created with the net-dvs --EnableSink command directly on the ESXi host running the Standalone Edge VM, it is not recommended as the host configuration can be overridden by vCenter Server.
  • RC4-MD4 encryption cipher should not be used as it is insecure and has been deprecated in NSX 6.2.