In my previous article vCloud Director with NSX: Edge Cluster I described various design options for the NSX Edge Cluster in a vCloud Director environment. In this article I would like to discuss an additional option which extends Design Option III – Dedicated Edge Cluster. Below is the picture showing the scenario from the previous post.
There is one Provider-deployed Edge in the Edge Cluster for each Transit vCloud Director External network to which Org VDC Edge Gateways are connected. This option works quite well for use cases where the Provider Edge is dedicated to a single tenant – e.g. it provides VPN services or L2 bridging. (Note that in the L2 bridging use case the Org VDC Edge Gateway is not deployed and Org VDC networks connect directly to the tenant-dedicated external network.)
However, when we want to provide access to a shared service (for example the internet) and deploy multiple Org VDC Edge Gateways of different tenants connected to the same external network, they all have to go through a single Provider Edge, which can become a bottleneck.
As of NSX version 6.1, however, Edge Gateways can be deployed in ECMP (Equal Cost Multi-Path) configuration, which aggregates the bandwidth of up to eight Edges (8 × 10 Gb = 80 Gb throughput). High availability of ECMP Edges is then achieved with a dynamic routing protocol (BGP or OSPF) using aggressive timers for short failover times (3 seconds), which quickly remove a failed path from the routing tables.
The problem is that (as of vCloud Director 5.6) Organization VDC Edges are deployed in the legacy (vShield/vCNS) mode and support neither ECMP routing nor dynamic routing protocols. The design I propose works around this limitation by deploying a Distributed Logical Router between the Provider and Organization VDC Edges.
The picture above shows two Provider ECMP Edges (this can scale up to eight) with two physical VLAN connections each to the upstream physical routers and one internal interface to the Transit Edge logical switch. A Distributed Logical Router (DLR) then connects the Transit Edge logical switch with the Transit vCloud Director External Network to which all tenant Org VDC Edge Gateways are connected. The DLR has ECMP routing enabled as well as OSPF or BGP dynamic routing peering with the Provider Edges. The DLR provides two (or more) equal paths to the upstream Provider Edges and chooses one based on a hashing algorithm over the source and destination IP of the routed packet.
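The path selection can be sketched roughly like this (hypothetical Python, not the actual vmkernel implementation; the uplink addresses are made up): hash the flow's source/destination IP pair and take it modulo the number of equal-cost next hops, so every packet of a given flow lands on the same Provider Edge.

```python
import ipaddress

# Internal interface IPs of the ECMP Provider Edges on the Transit Edge
# logical switch (addresses are made up for illustration)
next_hops = ["192.168.255.1", "192.168.255.2"]

def pick_path(src_ip: str, dst_ip: str) -> str:
    """Hash the flow's src/dst IPs onto one of the equal-cost next hops.
    A given flow always hashes to the same Edge, keeping packets in order."""
    key = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return next_hops[key % len(next_hops)]

# Different flows spread across the Provider Edges:
print(pick_path("10.10.10.5", "8.8.8.8"))
print(pick_path("10.10.10.6", "8.8.8.8"))
```

Because the hash is per flow rather than per packet, the aggregate bandwidth scales with the number of Edges while individual TCP sessions are never reordered.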
The two Org VDC Edge Gateways shown (which can belong to two different tenants) can then take advantage of all the bandwidth provided by the Edge Cluster (indicated by the orange arrows).
The picture also depicts the DLR Control VM. This is the protocol endpoint which peers with the Provider Edges and learns and announces routes. These are then distributed to the ESXi host vmkernel routing process by the NSX Controller Cluster (not shown in the picture). A failure of the DLR Control VM impacts the routing information learned via the OSPF/BGP protocol even though the DLR is highly available in active-standby configuration, because of the aggressive protocol timers (DLR Control VM failover takes more than 3 seconds). Therefore we will create a static route on all ECMP Provider Edges for the Transit vCloud Director External network subnet. That is enough for north-south routing, as Org VDC subnets are always NATed by the tenant Org VDC Edge Gateway. South-north routing is static as well, because the Org VDC Edge Gateways are configured with the default gateway defined in the External Network properties.
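As a sketch of what pushing such a static route to a Provider Edge could look like, here is a minimal Python example against the NSX-v REST API. The endpoint path and XML shape are written from memory and should be verified against the NSX API guide for your version; the NSX Manager address, edge ID, subnet and next hop are all made up.

```python
from urllib import request

def static_route_xml(network: str, next_hop: str) -> str:
    """Build the XML body for one static route (element names are an
    assumption - check the NSX-v API guide for your version)."""
    return (
        "<staticRouting><staticRoutes><route>"
        f"<network>{network}</network>"
        f"<nextHop>{next_hop}</nextHop>"
        "</route></staticRoutes></staticRouting>"
    )

def push_static_route(nsx_mgr: str, edge_id: str, body: str) -> None:
    """PUT the route config to a Provider Edge. The endpoint path is an
    assumption based on the NSX-v 6.x API; add auth and SSL handling
    before real use."""
    req = request.Request(
        f"https://{nsx_mgr}/api/4.0/edges/{edge_id}/routing/config/static",
        data=body.encode(),
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )
    request.urlopen(req)

# Static route for the Transit VCD External network subnet, next hop being
# the DLR uplink interface on the Transit Edge logical switch:
print(static_route_xml("192.168.100.0/24", "192.168.255.10"))
```

The same route would be configured on every ECMP Provider Edge, so that north-south traffic keeps flowing even while the DLR Control VM is failing over.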
The other consideration is the placement of the DLR Control VM. If it fails together with one of the ECMP Provider Edges, the ESXi host vmkernel routes are not updated until the DLR Control VM functionality fails over to the passive instance, and meanwhile the route to the dead Provider Edge black-holes traffic. If we have enough hosts in the Edge Cluster, we should deploy the DLR Control VMs with anti-affinity to all ECMP Edges. Most likely we will not have enough hosts, so we would deploy the DLR Control VMs to one of the compute clusters. The VMs are very small (512 MB RAM, 1 vCPU), so the impact on cluster capacity is negligible.
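The placement logic amounts to a simple check, sketched below in Python purely for illustration (in practice this would be a DRS anti-affinity rule; the host and Edge names are made up): a DLR Control VM should only land on a host that runs none of the ECMP Provider Edges, so one host failure can never take out an Edge and the active Control VM at the same time.

```python
# Hosts in the Edge Cluster and the ECMP Provider Edge VMs they run
# (names are made up for illustration)
edge_placement = {
    "esx-01": ["provider-edge-1"],
    "esx-02": ["provider-edge-2"],
    "esx-03": [],
    "esx-04": [],
}

def hosts_for_control_vms(placement: dict) -> list:
    """Hosts eligible for DLR Control VMs: those running no ECMP Edge."""
    return [host for host, edges in placement.items() if not edges]

candidates = hosts_for_control_vms(edge_placement)
print(candidates)
if len(candidates) < 2:  # need room for both active and standby Control VM
    print("Not enough hosts - place the Control VMs in a compute cluster")
```

When the Edge Cluster cannot satisfy this (which is the common case), the fallback described above applies: put the Control VMs in a compute cluster, where their tiny footprint does not matter.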
17 thoughts on “vCloud Director with NSX: Edge Cluster (Part 2)”
Good summary Tomas. Well thought out.
Great stuff there!! But which IP addressing are you using for the subnets on the transit edge network (vxlan5001) and the transit / vcd external network (vxlan5002)? Are you still providing public IPs to customers without any modification on the Provider Edges? Or maybe you're NATing on every level.. Thanks for your help!
You would use public IPs on the vCloud External networks (VXLAN 5002) so the tenants can use them for their applications and also set up IPsec VPN if they want to.
Now the transit Edge network and the ECMP Edges can use private IPs (to save public address space) as long as the green physical routers have public IPs. So the traceroute from an internet client would look like this: ….. > public physical green router IP > private Provider ECMP Edge IP > private DLR IP > public tenant Edge IP.
Could you create the same type of architecture but use the Universal DLR and Logical Switches within NSX 6.2?
In this setup I assume the default gateway of whatever public IP block you choose would be placed on the Green Routers? This would let OSPF or BGP define the route between the vCloud External Network and the gateway, correct?
Yes you can use universal DLR and LS for both transit networks. The VCD external networks would map to respective port groups across multiple VCs of LS 5002. In case of public cloud you would use public IPs not only on green routers but also on VCD external network(s) so the tenant Edges would be visible from the internet.
So a block of public IPs would be assigned to the vCloud External network (vxlan5002), with the default gateway of the LS anchored on the DLR and each Org VDC Edge pulling the next available IP. The uplinks between the DLR and the ECMP Edges would be private IP space (vxlan5001), and a new set of public IPs on vlan101 and vlan102 connecting the TOR switches to the ECMP array, unless you were doing some form of NATing between the TOR switches and the ECMP edge array.
Does that sound about right?
In your example how are the Provider Edges deployed in ECMP configuration? Via NSX or from VCD?
If it is done from NSX, technically we need not bother about the Edges and can just assign VNI 5002 to VCD as an external network and move forward?
If it is done from VCD, how do you make sure the deployed Edge is placed correctly in the Edge cluster, given we present only the compute cluster to VCD?
This is an awesome post by the way; my telco cloud deployment scenario looks exactly like this and I just need to iron out these finer details.
Provider Edges are deployed outside VCD. VCD at the moment cannot deploy NSX Edges (only legacy Edges that do not support dynamic routing).
I do not follow your remark about VNI 5002. It is a VCD external network. And you need Edges to route between VXLAN and VLAN.
So your Org Gateways are essentially your tenants' entry and exit points. Doesn't this design give up some of the advantage of the DLR by forcing traffic through an Edge?
Correct, vCloud Director cannot manage and expose DLR to tenants.
Is this still applicable since vCD 8.20 is out, which exposes advanced networking capabilities? Can we make use of the DLR for tenants in vCD 8.20? Any suggestion/guidance would be helpful.
DLR is not exposed to tenants in 8.20.
This sure looks like a well thought out design. But what I'm wondering is which NSX edition is the minimum requirement. I don't see any microsegmentation or hardware VTEP features being used in this design, so maybe even an NSX-SP Base would be sufficient. Still waiting on details about the SP editions in the vCAN program though. Should hear something in Q1 2017.
It depends on which design option you use. Both the DLR and dynamic routing protocols require the NSX Advanced edition.
Tom, great article and you have a great blog here.
I have a design question and it’s kind of an “out of the box”/thinking type question.
In the Edge Cluster, where the WAN (ISP) connections come into a service provider's physical data center: could the physical routers be removed and the ISP connections terminated directly on the Provider Edges (orange in the diagram)? Then create a BGP peering with the ISP from those SP Edges directly, thus avoiding the hardware purchase, support and management costs of physical routers. If this is not a good idea, maybe you can fill us in. Are there some limitations of the Service Provider Edges in NSX where physical routers are still a preferred design?
I have my vApp created with an Org VDC network. All VMs of the vApp are connected to the Org VDC network (192.168.100.x).
An Edge is deployed which is connected to the External Network (10.x.x.x). The Edge interfaces have the 192.168.100.1 IP and the external IP list (10.x.x.x) configured.
I have fenced the vApp Org VDC network and created SNAT rules to translate the Org VDC IPs to an External Network IP. But I am still not able to connect to the Internet from my vApp.