Provider Networking in VMware Cloud Director

This is going to be a bit longer than usual and more of a summary / design-options type of blog post, where I want to discuss provider networking in VMware Cloud Director (VCD). By provider networking I mean the part that must be set up by the service provider and that is then consumed by tenants through their Org VDC networking and Org VDC Edge Gateways.

With the introduction of NSX-T we also need to dive into the differences between NSX-V and NSX-T integration in VCD.

Note: The article is applicable to the VMware Cloud Director 10.2 release. Each VCD release adds new network-related functionality.

Provider Virtual Datacenters

Provider Virtual Datacenter (PVDC) is the main object that provides compute, networking and storage resources for tenant Organization Virtual Datacenters (Org VDCs). When a PVDC is created it is backed by vSphere clusters that must be prepared for either NSX-V or NSX-T. During the PVDC creation the service provider must also select which Network Pool is going to be used – VXLAN backed (NSX-V) or Geneve backed (NSX-T). A PVDC can thus be backed by either NSX-V or NSX-T – never both at the same time and never neither – and the backing cannot be changed after the fact.

Network Pool

Speaking of Network Pools – they are used by tenants to create on-demand routed or isolated networks. Network Pools are independent from PVDCs and can be shared across multiple PVDCs (of the same backing type). There is an option to automatically create a VXLAN network pool during PVDC creation, but I would recommend against using it as you lose the ability to manage the transport zone backing the pool on your own. A VLAN backed network pool can still be created, but can be used only in a PVDC backed by NSX-V (the same goes for the very legacy port group backed network pool, now available only via the API). Individual Org VDCs can optionally override the Network Pool assigned to their parent PVDC.

External Networks

Deploying virtual machines without the ability to connect to them over the network is not that useful. External networks are VCD objects that Org VDC Edge Gateways connect to in order to reach the outside world – the internet, dedicated direct connections or the provider's service area. An external network has one or more associated subnets and IP pools that VCD manages and uses to allocate external IP addresses to the connected Org VDC Edge Gateways.
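The IP pool allocation mechanics can be sketched roughly as follows (a minimal Python illustration of the concept, not VCD's actual implementation; the `ExternalNetwork` class and its methods are invented for this example):

```python
import ipaddress

class ExternalNetwork:
    """Toy model of a VCD external network: one or more IP pools from
    which external addresses are allocated to Org VDC Edge Gateways."""

    def __init__(self):
        self.pools = []         # list of (first, last) address ranges
        self.allocated = set()  # addresses already handed out

    def add_ip_pool(self, first, last):
        self.pools.append((ipaddress.ip_address(first),
                           ipaddress.ip_address(last)))

    def allocate(self):
        """Return the next free address from the pools, or None if exhausted."""
        for first, last in self.pools:
            for i in range(int(first), int(last) + 1):
                ip = ipaddress.ip_address(i)
                if ip not in self.allocated:
                    self.allocated.add(ip)
                    return str(ip)
        return None

net = ExternalNetwork()
net.add_ip_pool("192.0.2.10", "192.0.2.12")
print(net.allocate())  # 192.0.2.10
```

When the pool runs dry, the provider simply adds another subnet with its own pool to the same external network – which is exactly the scaling story discussed below for Tier-0 backed external networks.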

There is a major difference in how external networks are created for NSX-V backed PVDCs versus NSX-T backed ones.

Port Group Backed External Network

As the name suggests, these networks are backed by an existing vCenter port group (or multiple port groups) that must be created upfront and is usually VLAN backed (although it could be a VXLAN port group as well). These external networks are (currently) supported only in NSX-V backed PVDCs. An Org VDC Edge Gateway connected to this network is represented by an NSX-V Edge Service Gateway (ESG) with an uplink in this port group. The uplink interfaces are assigned IP address(es) from the allocated external IPs.

A directly connected Org VDC network attached to the external network can also be created (only by the provider); VMs connected to such a network are placed directly in the port group.

Tier-0 Router Backed External Network

These networks are backed by an existing NSX-T Tier-0 Gateway or Tier-0 VRF (note that once you import a Tier-0 VRF into VCD you can no longer import its parent Tier-0, and vice versa). The Tier-0/VRF must be created upfront by the provider with the correct uplinks and routing configuration.

Only Org VDC Edge Gateways from an NSX-T backed PVDC can be connected to such an external network, and they are backed by a Tier-1 Gateway. The Tier-1 – Tier-0/VRF transit network is auto-plumbed by NSX-T from the 100.64.0.0/16 subnet. The allocated external network IPs are not explicitly assigned to any Tier-1 interface. Instead, when a service (NAT, VPN, load balancer) on the Org VDC Edge Gateway starts using an assigned external address, the Tier-1 GW advertises it to the linked Tier-0 GW.
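The auto-plumbed transits can be illustrated by carving point-to-point /31 subnets out of 100.64.0.0/16, one per Tier-1 link (a simplified Python sketch; NSX-T manages these router-link subnets internally, and the /31 sizing is an assumption of this example):

```python
import ipaddress

# NSX-T auto-assigns Tier-0 <-> Tier-1 router-link transits from this
# carrier-grade NAT range; this sketch hands out consecutive /31s.
TRANSIT_POOL = ipaddress.ip_network("100.64.0.0/16")

def transit_subnets(pool):
    """Yield point-to-point /31 transit subnets in order."""
    yield from pool.subnets(new_prefix=31)

gen = transit_subnets(TRANSIT_POOL)
print(next(gen))  # 100.64.0.0/31 - first Tier-1 link
print(next(gen))  # 100.64.0.2/31 - second Tier-1 link
```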

There are two main design options for the Tier-0/VRF.

The recommended option is to configure BGP on the Tier-0/VRF uplinks with the upstream physical routers. The uplinks are just redundant point-to-point transits. IPs assigned from any external network subnet are automatically advertised (when used) via BGP upstream. When the provider runs out of public IPs, you just assign an additional subnet. This makes the design very flexible, scalable and relatively simple.

Tier-0/VRF with BGP

An alternative is a design similar to the NSX-V port group approach, where the Tier-0 uplinks are directly connected to the external subnet port group. This can be useful when transitioning from NSX-V to NSX-T, when there is a need to retain routability between NSX-V ESGs and NSX-T Tier-1 GWs on the same external network.

The picture below shows that the Tier-0/VRF has uplinks directly connected to the external network and a static route towards the internet. The Tier-0 will proxy ARP requests for the external IPs that are allocated to and used by the connected Tier-1 GWs.

Tier-0 with Proxy ARP

The disadvantage of this option is that you waste public IP addresses on the T0 uplinks and router interfaces for each subnet you assign.

Note: Proxy ARP is supported only if the Tier-0/VRF is in Active/Standby mode.
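The proxy ARP behavior boils down to a simple decision: answer ARP for an external IP only if that IP is allocated to a connected Tier-1 and actively used by a service. A minimal Python sketch of the decision logic (the subnet, function and data structures are illustrative, not an NSX-T API):

```python
import ipaddress

# Hypothetical external subnet the Tier-0 uplinks sit on.
EXTERNAL_SUBNET = ipaddress.ip_network("192.0.2.0/24")

def should_proxy_arp(target_ip, used_ips):
    """The Tier-0 answers ARP for an external IP only when that IP is
    allocated to a connected Tier-1 and actively used by a service
    (NAT, VPN, load balancer). Simplified decision logic."""
    ip = ipaddress.ip_address(target_ip)
    return ip in EXTERNAL_SUBNET and target_ip in used_ips

used = {"192.0.2.134"}  # e.g. an SNAT rule on a tenant Tier-1
print(should_proxy_arp("192.0.2.134", used))  # True
print(should_proxy_arp("192.0.2.200", used))  # False - allocated but unused
```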

Tenant Dedicated External Network

If the tenant requires a direct link via MPLS or a similar technology, this is accomplished by creating a tenant dedicated external network. With an NSX-V backed Org VDC this is represented by a dedicated VLAN backed port group; with an NSX-T backed Org VDC it would be a dedicated Tier-0/VRF. Both provide connectivity to the MPLS router. With NSX-V the ESG would run BGP; with NSX-T, BGP would have to be configured on the Tier-0. In VCD the NSX-T backed Org VDC Gateway can be explicitly enabled in the dedicated mode, which gives the tenant (and also the provider) the ability to configure Tier-0 BGP.

There are separate rights for BGP neighbor configuration and route advertisement, so the provider can keep the BGP neighbor configuration as a provider managed setting.

Note that only one Org VDC Edge GW can be connected in the explicit dedicated mode. If the tenant requires more Org VDC Edge GWs connected to the same (dedicated) Tier-0/VRF, the provider will not enable the dedicated mode and will instead manage BGP directly in NSX-T (as a managed service).

A commonly used scenario is when the provider directly connects an Org VDC network to such a dedicated external network without using an Org VDC Edge GW. This is, however, currently not possible in an NSX-T backed PVDC; instead, you have to import an Org VDC network backed by an NSX-T logical segment (overlay or VLAN).

Internet with MPLS

The last case I want to describe is when the tenant wants to access both the Internet and MPLS via the same Org VDC Edge GW. In an NSX-V backed Org VDC this is accomplished by attaching the internet and the dedicated external network port groups to the ESG uplinks and leveraging static or dynamic routing there. In an NSX-T backed Org VDC the provider has to provision a Tier-0/VRF with transit uplinks to both MPLS and the Internet. The external (Internet) subnet is assigned to this Tier-0/VRF with a small IP Pool for IP allocation that should not clash with any other IP Pools.

If the tenant has the route advertisement right assigned, a route filter should be set on the Tier-0/VRF uplinks to allow only the correct prefixes to be advertised towards the Internet or MPLS. The route filters can be configured either directly in NSX-T or in VCD (if the Tier-0 is explicitly dedicated).
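The effect of such a route filter can be sketched as a simple prefix-list check (an illustrative Python snippet; the function name and the example prefixes are made up):

```python
import ipaddress

def filter_advertised(prefixes, allowed):
    """Keep only prefixes that fall within the allowed ranges -
    the effect of a route filter on a Tier-0/VRF uplink."""
    allowed_nets = [ipaddress.ip_network(a) for a in allowed]
    result = []
    for p in prefixes:
        net = ipaddress.ip_network(p)
        if any(net.subnet_of(a) for a in allowed_nets):
            result.append(p)
    return result

# Tenant advertises three prefixes; only the public one may leave
# via the Internet uplink.
tenant_prefixes = ["203.0.113.0/28", "10.1.1.0/24", "172.16.0.0/24"]
print(filter_advertised(tenant_prefixes, ["203.0.113.0/24"]))  # ['203.0.113.0/28']
```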

The diagram below shows an example of an Org VDC that has two Org VDC Edge GWs, each having access to the Internet and MPLS. Org VDC GW 1 uses a static route to MPLS VPN B and also has the MPLS transit network accessible as an imported Org VDC network, while Org VDC GW 2 uses BGP to MPLS VPN A. Connectivity to the internet is provided by another layer of NSX-T Tier-0 GW, which allows the usage of overlay segments as VRF uplinks and does not waste physical VLANs.

One comment on the usage of NAT in such a design. Usually the tenant wants to source NAT only towards the Internet, not towards the MPLS. On an NSX-V backed Org VDC Edge GW this is easily set on a per uplink interface basis. However, that option is not available on a Tier-1 backed Org VDC Edge GW, as it has only one transit towards the Tier-0/VRF. Instead, a NO SNAT rule with a destination must be used in conjunction with the SNAT rule.

An example:

NO SNAT: internal 10.1.1.0/24 destination 10.1.0.0/16
SNAT: internal 10.1.1.0/24 translated 80.80.80.134

The above example will source NAT the 10.1.1.0/24 network only towards the internet (i.e. any destination outside 10.1.0.0/16).
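A first-match walk over the NAT rule table shows why the rule order matters: the NO SNAT rule must sit above the SNAT rule so that MPLS-bound traffic matches it first. A minimal Python sketch of that evaluation (not the NSX-T implementation; a /24 internal prefix is used so the addresses parse as a valid network):

```python
import ipaddress

def snat_lookup(src_ip, dst_ip, rules):
    """Evaluate NAT rules in order; first match wins.
    A NO_SNAT match means the packet leaves untranslated."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if src not in ipaddress.ip_network(rule["internal"]):
            continue
        dest_match = rule.get("destination")
        if dest_match and dst not in ipaddress.ip_network(dest_match):
            continue
        if rule["action"] == "NO_SNAT":
            return src_ip                # keep the original source
        return rule["translated"]        # apply SNAT

rules = [
    {"action": "NO_SNAT", "internal": "10.1.1.0/24", "destination": "10.1.0.0/16"},
    {"action": "SNAT", "internal": "10.1.1.0/24", "translated": "80.80.80.134"},
]
print(snat_lookup("10.1.1.10", "10.1.5.5", rules))  # 10.1.1.10 (MPLS, untranslated)
print(snat_lookup("10.1.1.10", "8.8.8.8", rules))   # 80.80.80.134 (Internet, SNAT)
```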

25 thoughts on "Provider Networking in VMware Cloud Director"

    1. The static route on the Tier-0 GW (or VRF) has to be created by the service provider. The tenant can only select which prefixes should be advertised from Tier-1 to Tier-0 for routing (as opposed to NAT-routing)

      1. For Internet+MPLS (either static route or BGP), does the service provider need a dedicated Tier-0 GW (or VRF) per tenant, or can a Tier-0 GW (or VRF) be shared among tenants? Does the service provider need to configure the static route in NSX-T Manager instead of in VCD? Is there any roadmap to support a VCD GUI for both service provider and tenant to configure the static route?

        1. Yes, for an external network that provides internet with MPLS the T0/VRF must be dedicated. BGP can be configured from within VCD (if the Org VDC GW T1 is connected in dedicated mode); a static route cannot.

  1. Tier-0 VRF apparently has to follow the same HA mode as its parent Tier-0. NAT, load balancing, stateful firewall and VPN seem to be only supported in Active-Standby HA mode. With this being said, I was wondering what a use case would be for Active-Active T0 HA?

    Also, as I understand, an edge node like a compute node can connect to more than one transport zone. Can I therefore have a shared compute / edge node cluster as in NSX-V, or would I need a separate edge cluster on dedicated edge nodes?

    Thank you kindly.

    1. T0 A/A will provide more throughput (scales up to 8 nodes) and better availability.
      The 2nd question is not clear to me, Edge Node VM can live anywhere (it has its own N-VDS and TEPs). A transport node (host or Edge) can be part of only one overlay transport zone. But that has nothing to do with the Edge Node location.

  2. When I try to configure your last example there, where you're up-linking a VRF backed Tier-0 to another layer of Tier-0s with overlay backed segments, I'm unable to do so.

    When I try and create a link between the Tier0’s with an overlay segment, I get an error that says, “Provider interface in default tier0 /infra/tier-0s/_name-of-tier0-here should cover edge paths in VRF interfaces.”

    Have you run into that before? Are there any special considerations when building this type of configuration?

    Thanks!

      1. I got the same error. Does the last example (Internet+MPLS) support a static route as the default route of the shared Internet (80.80.80.0/24 in the example)? Or does it only support BGP as the default route of the shared Internet?

        1. While static route would work, the issue is you would have to set up the static route (for the return traffic) also on the internet router (in the example another Tier-0). That adds quite a lot of operational overhead.

          1. Does the migration tool support the last example (Internet+MPLS)? We want to migrate the (shared Internet + MPLS) setup from NSX-V to NSX-T. Both the Internet and MPLS are VCD external networks connected to an NSX-V edge. The Internet is shared among tenants while the MPLS is dedicated to a particular tenant.

          2. Yes. The migration tool does support migration of a VDC Gateway connected to multiple external networks. The NSX-T external network (backed by Tier-0/VRF) must be able to route to Internet/MPLS. And if you have suballocated IPs from different external networks on the V side, you would have multiple subnets (and IP pools) configured on the T external network.

  3. Can you help if the following configuration is possible with either NSX-V or NSX-T options:
    a. vApp created with 10 VMs, each VM has only one vNIC interface since those VMs are part of a K8s cluster as Master Node or Worker Node
    b. A vApp network created internally
    c. Need to segregate egress traffic from worker nodes; static routes will be configured in each VM to separate the OAM and SIG traffic
    d. Two Edge GWs created for OAM and SIG to route the egress traffic
    e. The internal segment network shared by both Edge GWs; each Edge GW should have an interface in the internal subnet
    It looks like it is not possible to share the same segment with different Edge GWs? What could be the way to achieve such a configuration?

    1. vApp or Org VDC network can be connected to only one Edge GW. The alternative is to use a regular VM with two (or more) interfaces that will act as a router (VyOS for example).

  4. What if I would like to implement it with a parent T0 which has just two BGP peerings with the external router, using the EVPN address family?
    Using a VRF T0 for each tenant and connecting the tenant edge GW (T1) to the VRF T0.
    From what I can find it would be easy to give the tenant access to the VRF routes, but how do I give them access to the public internet?

    1. You have three options:
      1. Each VRF T0 will have uplinks to internet. Not ideal as it defeats the simplicity of EVPN.
      2. Implement internet on the physical VRFs
      3. Wait until NSX-T supports VRF route imports/route leaking with Internet VRF.

  5. Tomas,
    I'm having an issue in VCD backed by NSX-T. We want to use the "multi tenant" functions and tie lots of customers to the same T0-GW. We are adding a public /28 to the VCD external networks, hoping to assign each customer a few IPs. However, in VCD, if we add an external network to a tenant we can't use that external network again for another tenant. We are also unable to add multiple external networks to a T0-GW. So we are finding that we have to create at least one T0 per customer, or a VRF on the T0 for each customer. Are these really the only options, or am I missing something?
    Thanks
    Josh

    1. I failed to mention: we want each tenant to have their own T1-GW that is tied to one main T0-GW.

      1. Tier-0 backed external network can have multiple subnets, each with their own IP Pools. You can allocate IPs from these pools to tenant Org VDC Gateways (Tier-1s) connecting to this external network. You can connect multiple T1s to T0 as long as you do not select the “Dedicate external network” button.

  6. Hey Tomas, would this be possible, from the perspective of EVPN
    VM (windows/linux server) -> segment with a gw for a VM -> T1-vrf-bla -> T0-vrf-bla – T0 -> MPLS router vrf bla

    The only bgp peering would be from T0 to MPLS router.
    I’ve been trying this setup but not able to leak the def route from MPLS vrf back to the T0-vrf-bla
    I can get the VM prefix to MPLS router, but I can’t get the def route from MPLS vrf back to NsxT.

    Cheers

  7. Hey Tomas, thanks for this.
    I did go over this doc, but it says you need a vRouter (a third-party virtual router) that's gonna be connected to the T0-vrf-bla.
    I do not want this 🙂 what's the point of EVPN then..
    And I did try with import/export communities, but I'm still unable to import routes. BTW, I'm stitching RTs on the core for the import/export

    1. EVPN provides connectivity from T0 VRFs to Datacenter Gateway VRFs via single MP BGP control plane connection and VXLAN dataplane encapsulation. It is still 1:1 (T0 VRF to DC VRF).
      Point of EVPN is to simplify the T0 VRF set up as you do not need to create uplinks, BGP sessions and dedicated transits for each VRF.

      1. Hey Tomas, so in the end, I've managed to set up the whole NSX-T – MPLS communication. As it seems, there aren't many articles about ISP integration with NSX-T… well, the part that we needed 🙂
        As we're an MSP, we're relying heavily on route leaking. Not ideal, but it works.
        A few details were missing from the reading that I've done over the past few months:
        1. The NSX-T DCI router(s) need to be inline + it would be advisable for them to have RR functionality
        2. Configuring route stitching is a pain in the ass, especially for the L3VPN plumbing.
        3. NSX-T EVPN configuration is sooooo simple
        4. Cisco EVPN configuration is proper crap. Scaling this is a nightmare…

        All in all, I'm very pleased with NSX-T EVPN and even happier with the gateway migration (from the core over to NSX-T).

        Cheers
