vCloud Architecture Toolkit for Service Provider Update

The vCloud Architecture Toolkit for Service Provider website has been updated with a new set of documents. All documents were rebranded with the new VMware Cloud Provider Program logos, which replace the old vCloud Air Network brand.

My Architecting a VMware vCloud Director Solution for VMware Cloud Providers whitepaper has been refreshed to include the vCloud Director 8.10 and 8.20 additions that were missing from the previous version. The current version of the document is 2.8, with an August 2017 release date.

Here is a summary of the new or updated topics:

  • Cell sizing
  • vCloud DB performance tips
  • New vCenter Chargeback Manager network metrics
  • vRealize Business for Cloud
  • vRealize Log Insight
  • vRealize Operations Manager
  • NSX Networking updates
  • Storage support
  • vCloud RBAC
  • Org VDC vSphere Resource Settings
  • VCDNI deprecation
  • New Org VDC Edge GW features
  • Distributed Firewall
  • VM Auto import
  • vCloud API for NSX
  • vCloud Director orchestrated upgrade

The document can be downloaded in PDF format or viewed online.


vRealize Operations Management Pack for NSX-V and Log Insight Integration

A quick post about an issue I discovered in my lab during an upgrade to NSX 6.3.3. This particular NSX version has a silent new feature that verifies whether the syslog configuration on Edges is correct. If the syslog entry is incorrect (it is neither an IP address nor an FQDN with at least one dot character, or it does not have the TCP/UDP protocol specified), NSX will not let you save it. However, this also means that older Edges (version 6.3.2 or older) with an incorrect syslog setting will fail to upgrade, as the incorrect configuration will not be accepted.
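To find Edges that would fail the upgrade, the rule described above can be approximated in a few lines of Python. This is only a sketch of the check as I described it, not the actual NSX validation code; the function name and the hostname label pattern are my own assumptions.

```python
import ipaddress
import re
from typing import Optional

# Assumed hostname label syntax: letters, digits and hyphens,
# not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_valid_syslog_entry(server: str, protocol: Optional[str]) -> bool:
    """Approximate the syslog check described in the post (hypothetical helper)."""
    # The protocol must be specified and must be TCP or UDP.
    if protocol is None or protocol.lower() not in ("tcp", "udp"):
        return False
    # A literal IP address is accepted as is.
    try:
        ipaddress.ip_address(server)
        return True
    except ValueError:
        pass
    # Otherwise require an FQDN with at least one dot, i.e. two or more labels.
    labels = server.split(".")
    return len(labels) >= 2 and all(_LABEL.match(label) for label in labels)


if __name__ == "__main__":
    for server, protocol in [("10.0.0.5", "udp"), ("loginsight", None)]:
        print(server, protocol, is_valid_syslog_entry(server, protocol))
```

A bare hostname with no protocol, which is what the Management Pack configured in my lab, fails both conditions.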

So how does this relate to the title of the article? If you have vROps in your environment with the NSX-V Management Pack and you have enabled Log Insight integration, the Management Pack will configure syslog on all NSX components. Unfortunately, in my case it configures them incorrectly, with only a hostname and no protocol. This reconfiguration happens roughly every hour. This might be especially annoying in a vCloud Director environment, where all the Edges are initially deployed with the syslog setting specified by VCD but are then changed within an hour by vROps to something different.

Anyway, the remediation is simple: disable the Log Insight integration of the vROps NSX Management Pack, as shown in the picture below.

Org VDC Edge Gateway CPU/RAM Reservations

vCloud Director 8.20 allows deployment of Org VDC Edge Gateways in four different form factors, from Compact to X-Large, each of which provides a different level of performance and consumes a different amount of resources.

These Edge Gateways are deployed by NSX Manager, which allows setting custom reservations for CPU and RAM via the API call PUT https://<NSXManager>/api/4.0/edgePublish/tuningConfiguration. It is also possible to set custom reservations in vCloud Director.

Why would you change the default reservations? Reservations at the VM (Edge) level set resources aside for that VM, which means no other VM can use them even when they are idle. They basically guarantee a certain level of performance that the VM (Edge) will always deliver. In service provider environments oversubscription provides ROI benefits, and if the service provider can guarantee enough resources at cluster scale, then the VM-level reservations can be set lower, or not set at all.

This can be accomplished by tuning the networking.gatewayMemoryReservationMultiplier and networking.gatewayCpuReservationMultiplier settings with cell-management-tool on a vCloud Director cell. By default, the CPU multiplier is set to 64 MHz per vCPU and the memory multiplier to 0.5.

By default Edge Gateways will be deployed with the following reservation settings:

Org VDC Edge GW Default Resource Reservations
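The arithmetic behind these defaults can be sketched in Python. The form factor sizes in the table below (vCPU count and RAM per size) are my assumption for NSX 6.3 and should be checked against your NSX version; only the two multipliers and the 2 GB Quad Large size come from this post. I also assume the CPU reservation is simply the vCPU count times the CPU multiplier, and the memory reservation is the configured RAM times the memory multiplier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeSize:
    name: str
    vcpus: int
    memory_mb: int


# Assumed NSX 6.3 Edge form factors (verify against your NSX release).
EDGE_SIZES = [
    EdgeSize("Compact", 1, 512),
    EdgeSize("Large", 2, 1024),
    EdgeSize("Quad Large", 4, 2048),
    EdgeSize("X-Large", 6, 8192),
]


def reservations(size, cpu_mhz_per_vcpu=64, memory_multiplier=0.5):
    """Return (CPU reservation in MHz, memory reservation in MB) for one Edge size."""
    cpu_mhz = size.vcpus * cpu_mhz_per_vcpu
    memory_mb = round(size.memory_mb * memory_multiplier)
    return cpu_mhz, memory_mb


if __name__ == "__main__":
    for size in EDGE_SIZES:
        cpu, mem = reservations(size)
        print(f"{size.name}: {cpu} MHz CPU, {mem} MB RAM reserved")
```

Calling reservations() with a different memory_multiplier shows the effect of changing networking.gatewayMemoryReservationMultiplier before touching a real cell.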

The following command changes the memory multiplier to 0.1 (10%):

/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n networking.gatewayMemoryReservationMultiplier -v 0.1

Note: The new reservation settings apply only to newly deployed Org VDC Edge Gateways. Redeploying existing Edges will not change their reservation settings. You must either use the NSX API to do so, or modify the Org VDC Edge Gateway form factor (e.g. change Large to Compact and then back to Large), which is not so elegant as it basically redeploys the Edge twice.

Also note that NSX 6.2 and NSX 6.3 size the Quad Large Edge differently. vCloud Director 8.20 is by default set for the NSX 6.3 size, which is 2 GB RAM (as opposed to the NSX 6.2 value of 1 GB RAM). It is possible to change the default used for the reservation calculation by setting networking.full4GatewayMemoryMb to the value '1024'.
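If you keep such overrides in a script, a small Python helper can assemble the cell-management-tool invocations so that the setting names and values live in one place. This is only a sketch: the helper name is hypothetical, and it uses just the manage-config -n and -v options shown in the command above. It prints the commands rather than running them.

```python
import shlex

CELL_MANAGEMENT_TOOL = "/opt/vmware/vcloud-director/bin/cell-management-tool"


def manage_config_command(name, value):
    """Build the argument list for one manage-config change (hypothetical helper)."""
    return [CELL_MANAGEMENT_TOOL, "manage-config", "-n", name, "-v", str(value)]


# Overrides mentioned in this post: 10% memory reservation and
# the NSX 6.2 Quad Large memory size.
OVERRIDES = [
    ("networking.gatewayMemoryReservationMultiplier", 0.1),
    ("networking.full4GatewayMemoryMb", 1024),
]

if __name__ == "__main__":
    for name, value in OVERRIDES:
        # Print for review; pass the list to subprocess.run() on the cell to apply.
        print(shlex.join(manage_config_command(name, value)))
```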

vCloud Director 8.20: Distributed Firewall

NSX Distributed Firewall (DFW) is the most popular feature of NSX. It enables microsegmentation of networks with vNIC-level firewalls in the hypervisor. For a real technical deep dive into the feature I recommend reading Wade Holmes' free e-book available here.

vCloud Director 8.20 provides this feature to tenants with a brand new HTML5 UI and API. It is managed at the Org VDC level from the Manage Firewall link, which opens a new tab with the new user interface.

manage-firewall

dfw-ui

Firewall Comparison

vCloud Director now offers three different firewall types for tenants, which might be confusing, so let me quickly compare them.

firewall-comparison

The picture above shows two Org VDCs, each with a different network topology. Org VDC 1 uses an Org VDC Edge Gateway that provides firewalling as well as other networking services (load balancing, VPNs, NAT, routing, etc.). It also has a brand new UI and Network API. Firewalling at this level is enforced only on packets routed through the Edge Gateway.

One level below we see vApps with vApp Edges. These provide routing, firewalling and NAT between a routed vApp network and an Org VDC network. There is no change in the firewall capability of the vApp Edge in vCloud Director 8.20, and the old Flash UI and vCloud API can be used for its configuration. Firewalling at the vApp Edge level is enforced only on packets routed between Org VDC and vApp networks.

The distributed firewall is applied at the vNIC level of virtual machines. This means it can inspect every packet and frame entering and leaving the VM. It is therefore completely independent of the network topology and can be used for microsegmentation of a layer 2 network. Both layer 3 and layer 2 rules can be created.

Obviously all three firewall types can be combined and used together.

Managing Access to Distributed Firewall

There are four new access rights related to DFW in vCloud Director.

  • Manage Firewall
  • Configure Distributed Firewall Rules
  • View Distributed Firewall Rules
  • Enable / Disable Distributed Firewall

The last right is by default available only to system administrators, so the provider can control which tenants can and cannot use DFW and can offer it as a value-added service. The provider can either enable DFW selectively for specific Org VDCs or grant the Enable / Disable Distributed Firewall right to a specific organization via the API. In the latter case the tenant can then enable DFW on their own.

Distributed Firewall under the Hood

Each tenant is given a section in the NSX firewall table and can only apply rules to VMs and Edge Gateways in its own domain. There is one section for each Org VDC that has DFW enabled, and it is always created on top.

Edit 3/14/2017: In fact, it is possible to create the section at the bottom, just above the default section. This allows the provider to create its own section on top, which will always be enforced first. A use case for this could be a service network.

To force creation of the section at the bottom, the firewall must be enabled with an API call that has ?append=true at the end.

Example: 

POST https://vcloud.fojta.com/network/firewall/vdc/be0f2baa-d36f-47f0-8443-3c5cac231ba5?append=true

Org VDC Section Appended at the Bottom
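For scripting, the URL of that call can be assembled as in the Python sketch below. It only builds the URL from the example above and does not send the request; authentication and headers are out of scope here, and the function name is my own.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def dfw_enable_url(base_url, vdc_id, append=False):
    """Build the URL used to enable DFW on an Org VDC (hypothetical helper).

    With append=True the Org VDC section is requested at the bottom of the
    NSX firewall table, just above the default section.
    """
    scheme, netloc, path, _query, _fragment = urlsplit(base_url)
    full_path = f"{path.rstrip('/')}/network/firewall/vdc/{vdc_id}"
    query = urlencode({"append": "true"}) if append else ""
    return urlunsplit((scheme, netloc, full_path, query, ""))


if __name__ == "__main__":
    print(dfw_enable_url(
        "https://vcloud.fojta.com",
        "be0f2baa-d36f-47f0-8443-3c5cac231ba5",
        append=True,
    ))
```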

As tenants could have overlapping IPs, all rules in the section are scoped to a security group with dynamic membership of the tenant's Org VDC resource pools and are thus applied only to VMs in that Org VDC.

nsx-dfw-section
Org VDC section in NSX DFW
org-vdc-security-group
Org VDC Security Group

Tenants can create layer 3 (IP based) or layer 2 (MAC based) rules while using the following objects when defining them:

  • IP address, IP/MAC sets
  • Virtual Machine
  • Org VDC Network
  • Org VDC

Note that using L3 non-IP based rules requires NSX to learn the IP address(es) of the guest VM. One of the following mechanisms must be enabled:

  • VMware Tools installed in guest VM
  • DHCP Snooping IP Detection Type
  • ARP Snooping IP Detection Type

IP Detection Type is configured in NSX at the cluster level in the Host Preparation tab.

host-preparation

ip-detection-type

The scope of each rule can be defined in the Applied To column. As mentioned before, by default it is set to the Org VDC; however, the tenant can further limit the scope of the rule to a particular VM or Org VDC network (note that a vApp network cannot be used). It is also possible to apply the rule to an Org VDC Edge Gateway. In that case the rule is actually created and enforced on the Edge Gateway as a pre-rule, which takes precedence over all other firewall rules defined on that Edge Gateway.

DFW Rule Applied to Edge GW

The tenant can enable logging of a specific firewall rule via the API by editing the logged attribute of the <rule … logged="true|false"> element. NSX then logs the first session packet matching the rule to the ESXi host log with a tenant-specific tag (a substring of the Org VDC UUID). The provider can then filter such logs and forward them to tenants with its own syslog solution.
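Flipping that attribute is a plain XML edit. The Python sketch below shows it on a made-up rule fragment; the id and name in the sample are placeholders, the real rule payload returned by the API carries more elements, and fetching or submitting the rule is not covered.

```python
import xml.etree.ElementTree as ET


def set_rule_logging(rule_xml, enabled):
    """Return the rule XML with its logged attribute set (hypothetical helper)."""
    rule = ET.fromstring(rule_xml)
    rule.set("logged", "true" if enabled else "false")
    return ET.tostring(rule, encoding="unicode")


if __name__ == "__main__":
    # Made-up rule fragment for illustration only.
    sample = '<rule id="1005" logged="false"><name>web-to-db</name></rule>'
    print(set_rule_logging(sample, True))
```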

logging
NSX DFW Rule Tenant Tag

Setup Site-to-Site VPN between Azure and vCloud Director

My previous blog post was about setting up an IPSec VPN tunnel between an AWS VPC and a vCloud Director Org VDC. This time I will describe how to achieve the same with Microsoft Azure.

vCloud Director is not on Azure's list of supported IPSec VPN endpoints; however, it is possible to set up such a VPN, although it is not straightforward.

I will describe the setup of both Azure and VCD endpoints very briefly as it is very similar to the one I described in my previous article.

Azure Configuration

  • Resource Group (logical container object) – in my example RG UK
  • Virtual network (large address space similar to AWS VPC) – 172.30.0.0/16
  • Subnets – at least one for VMs (172.30.0.0/24) and one for Gateway (172.30.255.0/29)
  • Virtual Network Gateway – Azure VPN endpoint with a public IP address, associated with the Gateway subnet above. Gateway type is VPN, VPN type is Policy-based (this is because the Route-based type uses IKEv2, which is not supported by the NSX platform used by vCloud Director).
  • Local Network Gateway – vCloud VPN endpoint definition with its public IP address and subnets that should be reachable behind the vCloud VPN endpoint (81.x.x.x, 192.168.100.0/24)
  • Connection – definition of the tunnel:
    • Connection type: Site-to-site (IPSec)
    • Virtual network gateway and local network gateway are straightforward (those created previously)
    • Connection name: whatever
    • Shared Key (PSK): create your own 32+ character key using upper and lower case characters and numbers
  • Test VM connected to the VM subnet (IP 172.30.0.4)
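For the shared key, any generator that produces 32 or more characters from upper and lower case letters and numbers will do. Here is a minimal Python sketch using the standard secrets module; the function name and the rule of retrying until all three character classes are present are my own choices, not an Azure requirement.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_psk(length=32):
    """Generate a random pre-shared key of letters and digits (hypothetical helper)."""
    if length < 32:
        raise ValueError("use at least 32 characters for the shared key")
    while True:
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until the key mixes lower case, upper case and digits.
        if (any(c.islower() for c in key)
                and any(c.isupper() for c in key)
                and any(c.isdigit() for c in key)):
            return key


if __name__ == "__main__":
    print(generate_psk())
```

The same key is later entered on the vCloud Director side.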

azure-resources

vCloud Configuration

As explained above, we created a policy-based VPN endpoint in Azure. Policy-based VPN uses IKE version 1, Diffie-Hellman Group 2 and no Perfect Forward Secrecy.

However, selection of the DH group and PFS is not available to the tenant in vCloud Director on the legacy Org VDC Edge Gateway. Therefore the following workaround is proposed:

The tenant configures the VPN on the Org VDC Edge Gateway with the following:

  • Name: Azure
  • Enable this VPN configuration
  • Establish VPN to: a remote network
  • Local Networks: 192.168.100.0/24 (Org VDC network(s))
  • Peer Networks: 172.30.0.0/24
  • Local Endpoint: Internet (interface facing internet)
  • Local ID: 10.0.2.121 (Org VDC Edge GW internet interface)
  • Peer ID: 51.x.x.x (public IP of the Azure Virtual network gateway)
  • Peer IP: 51.x.x.x (same as previous)
  • Encryption protocol: AES256
  • Shared Key: the same as in Azure Connection definition
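Since the networks are entered by hand on both sides, it is easy to mistype one of them. The following Python sketch is a sanity check of the values used in this post, not a statement of what the IPSec negotiation strictly requires: it verifies that the peer networks configured in VCD fall inside the Azure virtual network, that the VCD local networks match what the Azure Local Network Gateway lists, and that the two sides do not overlap. The function and parameter names are hypothetical.

```python
import ipaddress


def check_vpn_networks(azure_vnet, azure_local_gw_prefixes,
                       vcd_local_networks, vcd_peer_networks):
    """Return a list of human-readable problems; empty if the plan is consistent."""
    vnet = ipaddress.ip_network(azure_vnet)
    azure_side = [ipaddress.ip_network(p) for p in azure_local_gw_prefixes]
    local = [ipaddress.ip_network(p) for p in vcd_local_networks]
    peer = [ipaddress.ip_network(p) for p in vcd_peer_networks]

    problems = []
    for net in peer:
        if not net.subnet_of(vnet):
            problems.append(f"peer network {net} is outside Azure virtual network {vnet}")
    if set(local) != set(azure_side):
        problems.append("VCD local networks differ from the Azure Local Network Gateway prefixes")
    for l_net in local:
        for p_net in peer:
            if l_net.overlaps(p_net):
                problems.append(f"local network {l_net} overlaps peer network {p_net}")
    return problems


if __name__ == "__main__":
    # Values from this post.
    print(check_vpn_networks(
        azure_vnet="172.30.0.0/16",
        azure_local_gw_prefixes=["192.168.100.0/24"],
        vcd_local_networks=["192.168.100.0/24"],
        vcd_peer_networks=["172.30.0.0/24"],
    ))
```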

Now we need to ask the service provider to disable PFS and change the DH Group to DH2 directly in NSX, in the Edge VPN configuration.

nsx-vpn

Note that this workaround is not necessary on an Org VDC Edge Gateway that has been enabled with Advanced Networking services. This feature is at the moment available only in vCloud Air, but it will soon be available to all vCloud Air Network service providers.

If all firewall rules are properly set up, we should be able to ping between Azure and vCloud VMs.

ping