Multitenant Service Network in vCloud Director

Service providers often have to provide additional services to their cloud tenants. Examples are licensing services (KMS) for Windows VMs deployed from a provider-managed catalog, or RHEL Satellite servers for licensing and patching Red Hat VMs. The question then is: where to deploy these shared service virtual machines so they are securely available in a multitenant environment?

In my older blog post Centralized Logging in vCloud Director Environments I described how a shared vCloud Director external logging network can be used to collect logs from Edge Gateways. The idea is to use the same network for the connection to the shared services VMs (KMS/Satellite) running in the Administration Organization. The Edge Gateway can have only 10 interfaces, so it is good that we do not waste another one. Let’s have a look at the following diagram:

Edge GW Logging and admin services

We have three organizations with one Org VDC in each – Customer 1, Customer 2 (the tenants) and the Admin Organization (managed by the provider). The tenants connect their vApps to the shared internet network (yellow) via the Edge Gateways, using sub-allocated public addresses (8.8.8.x) with source or destination NAT on their Org VDC networks. Each Edge Gateway is also connected to another vCloud external network (black) that is used both for Edge logging and for access to the shared services running in the Admin Organization.

Notice that there are two IP subnet ranges assigned to the service external network. The 10.0.5.0/24 range is used solely for Edge logging. The syslog server sits in this network (10.0.5.254) and a firewall in front of it ensures that only Edge logs get there. The Edge Gateway IPs from this network (10.0.5.1 and 10.0.5.2) are not sub-allocated for tenant use, so tenants cannot create NAT rules with them. They could only route (one way) from their Org VDC networks and send UDP packets, but the syslog firewall denies such traffic as it is coming from internal (192.168.1.2) IPs.

The second IP subnet range of the service network (172.16.254.0/24) is used for communication with the service VMs running in the Admin Organization. So how is this achieved securely?

  1. The provider sub-allocates the Edge IP from the service network to the tenant so the tenant can create NAT rules: 172.16.254.1 is sub-allocated to Customer 1 and 172.16.254.2 to Customer 2.
  2. The provider pre-creates an SNAT rule on each deployed Edge Gateway. The rule must be applied on the service network, the original IP range is everything (0.0.0.0/0) and the translated IP is the sub-allocated IP of the Edge.
    SNAT rule
    The tenant has to be told not to delete or alter the rule, otherwise their access to the shared services will stop working.
  3. The provider creates destination NAT rules for his service VMs running in the Admin Organization. To do this he first needs sub-allocated IP addresses (in my example 172.16.254.3 and 172.16.254.4) and then DNATs them to the VM internal IPs 192.168.1.2 and 192.168.1.3. Obviously port forwarding could be used as well to save some IPs, as long as the port numbers of the services are not the same.

That’s it. Any traffic from a tenant’s VM to the external IP address of a service VM (e.g. 172.16.254.3) will be SNATed by the tenant Edge GW and DNATed by the Admin Edge GW and securely delivered, without the tenants being able to contact each other (unless they create DNAT rules as well, which could be prevented by MAC ACLs on the external network).
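
To make the flow concrete, here is a sketch of the NAT rules involved when a Customer 1 VM reaches the KMS service VM (all addresses are taken from the example above):

    Customer 1 VM (Org VDC network)  ->  172.16.254.3 (external IP of the KMS service)
    Customer 1 Edge GW   SNAT on the service network:  0.0.0.0/0     ->  172.16.254.1
    Admin Edge GW        DNAT on the service network:  172.16.254.3  ->  192.168.1.2 (KMS)
                                                       172.16.254.4  ->  192.168.1.3 (Satellite)

A side benefit is that the Admin Edge sees the traffic arriving from 172.16.254.1 or 172.16.254.2, so the provider could even firewall the service VMs per tenant if needed.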

I would also advise using obscure IP ranges for the service network so they do not overlap with customer-defined Org VDC network ranges.

VCP-NV Exam Experience

During VMworld 2014 VMware released a new certification track – Network Virtualization. There is already quite a big number of bootstrapped VCDX-NVs, which is the highest certification level, and it is also now possible to schedule the entry-level VCP-NV exam.

 

As I think NSX is a great technology, I am going for this certification track. I immediately scheduled the VCP-NV exam at my nearby Pearson VUE test center and took it today.

While not having much time for preparation, I obviously downloaded the exam blueprint and was surprised by how extensive it is – nine objective categories ranging from the NSX architecture, VXLAN, distributed routing and firewalling and Edge services up to Service Composer, vSphere standard and distributed switch features and vCloud Automation Center integration. From the sheer amount of content it looks like it is not going to be a simple exam.

I have been working with NSX for some time, so I was pretty confident in all the areas. Prior to the exam I reviewed the areas I work with less (Service Composer, Activity Monitoring, dynamic routing protocols – BGP, IS-IS) and went through the packet walks (VM, VTEP, Controller, Multicast, Unicast, etc.) for switched, routed and bridged traffic.

In April I passed the Cisco CCNA certification, so this gave me a good opportunity to compare these two entry-level networking exams from two major vendors with completely different SDN strategies.

VCP-NV is obviously heavily based on VMware NSX, so do not expect much OpenFlow SDN or any Cisco ACI there. Compared to CCNA there is also no basic network theory (subnetting, OSI model, protocols). There are 120 questions in a 2-hour time window, which is quite a lot, but all are multiple-choice questions – no CLI simulators or flash-based questions. The questions cover all blueprint areas and my assumption is that they are up to the level of the VMware NSX: Install, Configure, Manage training, which I did not take (only its VMware internal bootcamp predecessor). I was able to go through the test quite quickly – there is usually no reason to dwell on a particular question for longer than 30 seconds. You either know the answer or you don’t.

The questions were mostly clearly written, which made taking the exam quite an enjoyable experience (well, it could have been shorter). You get the result immediately and in my case it was a pass.

My recommendation for potential candidates: know vSphere networking (including the advanced features – NetFlow, Port Mirroring, …), get hands-on experience with NSX (if you cannot get the bits or do not have a lab, use the NSX Hands-On Labs, which are really good) and lastly take the NSX ICM course!

Now back to my VCDX-NV design…

vCloud Usage Meter with Signed SSL Certificates

vCloud Usage Meter is a small virtual appliance used by service providers to measure their VMware product consumption for VSPP (VMware Service Provider Program) licensing.

I needed to replace the self-signed certificate of the web user interface. While there is KB article 2047572 and also a chapter in the user guide dedicated to the subject, neither was correct for my version 3.3.1 installation.

The web interface is provided by tc Server, which stores its certificate keystore in the following location:

/usr/local/tcserver/vfabric-tc-server-standard/um/conf/tcserver.jks

The keystore password is silverpen and the certificate alias is um. The location and password can be changed by editing server.xml in the same directory.
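
To see what is currently in the keystore (for example the default self-signed certificate), you can list it with keytool – a quick check, assuming the default password and alias above:

    /usr/java/latest/bin/keytool -list -v -keystore /usr/local/tcserver/vfabric-tc-server-standard/um/conf/tcserver.jks -storepass silverpen -alias um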

Here is a quick guide on how to generate and sign a new certificate with the Java keytool. Note: if you need to generate the private key externally, use the steps described in my older article here.

  1. Modify default path to include java keytool location:
    export PATH=$PATH:/usr/java/latest/bin 
  2. Go to the tc Server conf folder
    cd /usr/local/tcserver/vfabric-tc-server-standard/um/conf/ 
  3. Back up the current keystore
    mv tcserver.jks tcserver.jks.backup 
  4. Generate a new private key (when prompted for passwords, always use silverpen)
    keytool -genkey -alias um -keyalg RSA -keysize 2048 -keystore tcserver.jks 
  5. Modify ownership of the keystore file:
    chown usgmtr tcserver.jks 
  6. Create certificate signing request
    keytool -certreq -alias um -keyalg RSA -file vcum.csr -keystore tcserver.jks 
  7. Sign CSR with your CA (save certificate as vcum.crt)
  8. Import root (and optionally intermediate) certificates if needed
    keytool -import -trustcacerts -alias root -file fojta-dc-CA.cer -keystore tcserver.jks 
  9. Import the signed certificate
    keytool -import -alias um -file vcum.crt -keystore tcserver.jks 
  10. Verify certificates were successfully imported into keystore
    keytool -list -keystore tcserver.jks

    Keystore type: JKS
    Keystore provider: SUN
    Your keystore contains 2 entries

    root, Aug 1, 2014, trustedCertEntry,
    Certificate fingerprint (MD5): E3:EE:7F:47:1A:3E:76:07:8F:27:5D:87:54:94:A4:E7
    um, Aug 2, 2014, PrivateKeyEntry,
    Certificate fingerprint (MD5): 26:3C:96:08:63:86:2B:E8:CA:2C:7F:53:6A:B2:EE:FA

  11. Restart tc service
    service tomcat restart
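
Once tc Server is back up, a quick check that the new certificate is actually being served (I am assuming the default Usage Meter web UI port 8443 here – adjust the host and port to your environment):

    openssl s_client -connect <usage-meter-host>:8443 -showcerts </dev/null | openssl x509 -noout -subject -issuer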

 

Load Balancing vCloud Director Cells with NSX Edge Gateway

About two years ago I wrote an article on how to use vShield Edge to load balance vCloud Director cells. I am revisiting the subject, however this time with the NSX Edge load balancer. One of the main improvements of the NSX Edge load balancer is SSL termination.

Why SSL Termination?

All vCloud GUI or API requests are made over HTTPS. When vShield (vCNS) Edge was used for load balancing, it just passed the traffic through untouched. There was no way to inspect the request – the load balancer saw only the source and destination IPs and encrypted data. If we want to inspect the HTTP request we need to terminate the SSL session on the load balancer and then create a new SSL session towards the cell pool.

SSL Termination

 

This way we can filter URLs, modify headers or do even more advanced inspection. I will demonstrate how we can easily block portal access for a given organization and how to add the X-Forwarded-For header so vCloud Director can log the actual end-user’s IP address and not only the load balancer’s.

Basic Configuration

I am going to use exactly the same setup as in my vShield article: two vCloud Director cells (IP addresses 10.0.1.60-61 and 10.0.1.62-63) behind two virtual IPs – 10.0.2.80 (portal/API) and 10.0.2.81 (VMRC).

vCloud Director Design

While the NSX Edge load balancer is very similar to the vShield load balancer, the UI and the configuration workflow have changed quite a bit. I will only briefly describe the steps to set up basic load balancing:

  1. Create Application Profiles for VCD HTTP (port 80), VCD HTTPS (port 443) and VCD VMRC (port 443). We will use HTTP, HTTPS and TCP types respectively. For HTTPS we will for now enable SSL passthrough.
  2. Create a new Service Monitor (type HTTPS, method GET, URL /cloud/server_status)
    Service Monitor
  3. Create server pools (VCD_HTTP with members 10.0.1.60 and .62, port 80, monitor port 443; VCD_HTTPS with members 10.0.1.60 and .62, port 443, monitor port 443; and VCD_VMRC with members 10.0.1.61 and .63, port 443, monitor port 443). Always use the monitor created in the previous step. I used the Round Robin algorithm.
    Pools
  4. Create Virtual Servers for the respective pools, application profiles and external IP/port combinations (10.0.2.80:80 for VCD_HTTP, 10.0.2.80:443 for VCD_HTTPS and 10.0.2.81:443 for VCD_VMRC).
  5. Enable load balancer in its Global Configuration.
    Global LB Config

Now we should have load-balanced access to vCloud Director with identical functionality as in the vShield Edge case.
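
If a pool member shows as DOWN, the health check URL used by the monitor can be verified manually against each cell (a quick sketch; -k skips certificate validation):

    curl -k https://10.0.1.60/cloud/server_status
    curl -k https://10.0.1.62/cloud/server_status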

Advanced Configuration

Now comes the fun part. To terminate the SSL session at the Edge we need to create the vCloud HTTP SSL certificate and upload it to the load balancer. Note that it is not possible to terminate the VMRC proxy as it is a pure socket SSL connection. As I have vCloud Director 5.5 I had to use a certificate identical to the one on the cells, otherwise catalog OVF/ISO upload would fail with an SSL thumbprint mismatch (see KB 2070908 for more details). The actual private key, certificate signing request and certificate creation and import were not straightforward, so I am listing the exact commands I used (do not create the CSR on the load balancer, as you would not be able to export the key and later import it into the cells):

  1. Create private key with pass phrase encryption with openssl:
    openssl genrsa -aes128 -passout pass:passwd -out http.key 2048
  2. Create a certificate signing request with openssl:
    openssl req -new -key http.key -passin pass:passwd -out http.csr
  3. Sign CSR (http.csr) with your or public Certificate Authority to get http.crt.
  4. Upload the certificate and key to both cells. See how to import private key in my older article.
  5. Import your root CA and http certificate to the NSX Edge (Manager > Settings > Certificates).
    Certificates
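
A useful sanity check at this point is to confirm that the signed certificate really matches the private key – the two modulus hashes below should be identical:

    openssl x509 -noout -modulus -in http.crt | openssl md5
    openssl rsa -noout -modulus -in http.key -passin pass:passwd | openssl md5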

Now we will create a simple Application Rule that will block vCloud portal access for the organization ACME (filtering the /cloud/org/acme URL).

acl block_ACME path_beg -i /cloud/org/acme
block if block_ACME

Application Rule

Now we will change the previously created VCD_HTTPS Application Profile. We will disable SSL Passthrough, check Insert X-Forwarded-For HTTP header (which passes the original client IP address to vCloud Director) and enable Pool Side SSL. Select the previously imported Service Certificate.

Application Profiles

And finally we will assign the Application Rule to the VCD_HTTPS Virtual Server.

Virtual Servers

Now we can test: we should be able to access the vCloud Director portal and see the new certificate, we should not be able to access the portal of the ACME organization, and we should see both the client and the proxy IP addresses in the logs.

ACME Forbidden

Client IP
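
A quick way to verify the behavior from the command line (a sketch; adjust the VIP and organization name to your environment):

    # should be blocked by the application rule
    curl -kI https://10.0.2.80/cloud/org/acme
    # should pass through to the cells
    curl -k https://10.0.2.80/cloud/server_status
    # inspect the certificate presented by the load balancer
    openssl s_client -connect 10.0.2.80:443 </dev/null | openssl x509 -noout -subject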

For more advanced application rules check HAProxy documentation.

vCloud Connector in Multisite Environment with Single VC

I have received an interesting question. A customer has a single vCenter Server environment with a large number of sites. Can they use vCloud Connector Content Sync to keep templates synced among the sites?

The answer is yes, but the setup is not so straightforward. The vCloud Connector client does not allow registration of multiple nodes pointing to the same endpoint (vCenter Server). The workaround is to fool vCloud Connector by using different FQDNs for the same endpoint (IP address), for example vcenter01.fojta.com, vcenter01a.fojta.com and vcenter01b.fojta.com all pointing to the same IP address. This will obviously impact SSL certificate verification, which must either be turned off, or the vCenter Server certificate must include all those alternative names in its SAN (Subject Alternative Name) attribute.
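
For illustration, the aliases could simply be extra DNS A records (or /etc/hosts entries on the vCloud Connector nodes) pointing at the vCenter Server – the 10.0.0.10 address below is just a made-up example:

    10.0.0.10   vcenter01.fojta.com
    10.0.0.10   vcenter01a.fojta.com
    10.0.0.10   vcenter01b.fojta.com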

So the customer would deploy a vCloud Connector node in each site and register it with the central vCenter Server, each time using a different FQDN. In the vCloud Connector client each site would then look like a different vCenter Server and Content Sync could be set up.