Google Authentication with VMware Cloud Director (OAuth)

Several authentication mechanisms can be used for VMware Cloud Director users. Basic authentication is used for local users (stored in the VCD database) and LDAP users. SAML authentication can be used for integration with SAML compatible Identity Providers such as Microsoft AD FS, IBM Cloud Identity or VMware Workspace ONE Access (VIDM). OAuth authentication is supported as well, but because it can currently (as of VCD 10.2) be configured only through the API, it is not that widely known.

In this article I will show an example of such configuration with VMware Identity Manager (VIDM) and with Google Identity IdP. Yes, with VIDM you have the option to use SAML or OAuth.

By default OAuth authentication can be enabled by the tenant at Organizational level and co-exist with local, LDAP and SAML identity sources. The OAuth authentication endpoint must be reachable from VCD Cells. This is a big difference compared to SAML authentication, which is performed via assertion token exchange via browser (only the client browser needs to reach the SAML IdP). Therefore OAuth is more suitable when public IdPs are used (e.g. Google) or provider managed ones (VCD cells can reach IdP internally).

VMware Identity Manager OAuth Configuration

Note I am using VIDM version 3.3.

  1. In VIDM as admin go to Catalog, Settings, Remote App Access and create a new Client
  2. Create the client. Pick unique Client ID, the redirect URL is https://vcd.example.com/login/oauth?service=tenant:<org name> or https://vcd.example.com/login/oauth?service=provider. Generate the shared secret and select Email, Profile, User and OpenID scopes.
  3. Now we need to find the OAuth endpoints and the public key. In my VIDM configuration these can be found at https://vidm.example.com/SAAS/auth/.well-known/openid-configuration. This URL can differ based on the VIDM / Workspace ONE Access version.
    The address returns a JSON response from which we need: issuer, authorization_endpoint, token_endpoint, userinfo_endpoint, scopes and claims supported.
    The link to the public key is provided in jwks_uri (https://vidm.example.com/SAAS/API/1.0/REST/auth/token?attribute=publicKey&format=jwks). We will need the key in PEM format, so you can either convert it (e.g. https://8gwifi.org/jwkconvertfunctions.jsp) or specify PEM format in the link (&format=pem at the end of the URI). We will also need the KeyID (kid value) and the key algorithm (kty).
  4. Now we have all necessary information to configure OAuth in VCD. We will use PUT /admin/org/{id}/settings/oauth API call. In the payload we will provide all data that we collected in steps #2 and #3. Here is an example I used:
    Note the OIDCAttributeMapping section. Here we must specify claims providing more information about the user. VIDM currently does not support groups and roles, so those are hardcoded. You can see what user information is sent by accessing UserInfoEndpoint. This can be done easily with Postman OAuth2 authentication, where you first obtain the Access Token (orange button) and then do a GET against the UserInfoEndpoint.
  5. Lastly we need to import some users. This is done with POST /admin/org/{id}/users API call with ProviderType set to OAUTH.
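The values collected in step 3 map fairly directly onto the OAuth settings payload. Below is a minimal Python sketch of that mapping, assuming the discovery document is already downloaded as JSON text. The `idp.example.com` URLs and the function name are hypothetical; the target names are the element names of the OrgOAuthSettings payload.

```python
import json

# Discovery keys (OpenID Connect) -> element names of the VCD OAuth payload.
DISCOVERY_TO_VCD = {
    "issuer": "IssuerId",
    "authorization_endpoint": "UserAuthorizationEndpoint",
    "token_endpoint": "AccessTokenEndpoint",
    "userinfo_endpoint": "UserInfoEndpoint",
}


def vcd_settings_from_discovery(discovery_json: str) -> dict:
    """Pick the fields VCD needs out of an openid-configuration document."""
    doc = json.loads(discovery_json)
    missing = [key for key in DISCOVERY_TO_VCD if key not in doc]
    if missing:
        raise KeyError("discovery document lacks: " + ", ".join(missing))
    settings = {vcd: doc[oidc] for oidc, vcd in DISCOVERY_TO_VCD.items()}
    # Kept for the next steps: where to fetch keys, and which scopes exist.
    settings["jwks_uri"] = doc.get("jwks_uri")
    settings["scopes_supported"] = doc.get("scopes_supported", [])
    return settings


# Hypothetical discovery document, for illustration only.
sample = json.dumps({
    "issuer": "https://idp.example.com",
    "authorization_endpoint": "https://idp.example.com/authorize",
    "token_endpoint": "https://idp.example.com/token",
    "userinfo_endpoint": "https://idp.example.com/userinfo",
    "jwks_uri": "https://idp.example.com/jwks",
    "scopes_supported": ["openid", "email", "profile"],
})

for name, value in vcd_settings_from_discovery(sample).items():
    print(name, "=", value)
```

The sketch fails loudly when an endpoint is missing, since a half-filled payload is harder to troubleshoot than an early error.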

Now we can log in as the VIDM user.

Google Identity OAuth Configuration

  1. Head over to Credentials section of Google API & Services: https://console.developers.google.com/apis/credentials
  2. Create Project, configure Consent Screen, Scopes and test users
  3. Create OAuth Client ID. Use the redirect URI https://vcd.example.com/login/oauth?service=tenant:<org name> or https://vcd.example.com/login/oauth?service=provider. Note generated Client ID and secret.
  4. Google OAuth endpoints and public keys can be retrieved from: https://accounts.google.com/.well-known/openid-configuration
    You will need to get both public keys and convert them to PEM. Now we can configure the OAUTH in VCD.
PUT https://{{host}}/api/admin/org/b813a16e-6821-4dc5-994f-955b10155107/settings/oauth


<OrgOAuthSettings xmlns="http://www.vmware.com/vcloud/v1.5" type="application/vnd.vmware.admin.organizationOAuthSettings+xml">
    <IssuerId>https://accounts.google.com</IssuerId>
    <OAuthKeyConfigurations>
        <OAuthKeyConfiguration>
            <KeyId>eea1b1f42807a8cc136a03a3c16d29db8296daf0</KeyId>
            <Algorithm>RSA</Algorithm>
            <Key>-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA0zNdxOgV5VIpoeAfj8TM
EGRBFg+gaZWz94ePR1yxTKzScHakH4F4wcMEyL0vNE+yW/u4pOl9E+hAalPa2tFv
4fCVNMMkmKwcf0gm9wNFWXGakVQ8wER4iUg33MyUGOWj2RGX1zlZxCdFoZRtshLx
8xcpL3F5Hlh6m8MqIAowWtusTf5TtYMXFlPaWLQgRXvoOlLZ+muzEuutsZRu+agd
OptnUiAZ74e8BgaKN8KNEZ2SqP6vE4w16mgGHQjEPUKz9exxcsnbLru6hZdTDvXb
X9IduabyvHy8vQRZsqlE9lTiOOOC9jwh27TXsD05HAXmNYiR6voekzEvfS88vnot
2QIDAQAB
-----END PUBLIC KEY-----</Key>
        </OAuthKeyConfiguration>
        <OAuthKeyConfiguration>
            <KeyId>03b2d22c2fecf873ed19e5b8cf704afb7e2ed4be</KeyId>
            <Algorithm>RSA</Algorithm>
            <Key>-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArKZ+1zdz/CoLekSynOty
Wv6cPSSkV28Kb9kZZHyYL+yhkKnH/bHl8OpWiGxQiKP0ulLRIaq1IhSMetkZ8FfX
H+iptIDu4lPb8gt0HQYkjcy3HoaKRXBw2F8fJQO4jQ+ufR4l+E0HRqwLywzdtAIm
NWmju3A4kx8s0iSGHGSHyE4EUdh5WKt+NMtfUPfB5v9/2bC+w6wH7zAEsI5nscMX
nvz1u8w7g2/agyhKSK0D9OkJ02w3I4xLMlrtKEv2naoBGerWckKcQ1kBYUh6WASP
dvTqX4pcAJi7Tg6jwQXIP1aEq0JU8C0zE3d33kaMoCN3SenIxpRczRzUHpbZ+gk5
PQIDAQAB
-----END PUBLIC KEY-----</Key>
        </OAuthKeyConfiguration>
    </OAuthKeyConfigurations>
    <Enabled>true</Enabled>
    <ClientId>**redacted**.apps.googleusercontent.com</ClientId>
    <ClientSecret>**redacted**</ClientSecret>
    <UserAuthorizationEndpoint>https://accounts.google.com/o/oauth2/v2/auth</UserAuthorizationEndpoint>
    <AccessTokenEndpoint>https://oauth2.googleapis.com/token</AccessTokenEndpoint>
    <UserInfoEndpoint>https://openidconnect.googleapis.com/v1/userinfo</UserInfoEndpoint>
    <Scope>email profile openid</Scope>
    <OIDCAttributeMapping>
        <SubjectAttributeName>email</SubjectAttributeName>
        <EmailAttributeName>email</EmailAttributeName>
        <FirstNameAttributeName>given_name</FirstNameAttributeName>
        <LastNameAttributeName>family_name</LastNameAttributeName>
        <GroupsAttributeName>groups</GroupsAttributeName>
        <RolesAttributeName>roles</RolesAttributeName>
    </OIDCAttributeMapping>
    <MaxClockSkew>600</MaxClockSkew>
</OrgOAuthSettings>
  5. Import your OAuth users with the same API call as described in step 5 of the VIDM configuration.
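If you prefer not to paste keys into an online converter, the JWK to PEM conversion can also be done locally. The following is a minimal sketch using only the Python standard library. It assumes an RSA key in JWK form with the usual `kid`, `kty`, `n` and `e` members, and it hand-encodes the DER structure of a SubjectPublicKeyInfo. The function names are my own, and a crypto library would be the more robust choice for production use.

```python
import base64

# DER encoding of AlgorithmIdentifier { rsaEncryption, NULL }.
RSA_ALG_ID = bytes.fromhex("300d06092a864886f70d0101010500")


def _b64url_to_int(value: str) -> int:
    """Decode an unpadded base64url big-endian integer (as used in JWK)."""
    padded = value + "=" * (-len(value) % 4)
    return int.from_bytes(base64.urlsafe_b64decode(padded), "big")


def _der(tag: int, content: bytes) -> bytes:
    """Wrap content in a DER tag-length-value triple."""
    size = len(content)
    if size < 0x80:
        length = bytes([size])
    else:
        body = size.to_bytes((size.bit_length() + 7) // 8, "big")
        length = bytes([0x80 | len(body)]) + body
    return bytes([tag]) + length + content


def _der_int(value: int) -> bytes:
    """DER INTEGER; the extra byte keeps positive numbers positive."""
    return _der(0x02, value.to_bytes(value.bit_length() // 8 + 1, "big"))


def jwk_rsa_to_pem(jwk: dict) -> str:
    """Convert an RSA JWK to a PEM 'PUBLIC KEY' block."""
    if jwk.get("kty") != "RSA":
        raise ValueError("only RSA keys are handled in this sketch")
    n = _b64url_to_int(jwk["n"])
    e = _b64url_to_int(jwk["e"])
    rsa_key = _der(0x30, _der_int(n) + _der_int(e))
    spki = _der(0x30, RSA_ALG_ID + _der(0x03, b"\x00" + rsa_key))
    b64 = base64.b64encode(spki).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return ("-----BEGIN PUBLIC KEY-----\n" + "\n".join(lines)
            + "\n-----END PUBLIC KEY-----\n")


def vcd_key_config(jwk: dict) -> dict:
    """Values for one OAuthKeyConfiguration element (KeyId, Algorithm, Key)."""
    return {"KeyId": jwk["kid"], "Algorithm": jwk["kty"],
            "Key": jwk_rsa_to_pem(jwk)}
```

Calling `vcd_key_config()` on each entry of the `keys` array returned by the jwks_uri gives the three values needed per key. Keep in mind that public IdPs rotate their signing keys, so the configured keys have to be refreshed from time to time.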

Provider Networking in VMware Cloud Director

This is going to be a bit longer than usual and more of a summary / design option type blog post where I want to discuss provider networking in VMware Cloud Director (VCD). By provider networking I mean the part that must be set up by the service provider and that is then consumed by tenants through their Org VDC networking and Org VDC Edge Gateways.

With the introduction of NSX-T we also need to dive into the differences between NSX-V and NSX-T integration in VCD.

Note: The article is applicable to VMware Cloud Director 10.2 release. Each VCD release is adding new network related functionality.

Provider Virtual Datacenters

Provider Virtual Datacenter (PVDC) is the main object that provides compute, networking and storage resources for tenant Organization Virtual Datacenters (Org VDCs). When a PVDC is created it is backed by vSphere clusters that should be prepared for NSX-V or NSX-T. Also during the PVDC creation the service provider must select which Network Pool is going to be used – VXLAN backed (NSX-V) or Geneve backed (NSX-T). A PVDC is thus backed by either NSX-V or NSX-T (never both at the same time, and never neither), and the backing cannot be changed after the fact.

Network Pool

Speaking of Network Pools – they are used to create on-demand routed/isolated networks by tenants. The Network Pools are independent from PVDCs and can be shared across multiple PVDCs (of the same backing type). There is an option to automatically create a VXLAN network pool with PVDC creation, but I would recommend against using that as you lose the ability to manage the transport zone backing the pool on your own. A VLAN backed network pool can still be created but can be used only in a PVDC backed by NSX-V (same for the very legacy port group backed network pool, now available only via API). Individual Org VDCs can (optionally) override the Network Pool assigned by their parent PVDC.

External Networks

Deploying virtual machines without the ability to connect to them via network is not that useful. External networks are VCD objects that Org VDC Edge Gateways connect to in order to reach the outside world – internet, dedicated direct connections or provider’s service area. An external network has one or more associated subnets and IP pools, which VCD manages and uses to allocate external IP addresses to connected Org VDC Edge Gateways.

There is a major difference in how external networks are created for NSX-V backed PVDCs and for NSX-T ones.

Port Group Backed External Network

As the name suggests, these networks are backed by an existing vCenter port group (or multiple port groups) that must be created upfront and is usually backed by VLAN (but could be a VXLAN port group as well). These external networks are (currently) supported only in NSX-V backed PVDCs. An Org VDC Edge Gateway connected to this network is represented by an NSX-V Edge Service Gateway (ESG) with an uplink in this port group. The uplinks are assigned the allocated external IP address(es).

An Org VDC network directly connected to the external network can also be created (only by the provider); VMs connected to such a network have their uplink in the port group.

Tier-0 Router Backed External Network

These networks are backed by an existing NSX-T Tier-0 Gateway or Tier-0 VRF (note that if you import to VCD Tier-0 VRF you can no longer import its parent Tier-0 and vice versa). The Tier-0/VRF must be created upfront by the provider with correct uplinks and routing configuration.

Only Org VDC Edge Gateways from NSX-T backed PVDC can be connected to such external network and they are going to be backed by a Tier-1 Gateway. The Tier-1 – Tier-0/VRF transit network is autoplumbed by NSX-T using 100.64.0.0/16 subnet. The allocated external network IPs are not explicitly assigned to any Tier-1 interface. Instead when a service (NAT, VPN, Load Balancer) on the Org VDC Edge Gateway starts using assigned external address, it will be advertised by the Tier-1 GW to the linked Tier-0 GW.

There are two main design options for the Tier-0/VRF.

The recommended option is to configure BGP on the Tier-0/VRF uplinks with upstream physical routers. The uplinks are just redundant point-to-point transits. IPs assigned from any external network subnet will be automatically advertised (when used) via BGP upstream. When provider runs out of public IPs you just assign additional subnet. This makes this design very flexible, scalable and relatively simple.

Tier-0/VRF with BGP

An alternative is to use design that is similar to the NSX-V port group approach, where Tier-0 uplinks are directly connected to the external subnet port group. This can be useful when transitioning from NSX-V to T where there is a need to retain routability between NSX-V ESGs and NSX-T Tier-1 GWs on the same external network.

The picture below shows that the Tier-0/VRF has uplinks directly connected to the external network and a static route towards the internet. The Tier-0 will proxy ARP requests for external IPs that are allocated and used by connected Tier-1 GWs.

Tier-0 with Proxy ARP

The disadvantage of this option is that you waste public IP addresses for T0 uplink and router interfaces for each subnet you assign.

Note: Proxy ARP is supported only if the Tier-0/VRF is in Active/Standby mode.
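To get a feel for the address overhead of the proxy ARP design, here is a rough arithmetic sketch in Python. The reserved counts are assumptions for illustration (two Tier-0 uplink IPs, one per Edge node, and one physical router interface per subnet) and the subnets are documentation prefixes, so adjust them to your own topology.

```python
import ipaddress


def allocatable_ips(subnet: str, t0_uplink_ips: int = 2, router_ips: int = 1) -> int:
    """Addresses left for tenant allocation in a directly attached subnet."""
    net = ipaddress.ip_network(subnet)
    hosts = net.num_addresses - 2  # minus network and broadcast address
    return max(hosts - t0_uplink_ips - router_ips, 0)


for prefix in ("203.0.113.0/29", "203.0.113.0/28", "203.0.113.0/24"):
    print(prefix, "->", allocatable_ips(prefix), "allocatable IPs")
```

Under these assumptions a /29 leaves only 3 of its 8 addresses for tenants, which is why small public subnets are a poor fit for this design. With the BGP design the whole subnet can be handed out.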

Tenant Dedicated External Network

If the tenant requires direct link via MPLS or a similar technology this is accomplished by creating tenant dedicated external network. With NSX-V backed Org VDC this is represented by a dedicated VLAN backed port group, with NSX-T backed Org VDC it would be a dedicated Tier-0/VRF. Both will provide connectivity to the MPLS router. With NSX-V the ESG would run BGP, with NSX-T the BGP would have to be configured on the Tier-0. In VCD the NSX-T backed Org VDC Gateway can be explicitly enabled in the dedicated mode which gives the tenant (and also the provider) the ability to configure Tier-0 BGP.

There are separate rights for BGP neighbor configuration and route advertisement, so the provider can keep BGP neighbor configuration as a provider managed setting.

Note that you can connect only one Org VDC Edge GW in the explicit dedicated mode. In case the tenant requires more Org VDC Edge GWs connected to the same (dedicated) Tier-0/VRF the provider will not enable the dedicated mode and instead will manage BGP directly in NSX-T (as a managed service).

A common use case is for the provider to connect an Org VDC network directly to such a dedicated external network without using an Org VDC Edge GW. This is, however, currently not possible in an NSX-T backed PVDC. There, you will instead have to import an Org VDC network backed by an NSX-T logical segment (overlay or VLAN).

Internet with MPLS

The last case I want to describe is when the tenant wants to access both Internet and MPLS via the same Org VDC Edge GW. In NSX-V backed Org VDC this is accomplished by attaching internet and dedicated external network portgroups to the ESG uplinks and leveraging static or dynamic routing there. In an NSX-T backed Org VDC the provider will have to provision Tier-0/VRF that has transit uplink both to MPLS and Internet. External (Internet) subnet will be assigned to this Tier-0/VRF with small IP Pool for IP allocation that should not clash with any other IP Pools.

If the tenant has the route advertisement right assigned, route filters should be set on the Tier-0/VRF uplinks to allow only the correct prefixes to be advertised towards the Internet or MPLS. The route filters can be configured either directly in NSX-T or in VCD (if the Tier-0 is explicitly dedicated).

The diagram below shows an example of an Org VDC that has two Org VDC Edge GWs, each having access to Internet and MPLS. Org VDC GW 1 is using a static route to MPLS VPN B and also has the MPLS transit network accessible as an imported Org VDC network, while Org VDC GW 2 is using BGP to MPLS VPN A. Connectivity to the internet is provided by another layer of NSX-T Tier-0 GW, which allows usage of overlay segments as VRF uplinks and does not waste physical VLANs.

One comment on usage of NAT in such design. Usually the tenant wants to source NAT only towards the Internet but not to the MPLS. In NSX-V backed Org VDC Edge GW this is easily set on per uplink interface basis. However, that option is not possible on Tier-1 backed Org VDC Edge GW as it has only one transit towards Tier-0/VRF. Instead NO SNAT rule with destination must be used in conjunction with SNAT rule.

An example:

NO SNAT: internal 10.1.1.0/22 destination 10.1.0.0/16
SNAT: internal 10.1.1.0/22 translated 80.80.80.134

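The above example will source NAT the 10.1.1.0/22 network only towards the internet; traffic to destinations in 10.1.0.0/16 matches the NO SNAT rule and keeps its original source address.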
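The effect of the two rules can be modelled in a few lines of Python. This is only a sketch of the matching logic and not of how NSX-T implements NAT. It assumes the NO SNAT rule is evaluated before the SNAT rule, which in practice depends on the rule priorities you configure.

```python
import ipaddress

# Values from the example above. strict=False is needed because
# 10.1.1.0/22 has host bits set (the network is actually 10.1.0.0/22).
INTERNAL = ipaddress.ip_network("10.1.1.0/22", strict=False)
NO_SNAT_DESTINATION = ipaddress.ip_network("10.1.0.0/16")
SNAT_ADDRESS = "80.80.80.134"


def translated_source(src: str, dst: str) -> str:
    """Return the source address a packet leaves the Edge GW with."""
    source = ipaddress.ip_address(src)
    destination = ipaddress.ip_address(dst)
    if source in INTERNAL and destination in NO_SNAT_DESTINATION:
        return src  # NO SNAT rule matches: keep the original address
    if source in INTERNAL:
        return SNAT_ADDRESS  # SNAT rule: everything else goes out translated
    return src  # no rule applies to this source


print(translated_source("10.1.1.10", "10.1.200.5"))  # private destination
print(translated_source("10.1.1.10", "8.8.8.8"))     # internet destination
```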

VMware Cloud Director Cells Behind Internet Proxy

VMware Cloud Director cells are usually deployed in the management cluster and their access to Internet might be limited due to security considerations. This can be a problem because certain features do require outgoing access to external (Internet) resources:

  • Catalog subscription: the cell will need access to the published catalog URL
  • Multisite: if you associate multiple Organizations together, some API calls are fan-out by the cell to the respective associated API endpoints, therefore the cell needs to be able to access them (even its own external API endpoint)
  • Cell Appliance VAMI repository for patches or upgrades

The latest VCD release 10.2.1 now does support internet proxy which means there is no need to have full internet access to the management environment.

On the VCD Appliance the proxy can be configured by editing /etc/sysconfig/proxy file:

 

root@vcloud1 [ ~ ]# cat /etc/sysconfig/proxy
# Enable a generation of the proxy settings to the profile.
# This setting allows to turn the proxy on and off while
# preserving the particular proxy setup.
#
PROXY_ENABLED="yes"

# Some programs (e.g. wget) support proxies, if set in
# the environment.
# Example: HTTP_PROXY="http://proxy.provider.de:3128/"
HTTP_PROXY="http://proxy.fojta.com:3128"

# Example: HTTPS_PROXY="https://proxy.provider.de:3128/"
HTTPS_PROXY="http://proxy.fojta.com:3128"

You need to restart vmware-vcd service to apply the configuration.
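Before restarting the service it can be worth checking that the file parses the way you expect. Here is a small Python sketch that reads the KEY="value" lines of such a file. The function names are hypothetical, and the sample text mirrors the file shown above instead of reading /etc/sysconfig/proxy directly.

```python
import shlex


def parse_sysconfig(text: str) -> dict:
    """Parse shell-style KEY="value" lines, skipping comments and blanks."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        tokens = shlex.split(raw)  # strips the surrounding quotes
        settings[key.strip()] = tokens[0] if tokens else ""
    return settings


def proxy_is_active(settings: dict) -> bool:
    """True when the proxy is switched on and at least one proxy URL is set."""
    enabled = settings.get("PROXY_ENABLED", "no").lower() == "yes"
    return enabled and bool(settings.get("HTTP_PROXY") or settings.get("HTTPS_PROXY"))


sample = '''
# Enable a generation of the proxy settings to the profile.
PROXY_ENABLED="yes"
HTTP_PROXY="http://proxy.fojta.com:3128"
HTTPS_PROXY="http://proxy.fojta.com:3128"
'''

config = parse_sysconfig(sample)
print(config)
print("proxy active:", proxy_is_active(config))
```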

NSX-T 3.1: Sharing Transport VLAN between Host and Edge Nodes

When NSX-T 3.1 was released a few days ago, the feature that I was most looking forward to was the ability to share the Geneve overlay transport VLAN between ESXi transport nodes and Edge transport nodes.

Before NSX-T 3.1 in a collapsed design where Edge transport nodes were running on ESXi transport nodes (in other words NSX-T Edge VMs were deployed to NSX-T prepared ESXi cluster) you could not share the same transport (TEP) VLAN unless you would dedicate separate physical uplinks for Edge traffic and ESXi underlay host traffic. The reason is that the Geneve encapsulation/decapsulation was happening only on the physical uplink in/egress and that point would be skipped for intra-host datapath between the Edge and host TEP VMkernel port.

This was quite annoying because the two transport VLANs need to route between each other at jumbo frame size (MTU of 1600 or more). So in lab scenarios you had to have an additional router taking care of that. And I have seen issues caused by a misconfigured router MTU size multiple times.
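Where the MTU requirement comes from is easy to show with a bit of arithmetic. The sketch below adds the fixed header sizes (outer IPv4, UDP, Geneve base header, inner Ethernet) to the inner MTU. The amount of Geneve option data is variable, so the 32 bytes used here are only an assumed figure for illustration.

```python
OUTER_IPV4_HEADER = 20   # outer IPv4 header without options
UDP_HEADER = 8
GENEVE_BASE_HEADER = 8   # fixed part of the Geneve header
INNER_ETHERNET = 14      # inner frame header, carried as payload


def required_underlay_mtu(inner_mtu: int = 1500, geneve_options: int = 0) -> int:
    """IP MTU the transport VLAN must carry for a given inner MTU."""
    return (inner_mtu + INNER_ETHERNET + GENEVE_BASE_HEADER
            + geneve_options + UDP_HEADER + OUTER_IPV4_HEADER)


def fits(underlay_mtu: int, inner_mtu: int = 1500, geneve_options: int = 0) -> bool:
    """True when encapsulated frames pass without fragmentation."""
    return underlay_mtu >= required_underlay_mtu(inner_mtu, geneve_options)


print(required_underlay_mtu())                   # no Geneve options
print(required_underlay_mtu(geneve_options=32))  # assumed 32 bytes of options
print(fits(1500), fits(1600, geneve_options=32))
```

A router interface left at the default 1500 fails the check even with no options at all, which matches the kind of misconfiguration described above.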

After upgrading my lab to NSX-T 3.1 I was eager to test it.

Here are the steps I used to migrate to single transport VLAN:

  1. The collapsed Edge Nodes will need to use trunk uplinks created as NSX-T logical segment. My Edge Nodes used regular VDS port group so I renamed the old ones in vCenter and created new trunks in NSX-T Manager.
  2. (Optional) Create new TEP IP Address Pool for the Edges. You can obviously use the ESXi host IP Pool as now they will share the same subnet, or you can use static IP addressing. I opted for new IP Address Pool with the same subnet as my ESXi host TEP IP Address Pool but a different range so I can easily distinguish host and edge TEP IPs.
  3. Create new Edge Uplink Profile VLAN to match the ESXi transport VLAN.
  4. Now for each Edge node repeat this process: edit the node in the Edge Transport Node Overview tab, change its Uplink Profile, IP Pool and uplinks to the created ones in steps #1, #2 and #3. Refresh and observe the Tunnel health.
  5. Clean up now unused Uplink Profile, IP Pool and VDS uplinks.
  6. Deprovision now unused Edge Transport VLAN from physical switches and from the physical router interface.

During the migration I saw one or two pings drop, but that was it. If you see tunnel issues, try putting the edge node briefly into NSX Maintenance Mode.

Quotas and Quota Policies in VMware Cloud Director

In this article I want to highlight a new neat feature in VMware Cloud Director 10.2 – the ability to assign quotas and create quota policies.

This can be done at multiple levels, by either the service provider or the organization administrator.

The following resources today can be managed via quotas:

  • Memory
  • CPU
  • Storage
  • All VMs (includes vApp template VMs)
  • Running VMs
  • TKG Clusters

The list might expand in the future so you can easily find what quota capabilities are available via API.

The service provider can create quotas at the organization level in the Organization > Configure > Quotas section:

The org administrator can assign quota to individual users or groups. This is done from the Administration > Access Control > User or Group > Set Quota section.
The assignment of a quota at the group level is inherited by each group user (so it is not enforced at the aggregate group level) but can be overridden at the individual user quota level. Also if a user is member of multiple groups the least restrictive combination of participating group quotas will be applied to her.
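
The inheritance rules can be sketched in Python as follows. This is my reading of the behavior and not VCD code: a resource that a group does not limit is treated as unlimited (None), group quotas are combined by taking the least restrictive value per resource, and a quota set directly on the user overrides the result. The resource names are hypothetical.

```python
from typing import Dict, List, Optional

Quota = Dict[str, Optional[int]]  # None means unlimited


def least_restrictive(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Pick the looser of two limits; unlimited always wins."""
    if a is None or b is None:
        return None
    return max(a, b)


def effective_quota(group_quotas: List[Quota],
                    user_quota: Optional[Quota] = None) -> Quota:
    """Combine the quotas of all groups a user is in, then apply overrides."""
    resources = set()
    for quota in group_quotas:
        resources.update(quota)
    combined: Quota = {}
    for resource in resources:
        limit = group_quotas[0].get(resource)
        for quota in group_quotas[1:]:
            limit = least_restrictive(limit, quota.get(resource))
        combined[resource] = limit
    combined.update(user_quota or {})  # user level quota overrides groups
    return combined


groups = [{"running_vms": 5, "storage_gb": 100}, {"running_vms": 10}]
print(effective_quota(groups))
print(effective_quota(groups, {"running_vms": 3}))
```

Here the second group does not limit storage, so the combined storage quota is unlimited, while the running VM limit becomes 10 unless the user has their own limit of 3.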

At the same place the user or org admin can see the actual user’s usage compared to the quota.

Org admins can use quotas to easily control good behavior of org users (not running too many VMs concurrently, not consuming too much storage, etc.), while system admins can set safety quotas at org level when using Org VDC allocation models with unlimited consumption with Pay per use billing.

One hidden feature available only via API is the ability to create more generic quota policies that can combine (pool) multiple quota elements and use those to assign them to organizations, groups or individual users. Think of quota policy: Power User vs Regular User, where the former can power on more VMs.

When a specific quota is assigned at the user/group/org object, quota policy is created in the backend anyway but is specific just to the one object, while edit of Power User quota policy would be applied to every user that has such quota policy.

The feature comes with new specific rights, so it can be easily enabled or disabled:

  • Organization: Manage Quotas of Organization
  • Organization: Edit Quotas Policy
  • General: View Quota Policy Capabilities
  • General: Manage Quota Policy
  • General: View Quota Policy