vCloud Director 9.5 and VMware Identity Manager Integration

About six months ago I blogged about VMware Identity Manager (VIDM) federation with vCloud Director. That article is still fully valid (and start there if you have not read it yet), however with the introduction of the new tenant HTML 5 user interface I want to describe how you can now chose which UI (legacy or new HTML5) the user will be redirected to.

When a vCloud Director organization is federated with an external IdP there are two different workflows for the login process:

  • In the first workflow the user goes to vCloud Director URL and is redirected to the external IdP to authenticate. After the authentication the user is redirected back to vCloud Director. Now depending on which URL the user initially used, she will be redirected to legacy UI (https://vcloud.example.com/cloud/org/coke) or HTML 5 UI (https://vcloud.example.com/tenant/coke).
  • In the second workflow, the user authenticates to the external IdP first and then is presented with catalog of federated apps and accessible through Single SignOn. Below is an example of VMware Workspace One catalog.

    Clicking a tile with of an app will redirect and sign-in the user directly to the particular app.

The VIDM integration as described in the previous post will however always redirect the user to the legacy UI. So how to force the usage of the new HTML 5 UI?

The is done by adding the Relay State URL to the config of the Web App in VIDM. The tricky part is that (at least as of version 9.5) vCloud Director expects the parameter to be Base64 encoded.

So in my example, the HTML 5 URL for the particular organization I want the user to be redirected to is: https://vcloud.fojta.com/tenant/coke which is Base64 encoded to: aHR0cHM6Ly92Y2xvdWQuZm9qdGEuY29tL3RlbmFudC9jb2tl and that is what must be entered in the Relay State URL field.

I can now create two Web App tiles for the user, so she can choose to which UI to go.

How To Disable Local System Administrator Accounts in vCloud Director

For some time there has been a hidden security feature in vCloud Director that allows disabling local system administrator accounts. During vCloud Director installation a default local system administrator account is created. The user credentials are stored encrypted in the vCloud Director database but there is no way to enforce complex password policies other than Account Lockout Policy.

It is possible to configure external identity sources such us generic LDAP for basic authentication and SAML2 IdP (such as vCenter SSO). The authentication and thus also the password policies are than managed externally. However, when you try to delete or disable all local system administrator accounts you will get the following error:

Cannot delete or deactivate the last system administrator.

This is a built in protection against completely locking yourself out when the external identity sources are not available.

Some customers can have the need to enforce strict security rules on all vCloud Director system administrator logins. There is a non-documented way to disable all local system administrator accounts with a single command. The system administrator can run the following cell-management-tool  command to enable config property local.sysadmin.disabled.

$VCLOUD_HOME/bin/cell-management-tool manage-config -n local.sysadmin.disabled -v true

Immediately after the property is enabled, authentication with local accounts will stop working. It means authentication for all local system administrator accounts that exist in vCloud Director (not just the default account created during installation) will be rejected. Organization local accounts will not be affected.

In case access to external IdPs is lost, the system admin can again disable the property to regain access to vCloud Director:

$VCLOUD_HOME/bin/cell-management-tool manage-config -n local.sysadmin.disabled -v false

vCloud Director 9: SAML2 Federation for System Administrators

In the past in vCloud Director 8.20 (and older versions) system admins (the provider context) could use local, LDAP and vSphere SSO accounts. vCloud Director 9.0 now replaces vSphere SSO accounts with more generic SAML2 accounts which means you can have the same IdP mechanism in the tenant and system context.

This change however breaks the previous vSphere SSO federation which was as simple as entering the vSphere Lookup Service URL and enabling the vSphere Single Sign-On with a check box (which in vCloud Director 9.0 is no longer there).

Here is the procedure how to enable vSphere Single Sign-On in vCloud Director 9.0.

  1. Login to vCloud Director as system admin and from administration>System Settings/Federation download the metadata document (spring_saml_metadata.xml) from the link provided (../cloud/org/System/saml/metadata/alias/vcd). Make sure the certificate (below) is valid.
  2. Login to vSphere Web Client as SSO admin and go to Administration/Single Sign-On/Configuration/SAML Service Providers
  3. Import the metadata from step #1
  4. Download the vsphere.local.xml metadata from the link below.
  5. Go back to VCD, check use SAML Identity Provider and upload metadata from #4.

Note that Import Users/Group source now changes from vSphere SSO to SAML.

vCloud Director 8.20
vCloud Director 9.0

 

Update 6-27-2018

Some additional notes about issues you might experience in order to get proper functionality of vCenter SSO federation:

  • Make sure that Public Addresses section contains correct FQDN of vCloud Director (pointing to the VIP of the load balancer)
  • Also make sure that the full certificate chain is uploaded as well (cert+intermediate+root)

  • Make sure vCloud Director is registered in vCenter SSO Lookup Service (Federation section – vSphere Services)
  • If you change vCloud Director public name or certificate, re-register vCloud Director to Lookup Service
  • If you change vCloud Director public name you must regenerate federation certificate by clicking Regenerate button to update endpoint addresses in the Metadata document.
  • The federation certificate has 1 year duration. You can use vCloud API to upload your own certificate with extended duration (PUT /admin/org/{id}/settings/federation)

Update 2/22/2021

vSphere 7 does not have the UI to export IdP and import service provider SAML metadata. You can still use vCenter CLI to achieve the same:

/opt/vmware/bin/sso-config.sh -get_sso_saml2_metadata -t vsphere.local /tmp/vsphere.local.xml
/opt/vmware/bin/sso-config.sh -register_sp -t vsphere.local /tmp/spring_saml_metadata.xml

Load Balancing HA vCenter Single Sign-On with NSX

NSX LBOne of the deployment options for vCenter Single Sign-On 5.5 (SSO) is high availability mode. It usually consists of two load balanced SSO nodes deployed in single site configuration. It is quite complex to set up and manage therefore I usually advise customers to avoid such configuration and instead co-deploy SSO together with vCenter Server on the same virtual machine – this results in the same availability of vCenter Service and SSO.

However there are reasons when you cannot do this and need to deploy highly available SSO instance. One is if you want to have multiple vCenter Servers in the same SSO domain with single pane of glass Web Client management. Another is vRealize Automation (vRA) deployment which also requires SSO.

VMware has published two whitepapers about the topic. The first VMware vCenter Server 5.5 Deploying a Centralized VMware vCenter Single Sign-On Server with a Network Load Balancer unfortunately unnecessarily adds even more complexity to the whole process. The paper also describes actve – active load balancing of the nodes which is however unsupported configuration (see here). While active – active load balancing might work with vCenter Server services it does not work with vRealize Automation (vCAC). This is due to the tokens used for solution authentication – WS Trust tokens are stateless but WebSSO are not. Also from what I heard vSphere 6 will not work in active – active configuration at all.

The second whitepaper Using VMware vCenter SSO 5.5 with VMware vCloud Automation Center 6.1 is more recent and while you see vCAC/vRA in its title it still very much applies for pure vSphere environments as well (skip the vRA specific chapters) and it is the one I would recommend. It also describes Active – Passive configuration of F5 Load Balancer.

The topic of this article is however usage of NSX load balancer instead of F5. Contrary to vCNS load balancer, NSX can be configured in Active – Passive mode and thus you can create supported HA SSO configuration with pure VMware solutions.

I will not go too deep in the SSO specific configurations in HA setup (did I mentioned it is complex?) as it is very well described in the second whitepaper mentioned above – instead I will focus on the NSX part of the configurations.

The architecture is like this: two SSO nodes with dedicated NSX load balancer in proxy – on a stick mode. This means LB is not inline of the traffic but instead has only 1 interface and SNAT and DNATs the traffic to the nodes. While inline transparent mode configuration is also possible I believe on a stick config is simpler and provides better resiliency (dedicated LB appliance for each application).

Here are the steps for NSX load balancer configuration:

  1. Deploy Edge Service Gateway for the Load Balancer with one interface preferably in the same subnet as SSO nodes.
  2. Enable Load Balancer feature
    NSX_LB
  3. Upload CA certificate and SSO certificate. See the second whitepaper on how to create SSO certificate.
    Certificates
  4. Configure service monitoring. While you could use the default TCP healh check, I prefer custom HTTPS type healthcheck which is monitoring /lookupservice URL.Service Monitoring
  5. Create Application Profile. During the SSO node configuration before the custom certificates are exchanged on each node you would use simple TCP profile or perhaps SSL passthrough profile (as the SSL certificate configured in NSX would not match self-signed certificate on the nodes). Another alternative is to edit /etc/hosts on each SSO node to fake the VIP hostname to point to the node (this is described in the first white paper). Once you replace the certificates on the nodes you can use SSL termination on the load balancer, configure VIP certificate and Pool Side certificate and also enable Insert X-Forwarded-For HTTP header so in theory we would see from where the authentication request is coming from (unfortunately SSO access log does not display the information). Application Profile
  6. Create Application Rule. Here we will define the logic that will perform the active – passive load balancing. Each SSO node will be in separate pool, with the primary node set up as default. ACL rule is defined to see if the primary node is up. If not we will switch the backend pool to the secondary node. The pool names must match the ones we will create in the next step.

    # detect if pool “SSO_primary” is still UP
    acl SSO_primary_down nbsrv(SSO_primary) eq 0
    # use pool “SSO_secondary” if “SSO_primary” is dead
    use_backend SSO_secondary if SSO_primary_down
    Applicaiton Rule

  7. Create SSO_primary and SSO_secondary pools. Each will have one SSO node with the healthcheck from step 4 and ports 7444. Notice that I have defined the pool member as vCenter VM container object so NSX will retrieve it’s IP address dynamically via VM Tools. While I could hardcode the node IP address this is nice showcase of NSX – vCenter integration. If inline mode you would check the Transparent checkbox for each pool.
    Pool
  8. Now we can create virtual server. We will select Application Profile from step 5, Default Pool from step 7, in the Advanced Tab Application Rule from step 6. For VIP I used the LB default IP (from step 1) and HTTPS 7444 port.
    Virtual Server
  9. As a last step do not forget to disable firewall or create firewall rule for the IP and port define in the previous step.

vCloud Director – vCenter Single Sign-On Integration Troubleshooting

The access to vCloud Director provider context (system administrator accounts) can be federated with vCenter Single Sign-On. This means that when vCloud system administrator wants to log into the vCloud Director (http://vCloud-FQDN/cloud) he is redirected to vSphere Web Client where he needs to authenticate and then redirected back to vCloud Director.

VCD_SSO_Federation

Here follows collection of topics that might be useful when dealing when troubleshooting the federation:

1. When the SSO federation is enabled the end-user is always redirected to vSphere Web client. If you want to use local authentication use http://vCloud-FQDN/cloud/login.jsp and type local or LDAP account (if LDAP is configured).

2. If you enabled SSO federation and the vCenter SSO is no longer available, you cannot unregister its lookup service. To do this go to vCloud database and in dbo.config table clear the lookupservice.url value.

3. In case you are using self-signed untrusted certificate on the vSphere web client some browsers (Firefox) might not display the Add Security Exception button when being redirected. Therefore open first the vSphere web client page directly, create the security exception and then the redirection from vCloud website should work.

4. HTTP ERROR 500. Problem accessing /cloud/saml/HoKSSO/alias/vcd. Reason: Error determining metadata contracts
….
Metadata for issuer https://xxx:7444/STS wasn’t found

This error might appear after the vSphere web client SSO authentication when the user is redirected back to vCloud portal. To understand this error let’s first talk about what is going on in the background. When the SSO federation is enabled, vCloud Director establishes trust with vCenter SSO. The trust is needed so the identity provider (SSO) knows that the request for authentication is not malicious (phishing) and also the service provider (vCloud Director) needs to be sure the reply comes from the right identity provider.

The trust is established when the federation is configured with metadata exchange that contains keys and some information about the other party. The SSO metadata can be seen in vCloud Director database in the dbo.saml_id_provider_settings table. Now during the actual authentication process if for some reason the security token reply comes from different source than the one expected based on the identity provider metadata, you will get this error.

This issue might happen for example if the vCenter SSO hostname has been changed. In this particular case which I encountered it happened on vCenter Server Appliance 5.1. The SSO has been initiated before the hostname was set. So the identity provider response came with metadata containing the issuer’s IP address instead FQDN which the service provider (VCD) expected based on the SSO endpoint address. The issuer information did not get updated after the hostname change.

The VCSA 5.1 the issuer information is stored in SSO PostgreSQL DB. These are the steps to change it:

  1. Get the SSO DB instance name, user and password

cat /usr/lib/vmware-sso/webapps/lookupservice/WEB-INF/classes/config.properties

db.type=postgres
db.url=jdbc:postgresql://localhost:5432/ssodb
db.user=ssod
db.pass=rTjsz9PRPLOFswBkJTCNAAVp
`

  1. Connect to the DB with the password retrieved from the previous step (db.pass=…)

/opt/vmware/vpostgres/1.0/bin/psql ssodb -U ssod

Retrieve the STS issuer:

ssodb=> select issuer from ims_sts_config;

If the issuer is really incorrect update it with following command:

ssodb=> update ims_sts_config SET issuer='https://FQDN:7444/STS'

Note: I need to credit William Lam for helping me where to find the SSO DB password.