SSO for vCloud Availability Portal UI

This is a quick follow-up to yesterday’s blog post, which discussed how to customize the vCloud Director UI with additional links. vCloud Availability has a separate Portal UI where users can monitor the status of their replications and optionally trigger failover operations. Wouldn’t it be nice if the link in the vCloud Director UI automatically signed the user in to the vCloud Availability Portal UI?

A quick chat with the engineers showed that this is indeed possible by leveraging the {vcdSession} variable, which provides the vCloud Director session token. The URL in the link must then look like this:

https://<vCloud_Availability_Portal_UI_FQDN>:8443/login?token={vcdSession}

In my case the CMT command for the whole link would look like this:

./cell-management-tool manage-config -n ui.tenant.customOrgLinks -v "
# vCloud Availability
[Monitor Replications](https://vcloud.fojta.com:8443/login?token={vcdSession})"
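If you maintain links for more than one portal, it can help to assemble the value in a small script first. The following is only a minimal sketch: the sso_link helper and the PORTAL_FQDN value are made up for illustration, and the script prints the CMT command for review instead of running it.

```shell
# Sketch only: builds the SSO link and prints the command instead of running it.
# sso_link and PORTAL_FQDN are illustrative names, not part of any VMware tool.

# Emit one link that carries the vCloud Director session token.
sso_link() {
  local label="$1" fqdn="$2" port="${3:-8443}"
  printf '[%s](https://%s:%s/login?token={vcdSession})' "$label" "$fqdn" "$port"
}

PORTAL_FQDN="vcloud.example.com"
value=$(printf '\n# vCloud Availability\n%s' "$(sso_link 'Monitor Replications' "$PORTAL_FQDN")")

# Print the command for review; run it manually on a cell once it looks right.
printf './cell-management-tool manage-config -n ui.tenant.customOrgLinks -v "%s"\n' "$value"
```

The {vcdSession} placeholder is deliberately left untouched in the output, since vCloud Director substitutes it when the link is rendered.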

And this is the end result:

Clicking the Monitor Replications link (red box above) opens the vCloud Availability Portal in a new browser tab with the tenant already signed in (below).

How to Customize vCloud Director UI

Service providers who offer additional services beyond vanilla vCloud Director IaaS have been asking how to add links to those services in the existing (Flex) vCloud Director UI.

vCloud Director 8.20 provides a very simple way to extend the right column of the Home screen with additional sections and static links. This is basic extensibility and should be treated as an interim solution until the new, more extensible HTML5 UI fully replaces the existing one.

In the screenshot below you can see that the right section has been extended with vCloud Availability, Backup and Other sections.

The configuration of these links is very simple and is done with cell-management-tool on any vCloud Director cell.

In my example I used:

./cell-management-tool manage-config -n ui.tenant.customOrgLinks -v "
# vCloud Availability
[Monitor Replications](https://vcloud.fojta.com:8443)
# Backup
[Configure Backup](https://backup.fojta.com)
# Other
[Request Support](https://help.fojta.com)
[Impressum](https://www.fojta.com/impressum)"

Where # denotes the section header, [] the link name and () the link.
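Because a typo in the value only shows up later in the UI, a quick sanity check before running the command can save a round trip. The sketch below is my own helper, not part of cell-management-tool, and it checks only the three line types described above; the actual parser in vCloud Director may be stricter or more lenient.

```shell
# Sketch: sanity-check a customOrgLinks value before handing it to manage-config.
# validate_links is a hypothetical helper; it mirrors only the format described above.
validate_links() {
  local line rc=0
  # Read stdin line by line; "|| [ -n ... ]" keeps a last line that has no newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      '') ;;                      # blank lines are ignored
      '#'?*) ;;                   # section header
      '['?*']('?*')') ;;          # [link name](link)
      *) printf 'unexpected line: %s\n' "$line" >&2; rc=1 ;;
    esac
  done
  return "$rc"
}

validate_links <<'EOF' && echo "value looks well-formed"
# Backup
[Configure Backup](https://backup.fojta.com)
EOF
```

The function returns a non-zero status as soon as any line matches none of the three patterns, so it can gate the real command in a script.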

It is also possible to pass the vCloud Director session ID as a parameter in the URL by including the {vcdSession} string.

The CMT manage-config command creates or modifies a database entry in the config table tenant-customOrgLinks with the value provided in the quotes. Re-running it replaces the previous entry. The change takes effect immediately; there is no need to run the command on other cells or to restart the vCloud Director services.

One last note: the right column of the Home screen is not visible to all user roles. The role needs to have the General > Administrator Control right.

Architecting a VMware vCloud Availability for vCloud Director Solution

Another vCloud Architecture Toolkit whitepaper that I authored has been published on the vCAT-SP website. It discusses how to architect a vCloud Availability solution for large production scenarios.

It is based on real-life deployments and includes the following chapters:

  • Introduction
  • Use Cases
    • Disaster Recovery
    • Migration
  • vCloud Availability Architecture Design Overview
    • vCloud Availability Architecture
    • Network Flows
    • Conceptual Architecture
  • vCloud Availability Management Components
    • Logical Architecture
    • vCloud Availability Portal
    • Cloud Proxy
    • RabbitMQ
    • Cassandra Database
    • VMware Platform Services Controller
    • vSphere Replication Cloud Service
    • vSphere Replication Manager
    • vSphere Replication Servers
    • ESXi Hosts
    • vCloud Availability Metering
    • vRealize Orchestrator
    • Management Component Resiliency Considerations
  • vCloud Director Configuration
    • User Roles
    • Tenant Limits and Leases
    • Organization Virtual Data Center
    • Network Management
    • Storage Management
    • vApps and Virtual Machines
  • Billing
  • vRealize Orchestrator Configuration
    • On-Premises Deployment
    • In-the-Cloud Deployment
    • Provider Deployment
    • Failover Orchestration
  • Monitoring
    • Component Monitoring
    • VM Replication Monitoring
    • Backup Strategy
  • Appendix A – Port Requirements / Firewall Rules
  • Appendix B – Glossary
  • Appendix C – Maximums
  • Appendix D – Reference Documents
  • Appendix E – Tenant API Structure
  • Appendix F – Undocumented HybridSettings vCloud API
  • Appendix G – Monitoring

Download from the vCAT-SP website: https://www.vmware.com/solutions/cloud-computing/vcat-sp.html or direct link to pdf.

vCloud Director 8.20: Orchestrated Upgrade

The vCloud Director architecture consists of multiple cells that share a common database. The upgrade process involves shutting down the services on all cells, upgrading the cells, upgrading the database and starting the cells again. In large environments with three or more cells this can be quite labor intensive.

vCloud Director 8.20 brings a new feature: an orchestrated upgrade. All cells and the vCloud Director database can be upgraded with a single command from the primary cell VM. This brings two advantages. Simplicity: it is no longer necessary to log in to each cell VM, upload the binaries and run the upgrade manually. Availability: downtime during the upgrade maintenance window is reduced.

Prerequisites

Set up ssh private key login from the primary cell to all other cells in the vCloud Director instance for user vcloud.

  1. On the primary cell, generate a private/public key pair (with no passphrase):

    ssh-keygen -t rsa -f $VCLOUD_HOME/etc/id_rsa
    chown vcloud:vcloud $VCLOUD_HOME/etc/id_rsa
    chmod 600 $VCLOUD_HOME/etc/id_rsa

  2. Copy the public key to the authorized_keys file on each additional cell in the instance. This can be done with a one-line command run from the primary cell, or with ssh-copy-id. Use the IP/FQDN that the cell is registered with in vCloud Director.

    cat $VCLOUD_HOME/etc/id_rsa.pub | ssh root@<cell-IP> "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"

  3. Verify that login with the private key works for each secondary cell in the environment:

    sudo -u vcloud ssh -i $VCLOUD_HOME/etc/id_rsa root@<cell IP/FQDN>
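With more than a couple of cells, steps 2 and 3 are easier to handle in a loop. Here is a minimal sketch under a few assumptions: the CELLS list, the run helper and the DRY_RUN switch are my own placeholders, and the default VCLOUD_HOME is the standard /opt/vmware/vcloud-director path. In dry-run mode (the default here) the commands are only printed.

```shell
# Sketch: print (or run) the key distribution and verification for each cell.
# CELLS is a placeholder list; use the IP/FQDN each cell is registered with.
# DRY_RUN=1 (the default here) only echoes the commands.
VCLOUD_HOME="${VCLOUD_HOME:-/opt/vmware/vcloud-director}"
CELLS="${CELLS:-cell2.example.com cell3.example.com}"
DRY_RUN="${DRY_RUN:-1}"

# Either print the command line or execute it, depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

distribute_key() {
  local cell
  for cell in $CELLS; do
    # Append the public key to authorized_keys on the remote cell.
    run ssh-copy-id -i "$VCLOUD_HOME/etc/id_rsa.pub" "root@$cell"
    # Confirm that the vcloud user can log in without a password prompt.
    run sudo -u vcloud ssh -i "$VCLOUD_HOME/etc/id_rsa" -o BatchMode=yes "root@$cell" true
  done
}

distribute_key
```

BatchMode=yes makes ssh fail instead of prompting for a password, so a broken key setup is visible immediately rather than hanging the loop.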

Multi-cell Installation

Upload the vCloud Director binary to the primary cell and make it executable. Execute the file with the --private-key-path option pointing to the private key.

/root/vmware-vcloud-director-distribution-8.20.0-5070903.bin --private-key-path $VCLOUD_HOME/etc/id_rsa

Optionally, a maintenance cell can be specified with the --maintenance-cell option.

For troubleshooting, the upgrade log is located on the primary cell in $VCLOUD_HOME/logs/upgrade-<date and time>.log

For no-prompt execution, add the --unattended-upgrade option.
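To keep the optional switches straight, the command line can be composed from a few settings. This is only a sketch: build_upgrade_cmd is a hypothetical helper, and I am assuming that the maintenance cell option takes the cell address as its argument. The function prints the command rather than executing the installer.

```shell
# Sketch: compose the orchestrated-upgrade command line from optional settings.
# build_upgrade_cmd is illustrative; the argument form of the maintenance cell
# option (an address after the switch) is an assumption.
build_upgrade_cmd() {
  local installer="$1" key="$2" maint="${3:-}" unattended="${4:-0}"
  local cmd="$installer --private-key-path $key"
  if [ -n "$maint" ]; then
    cmd="$cmd --maintenance-cell $maint"
  fi
  if [ "$unattended" = "1" ]; then
    cmd="$cmd --unattended-upgrade"
  fi
  printf '%s\n' "$cmd"
}

# Example: installer from /root, key from the prerequisites, no maintenance cell, unattended.
build_upgrade_cmd /root/vmware-vcloud-director-distribution-8.20.0-5070903.bin \
  /opt/vmware/vcloud-director/etc/id_rsa "" 1
```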

Workflow

This is the workflow that is automatically executed:

  1. Quiesce, shut down and upgrade the primary cell. The cell is not started.
  2. If a maintenance cell was specified, it is put into maintenance mode.
  3. Quiesce and shut down all the other cells.
  4. Upgrade the vCloud Director database (with a prompt for backup).
  5. Upgrade and start all the other cells (except the maintenance cell).
  6. If a maintenance cell was specified, it is upgraded and started.
  7. Start the primary cell.

What is the difference between a quiesced cell and a cell in the maintenance mode?

Quiesced cell:

  • finishes existing long-running operations
  • answers new requests and queues them
  • does not dequeue any operations (they stay in the queue)
  • VC listener keeps running
  • Console proxy keeps running

Cell in maintenance mode:

  • waits for long-running operations to finish but fails all queued operations
  • answers most requests with HTTP error code 504 (unavailable)
  • still issues auth tokens for /api/sessions login requests
  • No VC listener
  • No Console proxy

Interoperability with vCloud Availability

vCloud Availability uses Cloud Proxies to terminate replication tunnels from the internet. Cloud Proxies are essentially stripped down vCloud Director cells and are therefore treated as regular cells during the orchestrated upgrade.

A quiesced Cloud Proxy has no impact on replication operations and traffic. A Cloud Proxy in maintenance mode still preserves existing replications, but new replications cannot be established.

2/27/2017: Multiple edits based on feedback from engineering. Thank you Matthew Frost!

Monitoring vSphere Replication RPO Compliance

Just a quick post to show how you can monitor Recovery Point Objective (RPO) compliance of virtual machines protected with vSphere Replication.

Option 1: vCenter Server Alarm

When the vSphere Replication Appliance is registered with vCenter Server, multiple new vSphere Replication event types become available and can be used to create custom alarms.

A list of all these event types can be retrieved with the following one-line PowerCLI command:

(get-view eventManager).get_Description() | select -expand Eventinfo | where FullFormat -like "*Hms*"

The following example shows how to set an alarm for the event “RPO violated”.

Key: ExtendedEvent
Description: RPO violated
Category: error
FullFormat: com.vmware.vcHms.rpoViolatedEvent|Virtual machine vSphere Replication RPO is violated by [data.currentRpoViolation] minute(s)

  1. In vCenter Server go to Manage, Alarm Definitions and add a new alarm.
  2. Set the alarm name, and choose to monitor VMs and specific events.
    [screenshot: new-alarm]
  3. Enter the trigger (com.vmware.vcHms.rpoViolatedEvent).
    [screenshot: alarm-trigger]
  4. Add alarm actions (email, SNMP trap, run command, etc.) as necessary.

Triggered alarm:

[screenshot: triggered-alarm]

Note that this alarm applies only to VMs replicated from this particular vCenter Server; it is not triggered for VMs replicated to this vCenter Server.

Option 2: vCloud API

This option applies only to VM replications to or from a cloud provider that uses the vCloud Availability add-on. The vCloud Director tenant APIs are extended with replication APIs. The state of each replication can be retrieved with:

GET /api/vr/replications/<replication-id>

and

GET /api/vr/failbackreplications/<replication-id>

The list of all replications and their replication IDs is retrieved at the org level with these two API calls:

GET /api/org/<org-id>/replications

and

GET /api/org/<org-id>/failbackreplications
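The four calls above can be wrapped in a small helper for scripting. Treat the following as a sketch only: VCD_HOST, VCD_TOKEN and the function names are placeholders of mine, and I am assuming the usual vCloud API conventions (session token in the x-vcloud-authorization header, API version 27.0 in the Accept header). The helper prints the curl commands rather than calling the API.

```shell
# Sketch: print curl commands for the replication endpoints instead of running them.
# VCD_HOST, the API version in the Accept header and $VCD_TOKEN are assumptions.
VCD_HOST="${VCD_HOST:-vcd.example.com}"
VCD_ACCEPT='application/*+xml;version=27.0'

api_get() {
  # $1 is the path below /api; the token stays a shell variable in the output.
  printf 'curl -s -H "Accept: %s" -H "x-vcloud-authorization: $VCD_TOKEN" "https://%s/api/%s"\n' \
    "$VCD_ACCEPT" "$VCD_HOST" "$1"
}

list_replications() {   # $1: org id; lists replications and failback replications
  api_get "org/$1/replications"
  api_get "org/$1/failbackreplications"
}

replication_state() {   # $1: replications | failbackreplications, $2: replication id
  api_get "vr/$1/$2"
}

list_replications "<org-id>"
replication_state replications "<replication-id>"
```

Set VCD_TOKEN to the token returned by the /api/sessions login before pasting the printed commands into a shell.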

An example of VM1 replication state (RPO 15 mins, not active with 16 min RPO violation):

[screenshot: replication-api]

The following table describes all the elements of the API response:

[table image: replication-details]