The vCloud Director architecture consists of multiple cells that share a common database. The upgrade process involves shutting down services on all cells, upgrading them, upgrading the database, and starting the cells again. In large environments with three or more cells, this can be quite labor intensive.
vCloud Director 8.20 introduces a new feature: an orchestrated upgrade. All cells and the vCloud database can be upgraded with a single command from the primary cell VM. This brings two advantages. Simplicity: it is no longer necessary to log in to each cell VM, upload binaries, and run the upgrade manually. Availability: downtime during the upgrade maintenance window is reduced.
Prerequisites
Set up SSH private key login from the primary cell to all other cells in the vCloud Director instance for the user vcloud.
- On the primary cell, generate a private/public key pair (with no passphrase):
ssh-keygen -t rsa -f $VCLOUD_HOME/etc/id_rsa
chown vcloud:vcloud $VCLOUD_HOME/etc/id_rsa
chmod 600 /opt/vmware/vcloud-director/etc/id_rsa
- Copy the public key to the authorized_keys file on each additional cell in the instance. This can be done with the one-line command below, run from the primary cell, or with ssh-copy-id. Use the IP/FQDN with which the cell is registered in VCD:
cat $VCLOUD_HOME/etc/id_rsa.pub | ssh root@<cell-IP> "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"
- Verify that login with the private key works for each secondary cell in the environment:
sudo -u vcloud ssh -i $VCLOUD_HOME/etc/id_rsa root@<cell IP/FQDN>
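The copy and verify steps above have to be repeated for every secondary cell, which a small loop can do. The following is a minimal sketch and not part of the vCloud Director tooling: the function names run and distribute_key, the DRY_RUN switch, and the default value for VCLOUD_HOME are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: push the primary cell's public key to each secondary cell, then
# verify key-based login as the vcloud user.
# Assumptions: VCLOUD_HOME defaults to /opt/vmware/vcloud-director, and the
# cell names passed in are placeholders for your own IPs/FQDNs.

VCLOUD_HOME="${VCLOUD_HOME:-/opt/vmware/vcloud-director}"
KEY="$VCLOUD_HOME/etc/id_rsa"

# With DRY_RUN=1 the command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

distribute_key() {
    local cell
    for cell in "$@"; do
        # Append the public key to root's authorized_keys on the cell.
        run ssh-copy-id -i "$KEY.pub" "root@$cell"
        # Confirm that a non-interactive login with the private key works.
        run sudo -u vcloud ssh -i "$KEY" -o BatchMode=yes "root@$cell" true
    done
}
```

For example, DRY_RUN=1 distribute_key cell2.example.com cell3.example.com prints the commands without contacting any cell (the host names are placeholders). Without DRY_RUN the same call runs them, and ssh-copy-id asks for the root password of each cell once.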
Multi-cell Installation
Upload vCloud Director binary to the primary cell and make it executable. Execute the file with --private-key-path option pointing to the private key.
/root/vmware-vcloud-director-distribution-8.20.0-5070903.bin --private-key-path $VCLOUD_HOME/etc/id_rsa
Optionally a maintenance cell can be specified with --maintenance-cell option.
For troubleshooting, the upgrade log is located on the primary cell in $VCLOUD_HOME/logs/upgrade-<date and time>.log
For no-prompt execution you can add --unattended-upgrade option.
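Because the log file name contains the date and time, it can be handy to look up the most recent one automatically. A minimal sketch, assuming the logs match upgrade-*.log under $VCLOUD_HOME/logs as described above; the function name latest_upgrade_log and the fallback path are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: print the path of the newest upgrade log on the primary cell.
# Assumption: VCLOUD_HOME falls back to /opt/vmware/vcloud-director if unset.

latest_upgrade_log() {
    local log_dir="${1:-${VCLOUD_HOME:-/opt/vmware/vcloud-director}/logs}"
    # ls -t sorts by modification time, newest first; prints nothing if
    # no upgrade log exists yet.
    ls -t "$log_dir"/upgrade-*.log 2>/dev/null | head -n 1
}
```

While the upgrade is running, tail -f "$(latest_upgrade_log)" follows the current log from a second session.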
Workflow
This is the workflow that is automatically executed:
- Quiesce, shut down and upgrade the primary cell. The cell is not started yet.
- If a maintenance cell was specified, it is put into maintenance mode.
- Quiesce and shut down all the other cells.
- Upgrade the vCloud database (with a prompt for backup).
- Upgrade and start all other cells (except the maintenance cell).
- If a maintenance cell was specified, it is upgraded and started.
- Start the primary cell.
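To make the ordering concrete, the steps above can be written out for a given set of cells. The function below is purely illustrative: it only prints the sequence and does not touch any cell. The name upgrade_plan and its argument layout (primary cell, maintenance cell or an empty string, then the remaining cells) are assumptions for this sketch.

```shell
#!/usr/bin/env bash
# Illustrative only: prints the order of the orchestrated upgrade steps.
# Usage: upgrade_plan PRIMARY MAINTENANCE_CELL_OR_EMPTY OTHER_CELL...

upgrade_plan() {
    local primary="$1" maint="$2" cell
    shift 2
    echo "quiesce, shut down and upgrade $primary (left stopped)"
    if [ -n "$maint" ]; then
        echo "put $maint into maintenance mode"
    fi
    for cell in "$@"; do
        echo "quiesce and shut down $cell"
    done
    echo "upgrade database (backup prompt)"
    for cell in "$@"; do
        echo "upgrade and start $cell"
    done
    if [ -n "$maint" ]; then
        echo "upgrade and start $maint"
    fi
    echo "start $primary"
}
```

Calling upgrade_plan cell1 cell3 cell2 shows that the primary cell is upgraded first but started last, and that the maintenance cell (cell3 here) is the last of the secondary cells to be upgraded.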
What is the difference between a quiesced cell and a cell in the maintenance mode?
Quiesced cell:
- finishes existing long running operations
- answers new requests and queues them
- does not dequeue any operations (they will stay in the queue)
- VC listener keeps running
- Console proxy keeps running
Cell in maintenance mode:
- waits for long running operations to finish but fails all queued operations
- answers most requests with HTTP error code 503 (Service Unavailable)
- still issues auth token for /api/sessions login requests
- No VC listener
- No Console proxy
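From the outside, the visible difference is the HTTP status a cell returns. The sketch below probes a cell and interprets the code. It rests on several assumptions: /api/versions is used here as an unauthenticated probe URL, a quiesced cell is assumed to answer it with 200 like an active one, and a cell in maintenance mode is assumed to answer with 503 (Service Unavailable). The function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: probe a cell over HTTPS and interpret the status code.
# The mapping is a simplification; verify it against your own environment.

http_status() {
    # -k skips certificate validation (lab use only); only the code is printed.
    # curl reports 000 when no HTTP response was received at all.
    curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 "https://$1/api/versions"
}

describe_status() {
    case "$1" in
        200) echo "serving requests (active or quiesced)" ;;
        503) echo "maintenance mode" ;;
        000) echo "unreachable or shut down" ;;
        *)   echo "unexpected status $1" ;;
    esac
}
```

A call such as describe_status "$(http_status cell2.example.com)" (placeholder host name) can be run against each cell, bypassing the load balancer, to see which state it is in during the upgrade.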
Interoperability with vCloud Availability
vCloud Availability uses Cloud Proxies to terminate replication tunnels from the internet. Cloud Proxies are essentially stripped down vCloud Director cells and are therefore treated as regular cells during the orchestrated upgrade.
A quiesced Cloud Proxy has no impact on replication operations and traffic. A Cloud Proxy in maintenance mode still preserves existing replications; however, new replications cannot be established.
2/27/2017: Multiple edits based on feedback from engineering. Thank you Matthew Frost!
Hi Tom,
great article!
Might you please also describe the main purpose of the --maintenance-cell parameter?
Is it about the 503 response, faster Rollback or something different?
Best Regards,
Markus
The main reason is that it provides the 503 response. Additionally, a cell in this state can still be used to generate auth tokens for authentication (this might be useful for vCloud Availability Cloud Proxy cells).
Thanks Tomas!
Thank you. I was struggling to get the proper process for getting the ssh keys where they needed to be. This was helpful.