vCloud Director 8.20: Orchestrated Upgrade

The vCloud Director architecture consists of multiple cells that share a common database. The upgrade process involves shutting down services on all cells, upgrading them, upgrading the database and starting the cells. In large environments with three or more cells this can be quite labor intensive.

vCloud Director 8.20 brings a new feature: an orchestrated upgrade. All cells and the vCloud database can be upgraded with a single command from the primary cell VM. This brings two advantages. Simplicity: it is no longer necessary to log in to each cell VM, upload binaries and execute the upgrade process manually. Availability: downtime during the upgrade maintenance window is reduced.

Prerequisites

Set up ssh private key login from the primary cell to all other cells in the vCloud Director instance for user vcloud.

  1. On the primary cell generate a private/public key pair (with no passphrase):

    ssh-keygen -t rsa -f $VCLOUD_HOME/etc/id_rsa
    chown vcloud:vcloud $VCLOUD_HOME/etc/id_rsa
    chmod 600 $VCLOUD_HOME/etc/id_rsa

  2. Copy the public key to the authorized_keys file on each additional cell in the instance. This can be done with a one-line command run from the primary cell, or with ssh-copy-id. Use the IP/FQDN the cell is registered with in VCD:

    cat $VCLOUD_HOME/etc/id_rsa.pub | ssh root@<cell-IP> "mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys"

  3. Verify that login with private key works for each secondary cell in the environment

    sudo -u vcloud ssh -i $VCLOUD_HOME/etc/id_rsa root@<cell IP/FQDN>
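
With more than a couple of cells, steps 2 and 3 can be wrapped in a small loop. The following is only a sketch: the names run, distribute_key and DRY_RUN are my own and not part of vCloud Director, and the cell addresses are placeholders. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: copy the public key to each secondary cell and verify key login.
# run, distribute_key and DRY_RUN are hypothetical helper names.
VCLOUD_HOME="${VCLOUD_HOME:-/opt/vmware/vcloud-director}"
KEY="$VCLOUD_HOME/etc/id_rsa"

run() {
  # Print the command in dry-run mode (the default), otherwise execute it.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

distribute_key() {
  # distribute_key CELL [CELL...]: each CELL is an IP/FQDN of a secondary cell.
  local cell
  for cell in "$@"; do
    # ssh-copy-id appends the public key to root's authorized_keys on the cell.
    run ssh-copy-id -i "$KEY.pub" "root@$cell"
    # BatchMode makes ssh fail instead of prompting if key login does not work.
    run sudo -u vcloud ssh -i "$KEY" -o BatchMode=yes "root@$cell" true
  done
}
```

Running distribute_key 10.0.1.61 10.0.1.62 prints the four commands for review; setting DRY_RUN=0 executes them, and ssh-copy-id will then ask for the root password of each cell.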

Multi-cell Installation

Upload vCloud Director binary to the primary cell and make it executable. Execute the file with private-key-path option pointing to the private key.

/root/vmware-vcloud-director-distribution-8.20.0-5070903.bin --private-key-path $VCLOUD_HOME/etc/id_rsa

Optionally a maintenance cell can be specified with the --maintenance-cell option.

For troubleshooting, the upgrade log is located on the primary cell in $VCLOUD_HOME/logs/upgrade-<date and time>.log

For no-prompt execution you can add the --unattended-upgrade option.
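
To see how the three options combine, here is a small sketch that only assembles and prints the command line. The function name upgrade_cmd is hypothetical, and I am assuming that the maintenance cell option takes the cell address as its argument; check the installer help before relying on that.

```shell
#!/usr/bin/env bash
# Sketch: build the orchestrated-upgrade command line. Nothing is executed.
VCLOUD_HOME="${VCLOUD_HOME:-/opt/vmware/vcloud-director}"

upgrade_cmd() {
  # upgrade_cmd INSTALLER [MAINTENANCE_CELL] [unattended]
  local installer=$1 maintenance_cell=${2:-} unattended=${3:-}
  local cmd=("$installer" --private-key-path "$VCLOUD_HOME/etc/id_rsa")
  # Assumption: --maintenance-cell takes the cell address as its argument.
  [ -n "$maintenance_cell" ] && cmd+=(--maintenance-cell "$maintenance_cell")
  [ "$unattended" = "unattended" ] && cmd+=(--unattended-upgrade)
  printf '%s\n' "${cmd[*]}"
}
```

For example, upgrade_cmd /root/vmware-vcloud-director-distribution-8.20.0-5070903.bin 10.0.1.62 unattended prints the full command with both optional flags appended, where 10.0.1.62 is a placeholder address.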

Workflow

This is the workflow that is automatically executed:

  1. Quiesce, shutdown and upgrade of the primary cell. Does not start the cell.
  2. If maintenance cell was specified, it is put into maintenance mode.
  3. Quiescing and shut down of all the other cells.
  4. Upgrade of the vCloud Database (a prompt for backup)
  5. Upgrade and start of all other cells (except the maintenance cell)
  6. If maintenance cell was specified, it is upgraded and started.
  7. Start of the primary cell

What is the difference between a quiesced cell and a cell in the maintenance mode?

Quiesced cell:

  • finishes existing long running operations
  • answers new requests and queues them
  • does not dequeue any operations (they will stay in the queue)
  • VC listener keeps running
  • Console proxy keeps running

Cell in maintenance mode:

  • waits for long running operations to finish but fails all queued operations
  • answers most requests with HTTP Error code 504 (unavailable)
  • still issues auth token for /api/sessions login requests
  • No VC listener
  • No Console proxy

Interoperability with vCloud Availability

vCloud Availability uses Cloud Proxies to terminate replication tunnels from the internet. Cloud Proxies are essentially stripped down vCloud Director cells and are therefore treated as regular cells during the orchestrated upgrade.

A quiesced Cloud Proxy has no impact on replication operations and traffic. A Cloud Proxy in maintenance mode still preserves existing replications, but new replications cannot be established.

2/27/2017: Multiple edits based on feedback from engineering. Thank you Matthew Frost!

Collect vCloud Director Cell Logs with Log Insight Agent

While it is possible to redirect vCloud Director cell logs to a remote syslog server by editing the log4j.properties file (see KB 2004564), there is an alternative agent-based method utilizing vRealize Log Insight.

Log Insight agent is installed on each cell and then remotely managed from Log Insight server. Here are some advantages of this approach:

  • no manual edits of log4j file which gets overwritten with each upgrade
  • as we do not rely on the log4j logger, we are also able to collect API request log files, which are generated by Jetty
  • agent uses reliable TCP communication as opposed to unreliable UDP
  • we no longer rely on source IP to identify sender; cells can use source NAT (with single IP) to communicate with Log Insight server and we can still distinguish them
  • we can remotely change which logs we want to monitor (info vs debug)
  • and much more

Here is a quick how-to for the configuration:

  1. Download the Log Insight Agent from the Log Insight server. The installation package is already customized for your vRLI server. Administration > Agents > scroll down > Download Log Insight Agent Version 3.6.0 > pick rpm package
  2. Upload the rpm file to each cell and install it with rpm -i VMware-Log-Insight-Agent-3.6.0-4148343.noarch_XXX.rpm
  3. Back in the Agents configuration, create an active agent group from the vCloud Director Cell Server template (copy template icon)
  4. Create a hostname filter (use ? for any character substitution; you can add multiple entries in one line for ‘logical or’ or multiple lines for ‘logical and’)
  5. Optionally edit agent configuration to include additional files or directories

Edit 11/30/2017:
Example of Agent Configuration based on vCloud Director 9.0

[filelog|vcd-essential]
directory=/opt/vmware/vcloud-director/logs
include=vcloud-container-debug*;upgrade*;vmware-vcd-support*;watchdog*;console-proxy*;statsfeeder*;server-group-communication*;cell-runtime*
event_marker=(\d{2}|\d{4})-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s
tags={"vmw_product":"vcd"}

[filelog|vcd-API]
directory=/opt/vmware/vcloud-director/logs
include=*request.log*
event_marker=\b(?:\d{1,3}\.){3}\d{1,3}\b
tags={"vmw_product":"vcd"}
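
Before pushing an agent configuration it can help to check which lines the event_marker pattern treats as the start of a new event. The sketch below does that for the first pattern above. It assumes GNU grep with PCRE support (the -P option), and the log lines are made up for illustration, not taken from a real cell.

```shell
#!/usr/bin/env bash
# Sketch: count the lines that start a new event according to event_marker.
# Requires GNU grep built with PCRE support (-P).
EVENT_MARKER='(\d{2}|\d{4})-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s'

count_events() {
  # Reads log text on stdin and prints the number of matching lines.
  grep -cP "$EVENT_MARKER"
}

sample_log() {
  # Made-up sample: two timestamped events, the first with a continuation line.
  printf '%s\n' \
    '2017-11-30 10:15:42,123 | ERROR | Example failure' \
    '    at com.example.Hypothetical.method(Hypothetical.java:1)' \
    '2017-11-30 10:15:43,001 | INFO  | Example message'
}
```

Piping sample_log into count_events counts the two timestamped lines and skips the indented continuation line, which the agent would then attach to the preceding event.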

Edit 2/2/2021
Example for VMware Cloud Director 10.2 appliance

[filelog|vcd]
directory=/opt/vmware/vcloud-director/logs
include=vcloud-container-debug*;upgrade*;vmware-vcd-support*;watchdog*;vcloud-container-info*;cell*;request.log*;cell-management-tool.log*;cell-runtime.log*;cell.log*;cloud-proxy.log*;queries*;networking.log*;server-group-communications.log*;upgrade-*;service-wiring.log*;statsfeeder.log*;networking-wire.log*
event_marker=(\d{2}|\d{4})-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s
tags={"vmw_product":"vcd"}

[filelog|vcd-API]
directory=/opt/vmware/vcloud-director/logs
include=*request.log*
event_marker=\b(?:\d{1,3}\.){3}\d{1,3}\b
tags={"vmw_product":"vcd"}

[filelog|vcd-appliance]
directory=/opt/vmware/var/log/vcd
include=*.log*
tags={"vmw_product":"vcd-appliance"}

[filelog|vcd-vpostgres]
directory=/var/vmware/vpostgres/10/pgdata/log
include=*.log*
tags={"vmw_product":"vcd-vpostgres"}

[filelog|vcd]
directory=/opt/vmware/vcloud-director/logs
include=vcloud-container-debug*;upgrade*;vmware-vcd-support*;watchdog*;vcloud-container-info*;cell*;request.log*;cell-management-tool.log*;cell-runtime.log*;cell.log*;cloud-proxy.log*;queries*;networking.log*;server-group-communications.log*;upgrade-*;service-wiring.log*;statsfeeder.log*;networking-wire.log*;vclistener.log*;activityevent.log*;jms-debug.log*
event_marker=(\d{2}|\d{4})-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s
tags={"vmw_product":"vcd"}

[filelog|vcd-API]
directory=/opt/vmware/vcloud-director/logs
include=*request.log*
event_marker=\b(?:\d{1,3}\.){3}\d{1,3}\b
tags={"vmw_product":"vcd"}

[filelog|vcd-appliance]
directory=/opt/vmware/var/log/vcd
include=*.log*
tags={"vmw_product":"vcd-appliance"}

[filelog|vcd-vpostgres]
directory=/var/vmware/vpostgres/current/pgdata/log
include=*.log*
tags={"vmw_product":"vcd-vpostgres"}

vCloud Director: Share Console Proxy IP with UI/API IP Address

The new vCloud Director 8.10 (read eight dot ten) is out and with it some neat little features. Let me quickly talk about one of them: the ability to run a vCloud Director cell with just one IP address.

In the past you always had to configure a vCloud Director cell with at least two IP addresses: one for the web interface (providing UI and API) and the other for the remote console proxy. The reason was that both services shared the same port 443. In vCloud Director 8.10 it is possible to specify ports for each service and thus use just one IP address. This helps if your DMZ subnet is too small and you need to deploy more VMs into that network (more cells, databases, etc.).

Note that the configure script will not ask you for ports; instead you need to use the unattended installation option or add port entries afterward in the global.properties file.

Unattended Installation

Here is an example of configure parameters that set the console proxy to the same IP address as http (10.0.1.60) and use port 8443 instead of the standard 443:


/opt/vmware/vcloud-director/bin/configure -cons 10.0.1.60 --console-proxy-port-https 8443 -ip 10.0.1.60 --primary-port-http 80 --primary-port-https 443 -dbhost 10.0.4.195 -dbport 1433 -dbtype sqlserver -dbinstance MSSQLSERVER -dbname vcloud -dbuser vcloud -dbpassword 'VMware1!' -k /opt/vmware/vcloud-director/etc/certificates.ks -w 'passwd' -loghost 10.0.4.211 -logport 514 -g --enable-ceip true -unattended

Global Properties

An alternative option is to edit the /opt/vmware/vcloud-director/etc/global.properties file and add new port entries:

Before:


...
product.version = 8.10.0.3879706
product.build_date = 2016-05-12T20:32:07-0700
vcloud.cell.ip.primary = 10.0.1.60
consoleproxy.host.https = 10.0.1.61
...

After:


...
product.version = 8.10.0.3879706
product.build_date = 2016-05-12T20:32:07-0700
vcloud.cell.ip.primary = 10.0.1.60
consoleproxy.host.https = 10.0.1.60
consoleproxy.port.https = 8443
vcloud.http.port.standard = 80
vcloud.http.port.ssl = 443
...

Do not forget to reconfigure your load balancer remote console pool to point to the new IP-port combination.
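
If you prefer to script the global.properties change instead of editing by hand, something like the following sketch could work. The helper names set_prop and share_console_proxy_ip are my own, the script assumes GNU sed (for -i), and it should be tried on a copy of the file first.

```shell
#!/usr/bin/env bash
# Sketch: set or add the port entries shown in the "After" listing.
# set_prop and share_console_proxy_ip are hypothetical helpers; needs GNU sed.

set_prop() {
  # set_prop FILE KEY VALUE: rewrite "KEY = ..." if present, else append it.
  local file=$1 key=$2 value=$3
  local pattern=${key//./\\.}   # escape dots so they match literally
  if grep -q "^${pattern}[[:space:]]*=" "$file"; then
    sed -i "s|^${pattern}[[:space:]]*=.*|${key} = ${value}|" "$file"
  else
    printf '%s = %s\n' "$key" "$value" >> "$file"
  fi
}

share_console_proxy_ip() {
  # share_console_proxy_ip FILE IP CONSOLE_PROXY_PORT
  local file=$1 ip=$2 port=$3
  set_prop "$file" consoleproxy.host.https "$ip"
  set_prop "$file" consoleproxy.port.https "$port"
  set_prop "$file" vcloud.http.port.standard 80
  set_prop "$file" vcloud.http.port.ssl 443
}
```

With the values from this post the call would be share_console_proxy_ip /opt/vmware/vcloud-director/etc/global.properties 10.0.1.60 8443. Existing keys are rewritten in place and missing ones are appended at the end of the file.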

Unattended Installation of vCloud Director

In vCloud Director 8.0 many enhancements were made to enable unattended installation. This eliminates manual steps, speeds up the installation process and ensures identical configuration across multiple vCloud Director instances.

Let’s say the provider needs to deploy multiple vCloud Director instances each consisting of multiple cells. Here is the process in high level steps.

Preparation of base template

  • Create Linux VM with supported RHEL/CentOS distribution.
  • Upload vCloud Director binaries to the VM (e.g. vmware-vcloud-director-8.0.0-3017494.bin)
  • Execute the installation file without running the configure script

Prerequisites for each vCloud Director Instance

The following must be prepared for each vCloud Director instance <N>:

  • Create database:
    • DB name: vcloudN
    • DB user: vcloudN
    • DB password: VMware1!
  • Prepare NFS transfer share
  • Create DNS entries, load balancer and corresponding signed certificates for http and consoleproxy and save them to a keystore file certificates.ks. In my example I am using keystore password passwd.

Unattended Installation of the First Cell

  • Deploy base template and assign 2 front-end IP addresses. These must match load balancer configuration. e.g. 10.0.2.98, 10.0.2.99
  • Mount NFS transfer share to /opt/vmware/vcloud-director/data/transfer
  • Upload certificates to /opt/vmware/vcloud-director/etc/certificates.ks
  • Run configure script – notice the piping of “Yes” answer to start VCD service after the configuration:
    echo "Y" | /opt/vmware/vcloud-director/bin/configure -cons 10.0.2.98 -ip 10.0.2.99 -dbhost 10.0.4.195 -dbport 1433 -dbtype sqlserver -dbinstance MSSQLSERVER -dbname vcloudN -dbuser vcloudN -dbpassword 'VMware1!' -k /opt/vmware/vcloud-director/etc/certificates.ks -w passwd -loghost 10.0.4.211 -logport 514 -g -unattended
    where 10.0.4.195 is the IP address of my MS SQL DB server and 10.0.4.211 is the syslog server.
  • Store /opt/vmware/vcloud-director/etc/responses.properties file created by the initial configuration in a safe place.
  • Run initial configuration to create instance ID and system administrator credentials:
    /opt/vmware/vcloud-director/bin/cell-management-tool initial-config --email vcloudN@vmware.com --fullname Administrator --installationid N --password VMware1! --systemname vCloudN --unattended --user administrator
    where N is the installation ID.

Unattended Installation of Additional Cells

vCloud cells are stateless; all necessary information is in the vCloud database. All we need is the responses.properties file from the first cell, which contains the encrypted information needed to connect to the database.

  • Deploy base template and assign 2 front-end IP addresses. These must match load balancer configuration. e.g. 10.0.2.96, 10.0.2.97
  • Mount NFS transfer share to /opt/vmware/vcloud-director/data/transfer
  • Upload certificates to /opt/vmware/vcloud-director/etc/certificates.ks
  • Upload responses.properties file to /opt/vmware/vcloud-director/etc/responses.properties
  • Run configure script – notice the piping of “Yes” answer to start VCD service after the configuration:
    echo "Y" | /opt/vmware/vcloud-director/bin/configure -r /opt/vmware/vcloud-director/etc/responses.properties -cons 10.0.2.96 -ip 10.0.2.97 -k /opt/vmware/vcloud-director/etc/certificates.ks -w passwd -unattended
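
When many additional cells are deployed, the configure step can be wrapped in a function so only the two IP addresses change per cell. This is a sketch only: configure_additional_cell and DRY_RUN are names I made up, and by default the command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: configure step for an additional cell, using the flags shown above.
# configure_additional_cell and DRY_RUN are hypothetical names.
VCD=/opt/vmware/vcloud-director

configure_additional_cell() {
  # configure_additional_cell CONSOLE_PROXY_IP HTTP_IP KEYSTORE_PASSWORD
  local cons_ip=$1 http_ip=$2 ks_pass=$3
  local cmd=("$VCD/bin/configure" -r "$VCD/etc/responses.properties"
             -cons "$cons_ip" -ip "$http_ip"
             -k "$VCD/etc/certificates.ks" -w "$ks_pass" -unattended)
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default): only show what would be executed.
    printf '%s\n' "${cmd[*]}"
  else
    # Answer "Y" to the prompt that offers to start the VCD service.
    echo "Y" | "${cmd[@]}"
  fi
}
```

For the cell in this example the call would be configure_additional_cell 10.0.2.96 10.0.2.97 passwd, with DRY_RUN=0 set once the printed command looks right.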

Additional configurations from now on can be done via vCloud API.

Edit 7/31/2016: vCloud Director 8.10 brings additional improvements for unattended installation. See here and here.

VCD Cell Management Tool without Administrator Credentials

I just learned from engineering a neat trick for invoking the cell management tool without specifying administrator credentials.

The issue is that currently you cannot use LDAP account to trigger cell management tool commands which are mostly used for quiescing and shutting down cells for maintenance. Using vCloud Director local administrator account is discouraged as it poses a security issue. However what is possible is to trigger the cell management tool as root (or with sudo) and supply via hidden flag -i the process ID of the java process.

Here is an example: first I query the java PID with the ps aux command. Then I run the standard cell-management-tool command without specifying the user, with the -i flag and the PID at the end.

So you can force the administrator to log in to the cell guest OS via an LDAP account and then run the command with sudo.

Thank you Zachary Shepherd for the tip.

Update 9/28/2016:

Georgi provided a great tip in the comments. As the PID is written in /var/run/vmware-vcd-cell.pid, you can actually run a one-liner.

example:

/opt/vmware/vcloud-director/bin/cell-management-tool cell -i `cat /var/run/vmware-vcd-cell.pid` -t
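
If the cell service is not running, the PID file may be missing or stale, so a slightly more defensive variant of the one-liner might look like this. It is only a sketch: cell_tool_by_pid is a hypothetical name, and it passes the same -i and -t flags as the one-liner above. Run it as root or with sudo.

```shell
#!/usr/bin/env bash
# Sketch: call cell-management-tool with the PID from the PID file, after
# checking that the file is readable and the process is still alive.
CMT=/opt/vmware/vcloud-director/bin/cell-management-tool

cell_tool_by_pid() {
  # cell_tool_by_pid [PID_FILE]
  local pid_file=${1:-/var/run/vmware-vcd-cell.pid} pid
  if [ ! -r "$pid_file" ]; then
    echo "PID file not readable: $pid_file" >&2
    return 1
  fi
  pid=$(cat "$pid_file")
  # kill -0 sends no signal; it only checks that the process exists.
  if ! kill -0 "$pid" 2>/dev/null; then
    echo "No running process with PID $pid" >&2
    return 1
  fi
  "$CMT" cell -i "$pid" -t
}
```

Calling cell_tool_by_pid with no arguments uses the default PID file location and returns a non-zero exit code with a message on stderr when the cell is not running.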

Thanks Georgi!