vCloud Director completely abstracts the underlying virtual resources from consumers, who receive compute and storage resources represented by virtual datacenters (VDCs) with a given tiered profile (e.g. gold, silver, bronze). The provider, however, must take care of the actual physical hardware and from time to time faces hardware upgrades and migrations.
Fortunately, it is possible to migrate whole Provider VDCs from old hardware to new hardware non-disruptively, with no or minimal impact on vCloud customers and no downtime for their VMs running in the cloud.
vCloud Director 5.1 has two features that help accomplish this: elastic VDCs (VDCs spanning multiple vSphere clusters) and merging of Provider VDCs. I already wrote about elastic VDCs in the post about Allocation Pool changes, so please read that article first if the concept is new to you.
At a high level, the online migration process from the old to the new hardware works like this:
- Let’s say that the Provider VDC (PVDC) called GoldVDC is backed by Cluster1, which consists of the old hardware.
- A new Cluster2 is created with the new hardware.
- A new PVDC – let us call it GoldVDCnew – is created from Cluster2, and the original GoldVDC is merged into it. Although we could add the new Cluster2 directly to GoldVDC, this would not allow us to retire the old hardware, as it is not possible to detach the primary resource pool from a PVDC.
- We can now rename the PVDC GoldVDCnew to GoldVDC and disable Cluster1. This has no impact on running VMs; however, any newly deployed or powered-on VMs are already placed into Cluster2.
- Now we have to migrate all the workloads from Cluster1 to Cluster2 and then detach Cluster1 from the PVDC.
The actual migration between clusters (or resource pools) has to take into account the following five resource types that exist in the VDCs: vApps, catalog templates, catalog media images, Edge Gateways and vApp Edges.
vCloud Director does not actually use vSphere vApp objects; from the vSphere infrastructure point of view, vCloud vApps are just collections of VMs. So we just need to migrate the VMs. This cannot be done from within vSphere, because vCloud Director keeps track of which resource pool each VM is placed in, and it also needs to apply the proper resource pool reservations and limits based on the org VDC allocation type. There is, however, a migrate option in vCloud Director that can be used, either from the GUI or with the API (see the end of this article). Note: the migration leverages vMotion with shared storage. It is not possible to migrate this way between clusters without shared storage, even though vSphere 5.1 has so-called Enhanced vMotion (aka shared-nothing vMotion).
Migration of catalog templates is more difficult. Again, a vCloud template is quite different from a vSphere template; vCloud templates are basically powered-off VMs. Although migration at the vSphere level seems less harmful than in the previous case, because no resource pool settings need to be configured (catalog VMs are never powered on), we would still run into a problem when trying to detach Cluster1, as vCloud Director keeps track of VM-to-resource-pool associations.
Unfortunately the GUI migration process from step 1 cannot be used. The GUI workaround is to open each catalog and move each catalog VM to the same catalog. This basically creates a clone of the VM, which gets registered to a Cluster2 host, while the original VM is deleted. This is, however, a very expensive operation from the storage perspective: cloning temporarily needs extra space and generates quite a lot of storage I/O traffic. Fast provisioning (linked clones) can help here.
The second alternative is to use the same API call as in the first case. Although this is not documented, it works (see the example at the end of the article).
ISO and FLP images are stored on vSphere datastores in a special folder: <VCD Instance Name>/media/<org ID>/<org VDC ID>/media-<media ID>.<ISO|FLP>. vSphere (vCenter) does not keep track of these objects in any way; vCloud Director stores the datastore moref and the media folder paths in its database. Media uploads are done by the vCloud Director cells over NFC (VMware Network File Copy protocol) via any ESX host that is connected to the datastore. Therefore, as long as Cluster2 has access to the media datastores, nothing needs to be migrated.
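For illustration, a concrete media folder path on a datastore would look something like this (the instance name and all IDs below are made up):

```
VCD-PROD/media/42/117/media-530.iso
```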
Gateway and vApp Edges are always running VMs placed in the System vDC resource pool in Cluster1. If a vApp with a routed vApp network is powered off, the particular vApp Edge is destroyed; when the vApp is started again, a new vApp Edge with identical configuration is deployed and would be placed into the new Cluster2. A simple vMotion between clusters seems to work at first but is definitely not recommended: vShield Manager keeps track of which cluster each Edge is deployed to, and any major Edge configuration change (upgrade to a new version or update of the full configuration) would try to deploy the Edge back to the original cluster.
Edge redeploy is an operation with minimal impact on the network flows going through the virtual router. A new Edge VM is deployed by vShield Manager, the identical configuration is pushed to it, and then the networks are disconnected from the old Edge and connected to the new one. This might cause the loss of an IPsec VPN connection or of load-balanced sessions; otherwise the disruption is minimal. The Edge redeploy, however, cannot be done directly from vShield Manager (too bad, as there is a nice script for this: see KB 2035939), because vShield Manager knows nothing about the restrictions made in vCloud Director on the PVDC (the disabled Cluster1). Edge Gateway redeploy and routed/fenced network reset must therefore be done from vCloud Director, either from the GUI (although it is not trivial to find all the running vApp Edges) or with the vCloud API.
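In the vCloud Admin API, both operations are simple POST actions with an empty request body. As a sketch, using the same API-URL shorthand as in the examples below (the network reset action applies to a routed Org VDC network; vApp networks offer a similar reset action):

```
POST API-URL/admin/edgeGateway/id/action/redeploy
POST API-URL/admin/network/id/action/reset
```

Both calls return a Task that can be polled for completion.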
There are some limitations and considerations that need to be taken into account:
- VDC elasticity currently (in version 5.1) works only within a vCenter Datacenter boundary, and all clusters need to use the same distributed switch for external networks and the network pool.
- Org VDCs with the Reservation Pool allocation type do not currently support VDC elasticity (those workloads cannot be migrated).
- Both clusters should have access to the same storage. If storage migration is required, do it as an independent second step.
- vSphere vMotion restrictions apply: if the new hardware has newer-generation CPUs, leverage EVC and lower the compatibility of the new cluster to the old hardware. Once the old hardware is retired, the EVC mode can be raised, and any restarted (a full power cycle is required) or newly deployed VMs can take advantage of it. Obviously, migration between different CPU architectures (AMD vs Intel) is not possible.
- 1 GHz of an old CPU is not equal to 1 GHz of a newer-generation CPU. Therefore, do not mix them in elastic VDCs except for the migration reasons mentioned above. This could also impact Chargeback – the customer would get different (higher) performance for the same cost.
vCloud API Examples
As mentioned above, vCloud VMs can, and vCloud templates should, be migrated with an API call. The request looks like this:
POST API-URL/admin/extension/resourcePool/id/action/migrateVms with a request body containing MigrateParams, where id identifies the source resource pool the VMs are migrated away from.
VM migration example:
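A sketch of such a request (the host name, resource pool ID and VM href below are illustrative; as I read the MigrateParams schema, the target ResourcePoolRef is optional and vCloud Director picks a suitable resource pool when it is omitted):

```
POST https://vcloud.example.com/api/admin/extension/resourcePool/7/action/migrateVms
Content-Type: application/vnd.vmware.admin.migrateVmParams+xml

<?xml version="1.0" encoding="UTF-8"?>
<MigrateParams xmlns="http://www.vmware.com/vcloud/extension/v1.5"
               xmlns:vcloud="http://www.vmware.com/vcloud/v1.5">
    <vcloud:VmRef href="https://vcloud.example.com/api/vApp/vm-4f8d9a2e-1b3c-4d5e-8f6a-7b8c9d0e1f2a"/>
</MigrateParams>
```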
Template migration example:
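The call is the same; only the VmRef points to the VM backing the catalog template (again, all hrefs are illustrative, and as noted above this use of migrateVms for template VMs is undocumented but works):

```
POST https://vcloud.example.com/api/admin/extension/resourcePool/7/action/migrateVms
Content-Type: application/vnd.vmware.admin.migrateVmParams+xml

<?xml version="1.0" encoding="UTF-8"?>
<MigrateParams xmlns="http://www.vmware.com/vcloud/extension/v1.5"
               xmlns:vcloud="http://www.vmware.com/vcloud/v1.5">
    <vcloud:VmRef href="https://vcloud.example.com/api/vApp/vm-9c0d1e2f-3a4b-5c6d-7e8f-9a0b1c2d3e4f"/>
</MigrateParams>
```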
20 thoughts on “vCloud Director: Online Migration of Virtual Data Center”
Nice article Tomas!
I’ll probably write a script to ease those steps whenever I get some time!
That would be great Timo, thanks.
Thank you for your article…
What if I have a basic vCenter Server with 4 ESXi nodes and around 10 VMs running on it? Can I implement vCloud Director on the same hardware and then move the VMs into vCloud without downtime?
The import of a vSphere VM into an Org VDC by the vCloud system admin requires powering the VM off.
vApp Edge migration is easy, but how does Datacenter Edge Gateway migration work? I tried to disable the “old” datacenter and redeploy the Edge GW, but I cannot redeploy it to the new datacenter…
Just found the solution: changing the “ownership” by applying the following SQL statements will work:
--change owner of edge_gateway
UPDATE gateway_logical_resource SET owner_id = (SELECT id FROM org_prov_vdc WHERE name='')
WHERE name = '' AND owner_type=2;
--change owner of org_network
UPDATE vdc_logical_resource SET vdc_id = (SELECT id FROM org_prov_vdc
WHERE name='') WHERE name='' AND lr_type = 'NETWORK' AND
vdc_id = (SELECT id FROM org_prov_vdc WHERE name='');
Then redeploy the vDC Edge. The Edge will be redeployed in the new datacenter and the old one will be deleted.
Edge Gateway migration should work as described in my article: disable the old resource pool it sits on and redeploy.
Tom, we have an elastic PVDC setup consisting of 5 clusters (1 for NSX ESGs and 4 for tenant vApps/VMs). We are considering a reconfiguration of this environment (same hardware) such that it consists of 5 PVDCs (1 for each cluster). Here’s the kicker: it has to be performed online and with little to no downtime for tenants. Is this even possible? Appreciate your thoughts.
vCD 8.10 (can upgrade to 8.20 if it helps achieve the goal)
Hi Tom, I have a potential scenario just like Jason. We are wanting to live-migrate workloads in the following scenario:
PvDC01 using Compute Cluster 01 for a CustomerX who has a single OrgVDC using the “allocated” model
PvDC02 using Compute Cluster 02, no shared storage between Cluster 01 and Cluster 02
CustomerX needs workload migrated live from Cluster 01 to Cluster 02.
Keen for some advice here…
As of VCD 9.1 you still need shared storage to migrate VMs between RPs from within VCD. However, you could do migrations directly in VC (VCD 9.x should be resilient against cluster changes). Obviously you need to vMotion the VM to the correct RP, storage policy, port group, etc.
Unfortunately, what I expected to happen did happen. Because I am trying to migrate workloads between PVDCs, I don’t have any way of moving them between Org VDCs without some magic in the back end to remove them from VCD without removing them from vSphere. Maybe a future release of VCD will allow this to happen? Thanks, Dave.
Hmm, OK, so if there is shared storage between the compute clusters, that should be fine? Sounds like I need to do some testing.
Can elasticity be used between two different 6.7 vCenters that share the same PSC? I need to replace the vCenter that vCloud currently uses due to organizational changes that require us to change the FQDN of the vCenter Server. Since the vCenter FQDN change is not supported, the idea was to use vCloud elasticity: we would attach both vCenters to the same PSC and then merge. Unfortunately, now that I have done everything up to the merge step, when I go to merge there is no PVDC listed to merge with. I am assuming it is because elasticity must take place under the same vCenter, just across different clusters?
Correct, a PVDC cannot span vCenters. The boundary is the Datacenter object in vCenter.
In VMware Cloud Director 10.1, I can’t find the vApp migration under vCenter Resource Pool in the Cloud Provider Portal.
Yes, it is missing. Use API as a workaround.
Hi Tomas, do you have PowerCLI example to move the workload?
I do not. I use Postman when needed.
Can you point to which APIs are used to do this in VCD 10.2?