During the last weekend my vCloud Director lab died. And the reason was PostgreSQL DB filled up all the disk space. How could that happen in my small lab with one running vApp?
PostgreSQL database when updating rows actually creates new ones and does not immediately delete the (now dead) old rows. That is done in a separate process called vacuuming.
vCloud Director has one pretty busy table named activity_parameters that is continuously updated. And as you can see from the below screenshot (as reported by pgAdmin table statistics) the table size is 26 MB but it is actually taking 24 GB of hard disk space due to the dead rows.
Another quick way to check DB size via psql CLI is:
\c vcloud
SELECT pg_size_pretty (pg_total_relation_size(‘activity_parameters’));
Vacuuming takes times and therefore it can be tuned in postgresql.conf via a few parameters which VMware documents specifically for vCloud Director here or here. Make sure you apply them (I did not). Another issue that could prevent vacuuming to happen is a stale long running transaction on the table.
The fix:
- short term: add more disk space
- long term: make sure postgresql.conf is properly configured
autovacuum = on
track_counts = on
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_cost_limit = 2400 - manually vacuum the activity_parameters table with the following psql CLI command:
VACUUM VERBOSE ANALYSE activity_parameters;
And do not forget to monitor free disk space on your PostgreSQL host.
hello
With release of 9.7 with embedded DB do we have any REST API available to monitor postgres DB and disk space. VCSA appliance provide some API to do some monitoring.
https://server/rest/appliance/health/system
There is currently only simple appliance UI to perform failover and no API. Improvements are planned for the next VCD release.
hello,
Thank you for the response. To increase space on vCloud director 9.7 appliance do we need to follow similar steps described for vCSA 6.5/6.7 (https://kb.vmware.com/s/article/2145603) or we need to use linux based LVM commands
Regards
Unfortunately, with 9.7 you need to use LVM commands. Please open support ticket to get guidance. Public KB is in the works.
I have a similar problem but the database does not contain activity_parameters table(vmware appliance VCD 10.1.1). The base contains only activity table with 32 partitions.
Last vacuum and Last autovacuum is empty. I make VACUUM FULL activity by manual but why is not reclaiming storage automatically performed ?
autovacuum = on
autovacuum_max_workers = 3 #restart
autovacuum_naptime = 1min
autovacuum_vacuum_cost_limit = 2400
Please open support ticket to investigate.
Thanks Tomas! This came in handy.