During the last weekend my vCloud Director lab died. And the reason was PostgreSQL DB filled up all the disk space. How could that happen in my small lab with one running vApp?
PostgreSQL database when updating rows actually creates new ones and does not immediately delete the (now dead) old rows. That is done in a separate process called vacuuming.
vCloud Director has one pretty busy table named activity_parameters that is continuously updated. And as you can see from the below screenshot (as reported by pgAdmin table statistics) the table size is 26 MB but it is actually taking 24 GB of hard disk space due to the dead rows.
Another quick way to check DB size via psql CLI is:
SELECT pg_size_pretty (pg_total_relation_size(‘activity_parameters’));
Vacuuming takes times and therefore it can be tuned in postgresql.conf via a few parameters which VMware documents specifically for vCloud Director here or here. Make sure you apply them (I did not). Another issue that could prevent vacuuming to happen is a stale long running transaction on the table.
- short term: add more disk space
- long term: make sure postgresql.conf is properly configured
autovacuum = on
track_counts = on
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_cost_limit = 2400
- manually vacuum the activity_parameters table with the following psql CLI command:
VACUUM VERBOSE ANALYSE activity_parameters;
And do not forget to monitor free disk space on your PostgreSQL host.