the puppetdb table has almost doubled in the last ~10 months. We should investigate if this is expected or if we need to do some maintances. As part of this task it would also be useful to create some grafana metrics so we can track and plan for the growth of the puppetdb database
Description
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Resolved | SLyngshede-WMF | T263578 puppetdb seems to be slow on host reimage | |||
Resolved | jbond | T287673 puppetdb Investigate the expected bahaviour of the edges table |
Event Timeline
Seems that th standard vacuum preformed by the autovacuum daemon dosn't reclaim space. Although I'm not sure how this is presented in postgresql
The standard form of VACUUM removes dead row versions in tables and indexes and marks the space available for future reuse. However, it will not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be easily obtained.
Further it may be worth preforming a full vacuum due to the heavy delete update behaviour on this table
Tip: Plain VACUUM may not be satisfactory when a table contains large numbers of dead row versions as a result of massive update or delete activity. If you have such a table and you need to reclaim the excess disk space it occupies, you will need to use VACUUM FULL, or alternatively CLUSTER or one of the table-rewriting variants of ALTER TABLE. These commands rewrite an entire new copy of the table and build new indexes for it. All these options require exclusive lock. Note that they also temporarily use extra disk space approximately equal to the size of the table, since the old copies of the table and indexes can't be released until the new ones are complete.
https://www.postgresql.org/docs/9.2/routine-vacuuming.html
This would require downtime and likely more disk space
i have done some digging on the on disk size vs the database size which i think shows how much data we could potentially reclaim by doing a full vacume. however please be warned at this point most of what i know of postgresql i learnt over the last two days :). See below for full results but i think the following are worth pointing out
relation | disk size | size |
public.idx_factsets_jsonb_merged | 9095 MB | 9097 MB |
public.catalog_resources | 2154 MB | 5301 MB |
public.edges | 2135 MB | 7931 MB |
- idx_factsets_jsonb_merged: from the name this sounds like its and index and i suspect it is not very good for performance if the index cant fit in ram (psql only has 8G effective_cache_size)
- catalog_resources and edges. both of theses have a large discrepancy between the on-disk size and the db size. if i read this right it means we could reclaim ~9G by doing a full vacum on these two tables
- resource_params, fact_paths and resource_events also seem to have similar discrepancies
SELECT nspname || '.' || relname AS "relation", pg_size_pretty(pg_relation_size(C.oid)) AS "disk size", pg_size_pretty( pg_total_relation_size(nspname || '.' || relname)) AS "size" FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname IN ('public') ORDER BY pg_relation_size(C.oid) DESC;
relation | disk size | size |
public.idx_factsets_jsonb_merged | 9095 MB | 9097 MB |
public.edges_certname_source_target_type_unique_key | 5794 MB | 5795 MB |
public.catalog_resources | 2154 MB | 5301 MB |
public.edges | 2135 MB | 7931 MB |
public.fact_paths_path_trgm | 1311 MB | 1311 MB |
public.catalog_resources_encode_idx | 922 MB | 922 MB |
public.resource_params_hash_expr_idx | 748 MB | 748 MB |
public.catalog_resources_type_title_idx | 649 MB | 649 MB |
public.catalog_resources_resource_idx | 559 MB | 559 MB |
public.resource_params | 547 MB | 2761 MB |
public.catalog_resources_type_idx | 521 MB | 521 MB |
public.resource_params_pkey | 511 MB | 511 MB |
public.catalog_resources_pkey1 | 490 MB | 491 MB |
public.idx_resources_params_name | 480 MB | 480 MB |
public.idx_resources_params_resource | 421 MB | 421 MB |
public.resource_params_cache | 305 MB | 514 MB |
public.reports | 174 MB | 281 MB |
public.resource_events_resource_timestamp | 113 MB | 113 MB |
public.resource_events_unique | 93 MB | 93 MB |
public.fact_paths | 83 MB | 1533 MB |
public.resource_params_cache_parameters_idx | 78 MB | 78 MB |
public.fact_paths_path_type_unique | 78 MB | 78 MB |
public.resource_events_resource_title_idx | 62 MB | 62 MB |
public.rpc_hash_expr_idx | 51 MB | 51 MB |
public.fact_paths_name | 32 MB | 32 MB |
public.fact_paths_pkey | 29 MB | 29 MB |
public.resource_events | 27 MB | 382 MB |
public.resource_params_cache_pkey | 25 MB | 25 MB |
public.resource_events_containing_class_idx | 19 MB | 19 MB |
public.resource_events_timestamp_idx | 17 MB | 17 MB |
public.resource_events_reports_id_idx | 17 MB | 17 MB |
public.resource_events_property_idx | 17 MB | 17 MB |
public.resource_events_status_idx | 17 MB | 17 MB |
public.reports_hash_expr_idx | 11 MB | 11 MB |
public.reports_tx_uuid_expr_idx | 9608 kB | 9608 kB |
public.idx_reports_compound_id | 9056 kB | 9080 kB |
public.idx_reports_producer_timestamp_by_hour_certname | 7472 kB | 7496 kB |
public.reports_certname_idx | 6808 kB | 6832 kB |
public.reports_catalog_uuid_idx | 4808 kB | 4832 kB |
public.catalog_resources_exported_idx | 4552 kB | 4576 kB |
public.reports_status_id_idx | 3456 kB | 3480 kB |
public.idx_reports_prod | 3432 kB | 3456 kB |
public.reports_environment_id_idx | 3384 kB | 3408 kB |
public.idx_reports_producer_timestamp | 2720 kB | 2744 kB |
public.reports_end_time_idx | 2720 kB | 2744 kB |
public.factsets | 2664 kB | 9146 MB |
public.reports_pkey | 2648 kB | 2672 kB |
public.catalogs | 1040 kB | 3688 kB |
public.factsets_hash_expr_idx | 584 kB | 584 kB |
public.idx_factsets_prod | 520 kB | 544 kB |
public.idx_certnames_latest_report_timestamp | 512 kB | 536 kB |
public.idx_certnames_latest_report_id | 472 kB | 496 kB |
public.idx_catalogs_env | 464 kB | 488 kB |
public.idx_catalogs_prod | 464 kB | 488 kB |
public.catalogs_hash_expr_idx | 408 kB | 408 kB |
public.idx_catalogs_producer_timestamp | 400 kB | 424 kB |
public.certnames | 392 kB | 2016 kB |
public.catalogs_tx_uuid_expr_idx | 328 kB | 328 kB |
public.certnames_transform_certname_key | 320 kB | 344 kB |
public.catalogs_certname_idx | 272 kB | 296 kB |
public.factsets_certname_idx | 264 kB | 264 kB |
public.certnames_transform_pkey | 208 kB | 208 kB |
public.catalogs_pkey | 168 kB | 168 kB |
public.factsets_pkey | 160 kB | 160 kB |
public.producers_pkey | 16 kB | 16 kB |
public.idx_reports_noop_pending | 16 kB | 16 kB |
public.producers_name_key | 16 kB | 16 kB |
public.report_statuses_status_key | 16 kB | 16 kB |
public.schema_migrations_pkey | 16 kB | 16 kB |
public.report_statuses_pkey | 16 kB | 16 kB |
public.value_types_pkey | 16 kB | 16 kB |
public.reports_noop_idx | 16 kB | 16 kB |
public.packages_name_trgm | 16 kB | 16 kB |
public.environments_pkey | 16 kB | 16 kB |
public.environments_name_key | 16 kB | 16 kB |
public.package_hash_key | 8192 bytes | 8192 bytes |
public.certname_package_reverse_idx | 8192 bytes | 8192 bytes |
public.packages_name_idx | 8192 bytes | 8192 bytes |
public.fact_paths_id_seq | 8192 bytes | 8192 bytes |
public.report_statuses_id_seq | 8192 bytes | 8192 bytes |
public.report_statuses | 8192 bytes | 48 kB |
public.value_types | 8192 bytes | 24 kB |
public.catalogs_transform_id_seq1 | 8192 bytes | 8192 bytes |
public.reports_cached_catalog_status_on_fail | 8192 bytes | 8192 bytes |
public.reports_job_id_idx | 8192 bytes | 8192 bytes |
public.catalogs_job_id_idx | 8192 bytes | 8192 bytes |
public.certname_id_seq | 8192 bytes | 8192 bytes |
public.fact_values_id_seq | 8192 bytes | 8192 bytes |
public.environments | 8192 bytes | 48 kB |
public.schema_migrations | 8192 bytes | 24 kB |
public.resource_events_status_for_corrective_change_idx | 8192 bytes | 8192 bytes |
public.catalogs_id_seq | 8192 bytes | 8192 bytes |
public.environments_id_seq | 8192 bytes | 8192 bytes |
public.reports_id_seq | 8192 bytes | 8192 bytes |
public.producers_id_seq | 8192 bytes | 8192 bytes |
public.factsets_id_seq | 8192 bytes | 8192 bytes |
public.producers | 8192 bytes | 48 kB |
public.packages_pkey | 8192 bytes | 8192 bytes |
public.certname_packages_pkey | 8192 bytes | 8192 bytes |
public.package_id_seq | 8192 bytes | 8192 bytes |
public.certname_packages | 0 bytes | 16 kB |
public.packages | 0 bytes | 48 kB |