So far, metadata for backups is stored on the zarcillo database indefinitely. Keeping, e.g., one year of backup history is doable (we would only have around 50 * number of sections * number of datacenters records). However, backups are sent to long-term storage (bacula) after a week and removed from local storage after 3 weeks. Metadata for backups past that point is not really needed, and it may slow down operations considerably if a lot of data accumulates (depending on the analytics needed).
Consider purging older backups or setting them to a "deleted" state, either when the purge (rotation) runs or out of band with a cron job.
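A minimal sketch of the "deleted" state option, assuming a simplified in-memory record rather than the real zarcillo schema. The `BackupRecord` fields, the status names, and the `mark_expired_as_deleted` helper are all hypothetical; only the 3-week local retention comes from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Local copies are removed after 3 weeks, per the description above.
LOCAL_RETENTION = timedelta(weeks=3)


@dataclass
class BackupRecord:
    """Hypothetical metadata row; field names are illustrative only."""
    id: int
    section: str
    status: str           # assumed values: "ongoing", "finished", "failed", "deleted"
    start_date: datetime


def mark_expired_as_deleted(records, now, retention=LOCAL_RETENTION):
    """Set status to "deleted" for finished backups older than `retention`.

    Returns the ids that were changed. Rows are kept rather than removed,
    so history stays queryable while current lookups can filter them out.
    """
    changed = []
    for record in records:
        if record.status == "finished" and now - record.start_date > retention:
            record.status = "deleted"
            changed.append(record.id)
    return changed
```

The same function could be called at the end of the rotation step or from a cron job; a hard purge would delete the matching rows instead of updating their status.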
Also check whether anything else about the metadata lifecycle needs improving (e.g. tracking backups that exist only on bacula storage, etc.)
Edit: Additionally, monitor long-running ongoing backups and mark them as failed (for example, after 24 hours), and/or alert on those.
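As a rough illustration of the stale-backup check, assuming backup metadata is available as dictionaries with `status` and `start_date` keys (hypothetical names, not necessarily the real column names), something like the following could run periodically. The 24-hour threshold is the example value given above.

```python
from datetime import datetime, timedelta

# Example threshold from the description above; tune as needed.
MAX_RUNTIME = timedelta(hours=24)


def fail_stale_backups(backups, now, max_runtime=MAX_RUNTIME):
    """Mark backups still "ongoing" after `max_runtime` as "failed".

    `backups` is a list of dicts with hypothetical keys "id", "status"
    and "start_date". Returns the ids that were marked, so the caller
    can alert on them.
    """
    stale_ids = []
    for backup in backups:
        running_for = now - backup["start_date"]
        if backup["status"] == "ongoing" and running_for > max_runtime:
            backup["status"] = "failed"
            stale_ids.append(backup["id"])
    return stale_ids
```

Returning the affected ids keeps the marking and the alerting separate: a monitoring check could alert whenever the returned list is non-empty, without the function needing to know how alerts are delivered.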