Right now there's a significant discrepancy between the actual Patchdemo wiki envs running on the cluster and the corresponding schemas in the DB shared by Patchdemo envs:
$ kubectl -n cat-env get po | grep mediawiki | grep ^wiki | wc -l
163
$ kubectl -n cat-env exec -it envdb-mariadbop-0 -- bash -c $'mariadb -uroot -p$(echo $MARIADB_ROOT_PASSWORD) -NBe "select count(*) from information_schema.schemata where schema_name regexp \'^wiki[-_][^_]+(__main)?$\';"'
271
All 163 running envs are accounted for: their schemas exist in the shared DB. The difference of 271 − 163 = 108 consists exclusively of schemas left behind by deleted envs. A cursory check showed that some of them belong to envs that are months old, so these have probably been accumulating for some time now.
The full list of (normalized) schemas can be seen here: P90359.
Note that only Patchdemo envs use the shared DB at the moment.
Unless we want to commit to building a proper distributed system with eventual consistency, we can't guarantee that we will always be able to clean up an environment successfully (think, for example, of situations like the incidents back in February this year). Because of this, I propose that we create a periodic job in Catalyst that checks data consistency in the shared DB and cleans up old schemas.
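As a rough illustration of the consistency check such a job could perform, here is a minimal Python sketch. It is not Catalyst code: the function names and the normalization rule are assumptions made for this example. In particular, it assumes that a running env is identified as `wiki-<id>` and that its schema is named `wiki-<id>` or `wiki_<id>`, optionally followed by `__main`. The regular expression is the one from the query above.

```python
import re

# Same pattern as the information_schema query in this task.
SCHEMA_RE = re.compile(r"^wiki[-_][^_]+(__main)?$")


def normalize(name):
    """Map an env or schema name to an env id of the form 'wiki-<id>'.

    ASSUMPTION: the real mapping between env names and schema names
    may differ; adjust this before relying on the result.
    """
    if name.endswith("__main"):
        name = name[: -len("__main")]
    # Treat 'wiki_<id>' and 'wiki-<id>' as the same environment.
    return "wiki-" + name[len("wiki") + 1:]


def find_orphan_schemas(running_envs, schema_names):
    """Return Patchdemo schemas that have no matching running env."""
    running = {normalize(env) for env in running_envs}
    return sorted(
        schema
        for schema in schema_names
        if SCHEMA_RE.match(schema) and normalize(schema) not in running
    )


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the cluster.
    envs = ["wiki-aaa", "wiki-bbb"]
    schemas = ["wiki_aaa", "wiki-bbb__main", "wiki_ccc", "mysql"]
    print(find_orphan_schemas(envs, schemas))
```

A real job would feed this from the cluster and from `information_schema.schemata`, and would probably only drop schemas above some age threshold so that envs still being created are not affected.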
In the future we should also monitor the output of that periodic job. If leftover schemas keep appearing and there's no system-wide incident to trace them back to, we should try to figure out whether a bug somewhere is also causing the issue.