If MediaWiki database schemas change, we risk breaking our reconstruction and other datasets. We need to build this kind of interrupt into our process:
- fetch the MediaWiki schemas daily or weekly and compare them with the previously fetched version
- if there are any changes, raise an alert for a human to check
- if no human acknowledges and dismisses the alert, halt all processing of MediaWiki data
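
As a rough illustration of the steps above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the snapshot format (a dict of table name to column name to column type), the function names `diff_schemas` and `should_halt`, and the one-day grace period before halting. How snapshots are actually fetched and how alerts are delivered is left out.

```python
from datetime import datetime, timedelta

# Hypothetical snapshot format: {table_name: {column_name: column_type}}
Schema = dict[str, dict[str, str]]


def diff_schemas(old: Schema, new: Schema) -> list[str]:
    """Return human-readable descriptions of differences between two snapshots."""
    changes = []
    for table in sorted(old.keys() | new.keys()):
        if table not in new:
            changes.append(f"table dropped: {table}")
            continue
        if table not in old:
            changes.append(f"table added: {table}")
            continue
        old_cols, new_cols = old[table], new[table]
        for col in sorted(old_cols.keys() | new_cols.keys()):
            if col not in new_cols:
                changes.append(f"column dropped: {table}.{col}")
            elif col not in old_cols:
                changes.append(f"column added: {table}.{col}")
            elif old_cols[col] != new_cols[col]:
                changes.append(
                    f"column changed: {table}.{col} {old_cols[col]} -> {new_cols[col]}"
                )
    return changes


def should_halt(
    changes: list[str],
    alert_raised_at: datetime,
    acknowledged: bool,
    now: datetime,
    grace_period: timedelta = timedelta(days=1),  # assumed value, not decided yet
) -> bool:
    """Halt only if there are changes nobody acknowledged within the grace period."""
    return bool(changes) and not acknowledged and now - alert_raised_at >= grace_period
```

A scheduled job could call `diff_schemas` on each run, raise an alert when the result is non-empty, and have downstream jobs check `should_halt` before they start.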
This would let us react to changes that we missed through other channels like wikitech. The alert would always come after the change has already happened, and halting may cause us to miss publishing certain datasets, but it may save us more headaches in the long run. Ideally, the alarms would be tuned to alert us only about the parts of the schema we care about, for example by filtering them down to the set of tables and/or columns that we use.
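
One possible way to do that filtering, sketched in Python under the same assumed snapshot format (table name to column name to column type): restrict each snapshot to a watchlist before comparing. The `filter_schema` name, the watchlist shape, and the column types in the usage example are all hypothetical and do not reflect the actual MediaWiki definitions.

```python
from typing import Optional

# Hypothetical snapshot format: {table_name: {column_name: column_type}}
Schema = dict[str, dict[str, str]]
# Watched columns per table; None means "every column in this table"
Watchlist = dict[str, Optional[set[str]]]


def filter_schema(schema: Schema, watched: Watchlist) -> Schema:
    """Keep only the tables and columns named in the watchlist."""
    filtered: Schema = {}
    for table, columns in watched.items():
        if table not in schema:
            # A watched table that disappeared is simply absent here, so the
            # filtered snapshots will differ and the change is still detected.
            continue
        table_cols = schema[table]
        if columns is None:
            filtered[table] = dict(table_cols)
        else:
            filtered[table] = {
                col: col_type
                for col, col_type in table_cols.items()
                if col in columns
            }
    return filtered
```

For example, with `watched = {"revision": {"rev_id", "rev_timestamp"}, "page": None}`, a change to an unwatched column of `revision` or to any unwatched table leaves `filter_schema(old, watched) == filter_schema(new, watched)`, so no alert would fire.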