= Issues identified =
* {T218463}
* {T218824}
* [Importing](https://phabricator.wikimedia.org/T211239) revisions from other wikis can add arbitrary amounts of history at arbitrary points in the past.
* Revision deletion of user names results in null user information in mediawiki_history (documented [on Wikitech](https://wikitech.wikimedia.org/w/index.php?title=Analytics%2FData_Lake%2FEdits%2FMediawiki_history&type=revision&diff=1820548&oldid=1819453) and in T212172).
* {T220456}, which led to the inconsistent exclusion of 119 small wikis.
* The MediaWiki replica pipeline counted only wikis in a [specific set of sitegroups](https://github.com/neilpquinn/wmfdata/blob/b0548529c4d39fc37f40fb637025e8a9b428a33f/wmfdata/mariadb.py#L94), whereas the Data Lake pipeline counted all the wikis present in the mediawiki_history data, which led to the inconsistent inclusion of 34 small wikis.
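
The wiki-coverage mismatches in the last two items could be checked with a simple set comparison. The following is only a minimal sketch: `compare_wiki_sets` is a hypothetical helper, and the wiki lists are made-up placeholders rather than the actual sets used by either pipeline.

```python
def compare_wiki_sets(replica_wikis, data_lake_wikis):
    """Return the wikis that only one of the two pipelines counts.

    Both arguments are iterables of wiki database names. How each list is
    obtained (sitegroup lookup, distinct wikis in mediawiki_history, etc.)
    is left out of this sketch.
    """
    replica = set(replica_wikis)
    data_lake = set(data_lake_wikis)
    return {
        "only_in_replica": sorted(replica - data_lake),
        "only_in_data_lake": sorted(data_lake - replica),
    }


# Placeholder inputs for illustration, not the real wiki lists.
diff = compare_wiki_sets(
    ["enwiki", "dewiki", "frwiki"],
    ["enwiki", "dewiki", "testwiki"],
)
print(diff)
```

Any wiki listed under either key is counted by only one pipeline, so it would need to be excluded from, or added to, both before the two sets of metrics are compared.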