With T221338 complete, the edit_hourly data (T211173) should now be ready to use. The purpose of this task is to double-check the data and make sure there are no major outstanding issues.
Outstanding Issues:
* Content edit counts in wmf.mediawiki_history are not fully reliable. [[ https://phabricator.wikimedia.org/T221338 | T221338 ]]
* All anonymous users display with an edit count of 10,000 in the edit_hourly dataset. [[https://phabricator.wikimedia.org/T224941 | T224941]]
Proposed checks:
| **check** | **status** |
| View in Turnilo and Superset. Confirm data appears as expected by applying various filters and splits.|
| Confirm data matches query results on `wmf.mediawiki_history` data and monthly contributors metrics numbers.| ✅ Confirmed. See [[ https://docs.google.com/spreadsheets/d/1E8HoesABu6KUxcbHAz-zW6MECJLanrFz1cMp66FfDm8/edit#gid=891834841 | shared doc ]]
| Confirm that the content page edit counts issue was corrected by comparing to query results on MariaDB replicas.| ✅ Confirmed
| Perform queries to confirm that revision events in mediawiki_history and in Druid have the expected page info. | ✅ Confirmed.
| Confirm anon users display with correct edit_count| ✅ Confirmed. Note: The edit count bucket for anonymous users is currently listed as undefined because that information is not available in the Data Lake.
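
The comparison checks above (for example, edit_hourly against `wmf.mediawiki_history`) could be automated along the following lines. This is only a minimal sketch, not the procedure actually used for this task: the function name `find_mismatches`, the relative tolerance, and the sample counts are all hypothetical, and it assumes the per-month edit counts from each source have already been exported into plain dictionaries.

```python
from typing import Dict, List, Tuple


def find_mismatches(
    reference: Dict[str, int],
    candidate: Dict[str, int],
    rel_tolerance: float = 0.0,
) -> List[Tuple[str, int, int]]:
    """Return (key, reference_count, candidate_count) for every key whose
    counts differ by more than rel_tolerance of the larger count.
    A key missing from one side is treated as a count of 0."""
    mismatches = []
    for key in sorted(set(reference) | set(candidate)):
        ref = reference.get(key, 0)
        cand = candidate.get(key, 0)
        allowed = rel_tolerance * max(ref, cand)
        if abs(ref - cand) > allowed:
            mismatches.append((key, ref, cand))
    return mismatches


# Hypothetical monthly edit counts; these numbers are made up for illustration
# and are NOT real results from mediawiki_history or edit_hourly.
history_counts = {"2019-03": 1200, "2019-04": 1350, "2019-05": 1410}
hourly_counts = {"2019-03": 1200, "2019-04": 1349, "2019-05": 1500}

# Allow a 1% relative difference before flagging a month.
print(find_mismatches(history_counts, hourly_counts, rel_tolerance=0.01))
```

With a tolerance of zero the helper flags any difference at all, which may be the more appropriate setting when both sources are expected to agree exactly.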