In the guidance instrumentation work, we introduced a new schema: NewcomerTask. In this task, @nettrom_WMF will go over the reporting notebook we use for newcomer tasks and make any adjustments needed to account for the new schema and for any changes made to legacy schemas as part of the guidance work.
The notebook will be sent separately.
These are additional requests to address while we are working on the notebook:
- Each time we deploy our features in a new wiki, the new wiki's name (e.g. "plwiki") has to be added in about 15 different places in the notebook. Could we turn that string of wikis into a variable that only has to be adjusted once?
- We need to report on revert rates each week to make sure that we are not increasing reverts as we introduce new features. Perhaps an easy way to do this is to modify the get_datalake_tagged_edits_by_week function to add a column that counts reverted suggested edits. Would it be simple to have this for the full history of suggested edits? Or only since the completion of T164307: Add Reverted filter to RecentChanges Filters?
- For that same get_datalake_tagged_edits_by_week function, it would be valuable to split each column by whether the editor registered during the week of that row or in a preceding week. The same applies to the edit and revert counts: whether the edit/revert was made by an editor who registered that week or previously. This will help us understand how much of the suggested edits activity is coming from "new" newcomers and how much from "retained" newcomers.
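
To make the three requests above concrete, here is a minimal sketch in plain Python. It is not the notebook's actual code. The names WIKIS, wiki_sql_list, week_start, and tagged_edits_by_week are hypothetical, as are the record fields (wiki, user_id, registration_date, edit_date, reverted). It assumes weeks start on Monday and that each suggested edit already carries a reverted flag; the real get_datalake_tagged_edits_by_week presumably queries the Data Lake rather than taking a list of dicts.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical single source of truth for the deployed wikis.
# "plwiki" is the example from this task; "examplewiki" is a placeholder.
WIKIS = ["plwiki", "examplewiki"]


def wiki_sql_list(wikis=WIKIS):
    """Render the wiki list for use inside a SQL IN (...) clause."""
    return ", ".join("'{}'".format(w) for w in wikis)


def week_start(d):
    """Monday of the week containing d (assumption: weeks start on Monday)."""
    return d - timedelta(days=d.weekday())


def tagged_edits_by_week(edits, wikis=WIKIS):
    """Per-week editors, edits, and reverted edits, split by cohort.

    "new" means the editor registered in the same week as the edit;
    "retained" means the editor registered in a preceding week.
    """
    editors = defaultdict(lambda: {"new": set(), "retained": set()})
    counts = defaultdict(lambda: defaultdict(int))
    for e in edits:
        if e["wiki"] not in wikis:
            continue
        wk = week_start(e["edit_date"])
        registered_this_week = week_start(e["registration_date"]) == wk
        cohort = "new" if registered_this_week else "retained"
        # User IDs are per-wiki, so key editors by (wiki, user_id).
        editors[wk][cohort].add((e["wiki"], e["user_id"]))
        counts[wk][cohort + "_edits"] += 1
        if e["reverted"]:
            counts[wk][cohort + "_reverted"] += 1

    rows = {}
    for wk in sorted(counts):
        row = {}
        for cohort in ("new", "retained"):
            row[cohort + "_editors"] = len(editors[wk][cohort])
            row[cohort + "_edits"] = counts[wk][cohort + "_edits"]
            row[cohort + "_reverted"] = counts[wk][cohort + "_reverted"]
        rows[wk] = row
    return rows


# Made-up sample records, only to show the expected input shape.
sample = [
    {"wiki": "plwiki", "user_id": 1, "registration_date": date(2020, 3, 2),
     "edit_date": date(2020, 3, 3), "reverted": False},
    {"wiki": "plwiki", "user_id": 1, "registration_date": date(2020, 3, 2),
     "edit_date": date(2020, 3, 10), "reverted": True},
]
for week, row in tagged_edits_by_week(sample).items():
    print(week, row)
```

With this shape, deploying to a new wiki means editing WIKIS once, and the weekly revert rate for each cohort is the reverted column divided by the edits column. How far back the reverted flag is reliable remains the open question from the second bullet.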