Develop:
[] A working definition for "moderation activity" that meets [[ https://meta.wikimedia.org/wiki/Research_and_Decision_Science/Data_glossary#Essential_metrics | essential metric criteria ]]
[] A working definition for "moderator" based on the "moderation activity" that meets essential metric criteria
**Scope for this task**
- Only consider Wikipedia moderation activity.
- Ideally, the definition works for all Wikipedia languages. However, if you run into challenges that make scaling to all languages in one quarter hard, it is okay to scale down. Follow the escalation process below.
- Output will be based on processing offline data (dumps) and will not be in production at delivery time (decision made on October 9th).
- Main namespace only (per the November 22nd update by Diego and a Slack thread among Diego, Marshall and Leila).
**Output**
If a change in expectations with regard to outputs is needed, @diego can discuss it with Leila and Marshall. As the team works on the project, it may want to propose other outputs that are better than what we imagined below. This is very much welcome. Diego will let Marshall and Leila know about such proposals.
[] **Report**. A Research page on meta-wiki that contains important information about the project and links to other important project assets (spreadsheets, presentations, Phabricator ticket, etc.). Note that the research page on meta may need to be distilled further into a "report" for the product use cases (e.g., via a subpage). If the report contains sensitive information that cannot be shared publicly, a Google Document can be used instead.
[] **Data**:
-- [] Classified revision IDs (examples). Ideally: a research tool that, given a revision ID, outputs the classification of relevant edit types. If this is not possible: at least a spreadsheet that lists hundreds of diffs and their classifications, spanning different kinds of diffs and users.
-- [] Aggregates. A spreadsheet that we can pivot to count the number of moderators, the number of each type of moderation activity, ... against some dimensions. Examples of dimensions: editor tenure, editors with extended rights, edit count, language, year, ... Additionally, it would be good to be able to pivot at the user level to pull data such as "for a random sample of 1000 users, how many edits of each type have the users made over a span of time".
At delivery time, for the languages for which you can offer the working definitions, we should be able to know:
- whether an edit is moderation-related and, if so, what kind of moderation it is.
- who the moderators are, based on the previous information.
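To make the two questions above and the aggregates concrete, here is a minimal sketch of how classified revisions could be rolled up. Everything in it is an assumption for illustration only: the record fields, the edit type labels (`revert`, `maintenance_tag`), the sample rows, and the `min_moderation_edits` threshold are hypothetical and are not the working definitions this task is meant to produce.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class ClassifiedRevision:
    """One classified edit. All field names are hypothetical."""
    revision_id: int
    user: str
    language: str
    year: int
    # None means "not a moderation edit"; labels are placeholders.
    moderation_type: Optional[str]


def count_by_type(revisions: Iterable[ClassifiedRevision], dimension: str) -> Counter:
    """Count moderation edits per (dimension value, moderation type)."""
    counts: Counter = Counter()
    for rev in revisions:
        if rev.moderation_type is not None:
            counts[(getattr(rev, dimension), rev.moderation_type)] += 1
    return counts


def find_moderators(
    revisions: Iterable[ClassifiedRevision], min_moderation_edits: int = 2
) -> Set[str]:
    """Users with at least `min_moderation_edits` moderation edits.

    The threshold is an assumed placeholder, not an agreed definition.
    """
    per_user = Counter(
        rev.user for rev in revisions if rev.moderation_type is not None
    )
    return {user for user, n in per_user.items() if n >= min_moderation_edits}


# Hypothetical sample rows, not real revisions.
SAMPLE = [
    ClassifiedRevision(1001, "UserA", "en", 2023, "revert"),
    ClassifiedRevision(1002, "UserA", "en", 2023, "maintenance_tag"),
    ClassifiedRevision(1003, "UserB", "en", 2023, None),
    ClassifiedRevision(1004, "UserB", "es", 2024, "revert"),
    ClassifiedRevision(1005, "UserC", "es", 2024, None),
]

if __name__ == "__main__":
    print(count_by_type(SAMPLE, "language"))
    print(find_moderators(SAMPLE))
```

The same `count_by_type` call can be pointed at `year` or `user` to pivot on a different dimension. Which activity counts as moderation, and how much of it makes someone a moderator, remain open questions for the project.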
**How to escalate**
If you are blocked, please escalate to @leila and keep Kenyatta (our program manager) in the loop. If you need additional resources, do the same. If you need to escalate further, please escalate to Kate Zimmerman. Please don't be shy.
**Stakeholder**
There are multiple groups and people who will benefit from this work and who can/should be consulted on it. The stakeholder whose input is needed for scoping and who will need to sign off on the work at the end of the quarter is @MMiller_WMF, who has asked for this work to advance WE 1.3 (in this FY and in future ones).
**Confirmed list of direct contributors**
This list can expand as the project progresses and more needs are identified.
@cwylo, @Isaac, @Pablo
Research Engineering: the support is cleared by @XiaoXiao-WMF. As soon as you know more specifics about your needs and timelines, @diego, please make the request directly to Xiao, and she will assign either Fabian or Muniza.
**Confirmed list of folks available for consulting**
@YLiou_WMF: for sampling (if more surveys are needed), creating connections between T370439 and possible survey needs for this task, and sharing learnings from T368791
@Easikingarmager: for sharing learnings from T368791 and drawing connections between the two projects as relevant
@KCVelaga_WMF: for supporting Isaac on the essential metric piece.
@Samwalton9-WMF: for helping us to produce outputs relevant for product
@OTichonova: for supporting on the deliverables