Phabricator

[Q4 FY 24-25 Applied Science] Knowledge Integrity Research
Closed, Resolved (Public)

Description

This is a parent task to capture the Q4 work by Applied Sciences (Research) related to Knowledge Integrity. It will capture prioritization decisions and major weekly updates related to tasks in this bucket from April to June 2025. More fine-grained updates and coordination will occur in the subtasks as appropriate. It follows the Q3 task (T383610).

Confirmed Projects

Project | Responsible | Prioritization | Ticket | Status
Moderator Tools Support | @YLiou_WMF | OKR | T377264 | Complete
Moderator Motivations | @TAndic | Essential Work | T391499 | Complete
Centralizing Contributions | @cwylo | Essential Work | T387462 | Complete
Metrics on Patrolling Work | @Pablo | OKR | T392210 | Complete
Peacock Support | @diego | OKR consultation | T386645 | Ongoing
NPOV Workstreams | @Isaac | Essential Work | T393634 | Ongoing

Details

Due Date
Jun 30 2025, 4:00 AM

Event Timeline

Isaac triaged this task as High priority.
Isaac set Due Date to Jun 30 2025, 4:00 AM.
  • Moderator Tools Support: No major updates this week
  • Moderator Motivations: @TAndic continued lit review and existing process explorations (note: I will be OOO next week)
  • Centralizing Contributions: @cwylo continued research brief scoping discussions
  • Metrics on Patrolling Work: @Pablo initiated scoping for Q4 research
  • Peacock Support: @diego worked on risk assessment and mitigation plans
  • Moderator Tools Support: deployment of the Patroller Tools survey has run into another unanticipated roadblock as there was no deployment train this week. We now anticipate deployment next week. (T389401)
  • Moderator Motivations: no update this week (@TAndic OOO)
  • Centralizing Contributions: We wrapped up scoping in the research brief and are now planning the discussion guide, screener, and recruitment strategy.
  • Metrics on Patrolling Work: Academic and WMF literature was reviewed. A meeting with colleagues working on the Edit Check was held to discuss common interests and potential collaborations.
  • Moderator Tools Support: Still awaiting deployment
  • Moderator Motivations: No major updates
  • Centralizing Contributions: Using the Design Research Participant Database, we've done our first round of recruitment. We just need to finish a run-through of the testing protocol on our Userlytics platform, and hope to send screeners and start scheduling sessions by end of week.
  • Metrics on Patrolling Work: The hypothesis supporting the Moderation and Centralizing Contributions work has been formulated (already posted in Asana). @Isaac J and @Pablo met to discuss details of the deliverable: a dataset of all edits for a month, including metadata on each edit and on its patrolling review process. In addition, a coordination meeting was held with the Moderator Tools team to align on recent and next steps.
  • Peacock Support: No updates this week (conference attendance)
  • Moderator Tools Support: Survey is collecting data!
  • Moderator Motivations: No major updates, focused on literature on extrinsic motivations in online communities.
  • Centralizing Contributions: We've started our first participant sessions, so data collection has officially begun! We have three other sessions scheduled so far, on track to hit our target of 6-8 total sessions by May 23.
  • Metrics on Patrolling Work: Started working on a notebook to create the dataset of edits in March 2025 with metadata, including mediawiki_history fields and patrolling information and status (prevented, delete, reverted, reviewed, edited_over, autopatrolled).
  • Peacock Support: The ML team is using annotool (developed by @MunizaA) to gather evaluation data.
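The patrolling statuses tracked in the dataset notebook above could be assigned with a small classification helper; this is a minimal sketch, and the boolean field names are assumptions, not the actual mediawiki_history schema:

```python
# Hypothetical sketch: assign one patrolling status label to an edit record.
# Field names are assumptions, not the real dataset schema.

def patrol_status(edit: dict) -> str:
    """Return a single status label, checked in priority order."""
    if edit.get("was_prevented"):
        return "prevented"
    if edit.get("page_was_deleted"):
        return "delete"
    if edit.get("was_reverted"):
        return "reverted"
    if edit.get("was_reviewed"):
        return "reviewed"
    if edit.get("was_edited_over"):
        return "edited_over"
    if edit.get("is_autopatrolled"):
        return "autopatrolled"
    return "unpatrolled"

edits = [{"was_reverted": True}, {"is_autopatrolled": True}, {}]
print([patrol_status(e) for e in edits])  # ['reverted', 'autopatrolled', 'unpatrolled']
```

Checking the flags in a fixed priority order keeps the labels mutually exclusive, so each edit lands in exactly one bucket when aggregating.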
  • Moderator Tools Support: No new updates this week (conference attendance).
  • Moderator Motivations: No major updates, literature write-up/synthesis draft in progress.
  • Centralizing Contributions: We're at 3 sessions complete, 3 more scheduled; if all goes well we will have hit our target session count. We've since learned that our user testing platform for this research, Userlytics, does not permit moderated sessions without the use of a webcam. We have devised a workaround with our existing suite of software for testing, but this will be useful to note for future research conducted on that platform.
  • Metrics on Patrolling Work: Advanced the notebook to create the dataset of edits in March 2025. The dataset has been expanded to include detailed information on reverting editors and predicted revert risk scores. The notebook has been re-run to generate an analogous dataset for edits in October 2024, for which metadata on article maintenance templates will be available (see task T384600). An exploration notebook is already available, addressing the proposed question: How many editors have reverted at least Y edits in 30 days?
  • Peacock Support: No major updates this week.
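The question explored above (how many editors reverted at least Y edits in 30 days) amounts to a simple windowed group-by. A minimal sketch with toy data; the column names are assumptions, not the dataset's actual schema:

```python
import pandas as pd

# Hypothetical sketch: count editors with at least Y reverts in the last
# 30 days. Column names and data are illustrative only.
reverts = pd.DataFrame({
    "reverting_user": ["A", "A", "B", "A", "C"],
    "timestamp": pd.to_datetime(
        ["2025-03-01", "2025-03-05", "2025-03-10", "2025-03-20", "2025-03-25"]
    ),
})

Y = 2
cutoff = reverts["timestamp"].max() - pd.Timedelta(days=30)
window = reverts[reverts["timestamp"] >= cutoff]      # keep the 30-day window
per_editor = window.groupby("reverting_user").size()  # reverts per editor
print(int((per_editor >= Y).sum()))  # 1  (only editor A has >= 2 reverts)
```

The same shape works at scale by swapping the toy frame for the real dataset and parameterizing Y.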
  • Moderator Motivations: Onboarded @MRaishWMF to current progress and next steps; Continued literature review write-up (core sections: broader theories of motivation; existing literature on moderation, motivations and self identification; gaps in knowledge and applicability within the Wikimedia ecosystem); draft for WikiConNA proposal ready for feedback
  • Centralizing Contributions: Finished all participant sessions!
  • Metrics on Patrolling Work: Continued development on the data collection notebook. For the October 2024 snapshot, metadata now includes information on the addition and removal of article maintenance templates. The data exploration notebook has also been expanded accordingly. First, it shows the distribution of revisions per wiki based on the number of links to Wikipedia project namespaces found in edit summaries. This metric will be used to approximate the question: How many editors provided feedback when rolling back an edit? In addition, the notebook already includes data addressing the question: How many editors have added a messagebox to an article?
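Counting links to project-namespace pages in edit summaries, as described above, can be sketched with a regular expression. The namespace prefixes below are a plausible English-Wikipedia subset, not the notebook's actual list:

```python
import re

# Hypothetical sketch: count wikilinks to project-namespace pages in an
# edit summary. The prefix list is an assumption (en-wiki conventions).
PROJECT_LINK = re.compile(
    r"\[\[\s*(?:Wikipedia|WP|Project)\s*:[^\]|]+(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)

def count_project_links(summary: str) -> int:
    """Number of [[Wikipedia:...]]-style links in one edit summary."""
    return len(PROJECT_LINK.findall(summary))

print(count_project_links("rv vandalism per [[WP:VAND]], see [[Wikipedia:Reverting]]"))  # 2
print(count_project_links("copyedit"))  # 0
```

In practice each wiki has its own localized namespace aliases, so a production version would build the prefix alternation from the wiki's namespace table rather than hard-coding it.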
  • Moderator Motivations: Proposal submitted for WikiConfNA; Presenting current findings and proposed avenues for further research to Applied Sciences meeting on Monday for feedback (special thanks to @diego for providing feedback this week)
  • Metrics on Patrolling Work: (Shorter week for @Pablo) Updated the data collection notebook to incorporate revision tags and the comment of the reverting revision for reverted edits. The data exploration notebook now includes answers to the three questions using the October 2024 dataset. Progress with the data was reviewed with @Isaac during our 1:1 meeting. In addition, two more meetings have been scheduled with stakeholders, as the dataset is expected to be used in a Product Analytics + Moderator Tools effort to measure current moderator activity and inform a Key Result target for WE 1.3 FY 25/26.
  • Peacock Support: The team is trying to address community concerns. Some editors pointed out that the tone-check tool might facilitate the work of spammers/paid editors. Our contributions from research to the discussion are:
    • Acknowledge that this risk is real, and we have shared this as a concern.
    • Suggest some mitigation strategies:
      • Add a tag if the revision triggered the "peacock alarm" before submitting. The main paradigm shift we are introducing here is that we are not logging the editing process, and a tag could partially mitigate this problem.
      • We are writing a more complete model description, so editors can understand how the model was trained.
  • Moderator Tools Support: We completed collection on the patroller tools baseline survey, but analysis will be delayed due to deprioritization by the Moderator Tools team. Work is underway to complete the Nuke follow-up survey (T396039) by the end of Q4.
  • Moderator Motivations: Presented progress to the team and incorporated feedback and ideas. Prepared an overview of motivations (theory and applications) to share with stakeholders on Monday.
  • Centralizing Contributions: We have finished our final report and sent it to stakeholders for review.
    • Key takeaways:
      • Participants unanimously desire a centralized hub to access, tailored to their interests or most common contribution types
      • Participants also want a centralized hub to discover new moderation activities and relevant wiki policies that they can learn about
      • It is easier for editors to understand why they should perform moderation actions if they are described as an extension of editing, rather than as a separate category of activities entirely
        • Explicitly calling such activities “moderation” is more likely to confuse users, as opposed to labelling them as “additional editing actions”
      • The moderation activities of almost all study participants are limited to the articles on their watchlist or contribution history
      • Users use off-wiki channels to find on-wiki help articles, suggesting a need for better discoverability for help articles and policy pages
      • Users typically found personalized impact metrics (e.g. pageviews on pages they edited or created) more meaningful than site-wide ones (e.g. number of pages in the Articles for Deletion category)
  • Metrics on Patrolling Work: This week’s work focused on building a Superset dashboard to deliver the dataset, which also included modifications to the existing notebooks, e.g., moving the parsing of edit summaries (to extract links to the project namespace) into the data collection phase. The dashboard will be reviewed next week in meetings with colleagues from the Product Analytics and Moderator Tools teams.
  • Peacock Support: Helping to define a procedure to analyze the feasibility of using the peacock detection model in languages where we have limited labeled data.
  • Moderator Tools Support: Patrollers survey stage two deployed, now on larger projects (English, Commons, Wikidata, French, German).
  • Moderator Motivations: Shared the overview-of-motivations slides with the moderator research group, discussed ongoing research proposals with @Isaac, and am now brainstorming more potential mentorship, collaboration & relatedness research avenues (as a subset of motivations)
  • Centralizing Contributions:
    • We have updated our public Meta-Wiki page with our key findings and results
    • Stakeholder share-out is scheduled for later today
    • We can consider this research completed
  • Metrics on Patrolling Work: This week's efforts primarily focused on interactions with the Moderator Tools and Product Analytics teams. During a joint meeting, the dataset and dashboard were presented, and feedback was gathered (notes). Following the session, both teams engaged further to request additional details, particularly in relation to defining baselines for the projected increase in moderation actions for the FY 25/26 WE 1.3 KR (T396493). In parallel, while awaiting input from stakeholders regarding specific product interventions to be tracked using this data, drafting of the report for this project has begun.
  • Peacock Support: Collaborating with @achou to analyze data from languages where we don't have ground-truth to establish an evaluation methodology for those cases.
  • Moderator Motivations: Synced with Daisy about upcoming motivations-related projects and cross-team collaboration. Sharing motivations work at Research meeting next Tuesday.
  • Metrics on Patrolling Work: This week’s work focused on two main areas. First, significant effort was dedicated to reviewing and refining both the dataset and the dashboard. This included adding a new table with metrics suggested by the Moderator Tools team. Second, a preliminary version of the report was completed, documenting the dataset, the dashboard, and key findings. The report has been shared with the stakeholders for feedback.
  • Moderator Tools Support: Nuke tool follow-up survey deployed last week, and is ready for undeployment.
  • Moderator Motivations: Metawiki page with an overview of motivations and potential research avenues posted, and literature has been added to the WMFResearch Zotero library. This research will be presented in a Contributors Strategy meeting in early Q1.
  • Metrics on Patrolling Work: Data pipelines have been developed to generate the dataset, which has been made accessible via a Superset dashboard, as specified here. Analysis with the dashboard has revealed moderation gaps, which have been shared with stakeholders and documented in the revised report with their feedback. The dataset is expected to provide data on the retention rate of patrollers using FlaggedRevs and of reverting editors (T396493), as a comparable moderator retention rate metric to inform targets. Furthermore, additional opportunities to leverage this data have been identified in T398071. As a consequence, the hypothesis is supported.

The majority of projects in this group are now complete, closing this task :)