
diego (Diego S-T)
Senior Research Scientist

User Details

User Since
Aug 8 2017, 10:56 AM (444 w, 4 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Diego (WMF) [ Global Accounts ]

Recent Activity

Fri, Feb 13

diego updated the task description for T333892: Develop a new generation of ML models for Wikidata.
Fri, Feb 13, 7:04 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity
diego closed T333892: Develop a new generation of ML models for Wikidata as Resolved.
Fri, Feb 13, 7:00 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity
diego added a comment to T333892: Develop a new generation of ML models for Wikidata.

The model has been released and is currently running on LiftWing.

Fri, Feb 13, 7:00 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity
diego closed T372707: research code hand-over and resolve requests/comments from research engineers as Declined.
Fri, Feb 13, 6:58 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity
diego closed T372707: research code hand-over and resolve requests/comments from research engineers, a subtask of T333892: Develop a new generation of ML models for Wikidata, as Declined.
Fri, Feb 13, 6:58 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity
diego added a comment to T383178: Document results on Reference Quality.

The paper was accepted at TheWebConf'26. We are resolving small formatting issues with the publication. I'm going to link the paper when it is on a public URL and then close this task.

Fri, Feb 13, 6:57 PM · Research
diego added a comment to T389809: Check home/HDFS leftovers of aitolkyn.

@Gehel I have reviewed the data and saved what we need to keep. You can delete the leftovers whenever necessary. Thanks.

Fri, Feb 13, 6:54 PM · Data-Platform-SRE (2026.01.23 - 2026.02.13), Essential-Work
diego added a comment to T417442: WE1.3.5 Article similarity model.

Progress:

  • This project was started this week
  • We are researching the potential to reuse intermediate outputs from the Language-agnostic Link-based Article Topic Model. By representing article topics as vectors, we can perform similarity measurements while building on a robust, proven pipeline, ultimately saving a significant amount of computational resources.
Fri, Feb 13, 6:53 PM · Research
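
The idea in the comment above, comparing articles through their topic vectors, can be sketched as follows. This is a minimal illustration only: it assumes each article is represented as a list of topic scores and uses cosine similarity as the measure, which the comment does not specify. The vectors and the function name are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical topic-score vectors (one score per topic) for three articles.
article_a = [0.9, 0.1, 0.0]
article_b = [0.8, 0.2, 0.0]
article_c = [0.0, 0.1, 0.9]

sim_ab = cosine_similarity(article_a, article_b)
sim_ac = cosine_similarity(article_a, article_c)
```

Under these assumptions, articles with similar topic profiles (a and b) score higher than articles with different profiles (a and c).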
diego created T417442: WE1.3.5 Article similarity model.
Fri, Feb 13, 6:51 PM · Research

Tue, Feb 10

diego added a comment to T416098: AI/ML Model Request: **Retrieval model for Abstract Wikipedia**.
  • The Research Team did some exploratory work on aligning Wikipedia and Wikidata content in 2020.
  • As part of that work, we experimented with taking Wikipedia sentences and transforming them into Wikidata claims. If I understand correctly, the task described here is the opposite.
  • One of the main challenges there was performing NER correctly and establishing the relationship. With current technologies (probably SLM) this might be easier.
Tue, Feb 10, 1:24 AM · Research, Machine-Learning-Team

Wed, Feb 4

diego added a comment to T398160: Check home/HDFS leftovers of mnz.

To the best of my knowledge, all the relevant data/code from Muniza's work has been backed up.

Wed, Feb 4, 4:26 PM · Data-Platform-SRE (2026.01.23 - 2026.02.13), Essential-Work

Oct 8 2025

diego closed T398071: Proof of Concept: RecSys for Patrollers as Resolved.
Oct 8 2025, 3:17 PM · Research
diego updated the task description for T398071: Proof of Concept: RecSys for Patrollers.
Oct 8 2025, 3:16 PM · Research
diego added a comment to T398071: Proof of Concept: RecSys for Patrollers.

The work was done. Check links on the task description.

Oct 8 2025, 3:16 PM · Research
diego updated the task description for T398071: Proof of Concept: RecSys for Patrollers.
Oct 8 2025, 3:04 PM · Research

Oct 1 2025

diego added a subtask for T406179: Q2 FY2025-26 Goal: Host Wikidata Revert Risk model on LiftWing: T365732: [SPIKE] Test Automoderator with Revert Risk model for Wikidata.
Oct 1 2025, 8:03 PM · OKR-Work, Goal, Wikimedia Enterprise - Content Integrity, Wikimedia Enterprise, Wikidata, Lift-Wing, Machine-Learning-Team
diego added a parent task for T365732: [SPIKE] Test Automoderator with Revert Risk model for Wikidata: T406179: Q2 FY2025-26 Goal: Host Wikidata Revert Risk model on LiftWing.
Oct 1 2025, 8:03 PM · Wikidata, Moderator-Tools-Team, Spike, Automoderator
diego created T406179: Q2 FY2025-26 Goal: Host Wikidata Revert Risk model on LiftWing.
Oct 1 2025, 8:03 PM · OKR-Work, Goal, Wikimedia Enterprise - Content Integrity, Wikimedia Enterprise, Wikidata, Lift-Wing, Machine-Learning-Team

Sep 4 2025

diego added a comment to T401968: Analyze samples of articles to see how many structured tasks we might be able to generate.

Thanks @MGerlach , I agree that is the main source of pageviews readable from Spark. I'm not aware of anything aggregated in larger buckets.

Sep 4 2025, 10:23 AM · Research, Revise-Tone-Structured-Task, Growth-Team, OKR-Work, Goal, Machine-Learning-Team

Aug 29 2025

diego added a comment to T398247: [Q1 FY 25-26 Applied Sciences Team] Knowledge Integrity Research.

Weekly Report

Aug 29 2025, 5:14 PM · Research

Aug 23 2025

diego added a comment to T398247: [Q1 FY 25-26 Applied Sciences Team] Knowledge Integrity Research.

Weekly report

Aug 23 2025, 11:41 PM · Research

Aug 16 2025

diego added a comment to T398249: [Q1 FY 25-26 Applied Sciences Team] Building the Foundations Research.

Retrain recommendation:

  • Main conclusion: Models should be retrained at least once a year. After 1 year models lose 1% precision (details in T399726#11090332)
Aug 16 2025, 4:18 PM · Research

Aug 15 2025

diego added a comment to T398247: [Q1 FY 25-26 Applied Sciences Team] Knowledge Integrity Research.

Weekly report

Aug 15 2025, 5:12 PM · Research
diego added a comment to T399726: Provide a recommendation on the optimal retraining frequency for ML models.
  1. Recommendation 1
Aug 15 2025, 4:58 PM · Research
diego renamed T399726: Provide a recommendation on the optimal retraining frequency for ML models from Provide a recommendation on the optional frequency for ML models to Provide a recommendation on the optimal retraining frequency for ML models.
Aug 15 2025, 4:22 PM · Research

Aug 10 2025

diego added a comment to T392305: [Request] Create a replicable system to determine the optimal retraining frequency for ML models .

There is buggy behavior:

Aug 10 2025, 10:56 AM · Research-engineering, Research

Aug 8 2025

diego added a comment to T398247: [Q1 FY 25-26 Applied Sciences Team] Knowledge Integrity Research.

Weekly Update

Aug 8 2025, 1:07 PM · Research

Aug 4 2025

diego added a comment to T398247: [Q1 FY 25-26 Applied Sciences Team] Knowledge Integrity Research.

Weekly Update

Aug 4 2025, 2:08 PM · Research

Jul 25 2025

diego added a comment to T398247: [Q1 FY 25-26 Applied Sciences Team] Knowledge Integrity Research.

Weekly Update

  • Tone Check edit check
    • Discussing how to incorporate user feedback
    • Designing the structured task
    • Improving the Model card
Jul 25 2025, 2:13 PM · Research

Jul 24 2025

diego closed T398930: Score probability evaluation for languages without enough data, a subtask of T391940: FY2024-25 Q4 Goal: Productionize tone check model, as Resolved.
Jul 24 2025, 3:19 PM · Goal, Machine-Learning-Team
diego closed T398930: Score probability evaluation for languages without enough data as Resolved.
Jul 24 2025, 3:19 PM · Research, Machine-Learning-Team
diego added a comment to T398930: Score probability evaluation for languages without enough data.

We have developed a method to analyze languages without enough evaluation data. A detailed explanation can be found in this Jupyter Notebook.
In summary:

Jul 24 2025, 3:18 PM · Research, Machine-Learning-Team
diego created T400344: Request to add dsaez to analytics-research-admins.
Jul 24 2025, 11:58 AM · SRE, SRE-Access-Requests

Jul 16 2025

diego created T399726: Provide a recommendation on the optimal retraining frequency for ML models.
Jul 16 2025, 2:45 PM · Research
diego added a comment to T392305: [Request] Create a replicable system to determine the optimal retraining frequency for ML models .

That creates a substantial dataset as part of the base features dataset (the wikitext and parent wikitext for all revisions in these 10+ years), which could be around 10TB of data. Let's first try with a beefier Spark config, e.g. spark.sql.shuffle.partitions=4000, maxExecutors=129, executor-cores 4 --executor-memory 24G. There are also timeout configs to play with, but that is not a fun place to be.

This solution keeps failing. The workaround I found was to run these experiments in chunks of two years (2013 to 2015, 2015 to 2017 ...) and then join the results. This is not optimal because it requires manually creating each chunk, but at least it solves the problem.

Jul 16 2025, 2:01 PM · Research-engineering, Research
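
The chunking workaround described above could be automated along these lines. This is only a sketch: the helper names are hypothetical, and the per-chunk results are assumed to be plain lists of rows, which is a simplification of whatever the real pipeline produces.

```python
from itertools import chain


def year_chunks(start_year, end_year, step=2):
    """Split [start_year, end_year) into consecutive (start, end) windows."""
    if step <= 0:
        raise ValueError("step must be positive")
    chunks = []
    year = start_year
    while year < end_year:
        # The last window is clipped so it never runs past end_year.
        chunks.append((year, min(year + step, end_year)))
        year += step
    return chunks


def join_results(per_chunk_results):
    """Concatenate the per-chunk result lists into a single list."""
    return list(chain.from_iterable(per_chunk_results))


# Windows matching the pattern in the comment: 2013 to 2015, 2015 to 2017, ...
windows = year_chunks(2013, 2019)
```

Each window would then be passed to the experiment as its start and end date, and the outputs joined afterwards with `join_results`.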

Jul 9 2025

diego added a comment to T392305: [Request] Create a replicable system to determine the optimal retraining frequency for ML models .

Is the 12 year period intentional?

Jul 9 2025, 10:24 PM · Research-engineering, Research

Jul 8 2025

diego updated subscribers of T392305: [Request] Create a replicable system to determine the optimal retraining frequency for ML models .
  • @MunizaA spent the last week debugging the system, and it is now ready to use.
  • The system was deployed and can be found in the Research Airflow instance.
  • I'm currently running a large experiment (12 different training datasets) to study the effect of data "freshness" on the Revert Risk models.

The large experiment has failed. I'll need engineering help to understand and fix this error. I'm going to coordinate with @Miriam and @fkaelin to decide how to proceed with this issue.

Jul 8 2025, 1:23 PM · Research-engineering, Research

Jul 2 2025

diego added a comment to T392305: [Request] Create a replicable system to determine the optimal retraining frequency for ML models .
  • @MunizaA spent the last week debugging the system, and it is now ready to use.
  • The system was deployed and can be found in the Research Airflow instance.
  • I'm currently running a large experiment (12 different training datasets) to study the effect of data "freshness" on the Revert Risk models.
Jul 2 2025, 5:30 PM · Research-engineering, Research

Jun 27 2025

diego updated the task description for T398071: Proof of Concept: RecSys for Patrollers.
Jun 27 2025, 5:10 PM · Research
diego created T398071: Proof of Concept: RecSys for Patrollers.
Jun 27 2025, 5:05 PM · Research
diego updated subscribers of T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly update:

Jun 27 2025, 2:27 PM · Research (FY2024-25-Research-April-June)

Jun 26 2025

diego added a comment to T219903: Keep research.wikimedia.org landing page updated.

I would like to request that information be added to the Publications (2025) and Knowledge Integrity (Publications) pages.

Jun 26 2025, 3:44 PM · periodic-update, Research

Jun 23 2025

diego updated the task description for T392305: [Request] Create a replicable system to determine the optimal retraining frequency for ML models .
Jun 23 2025, 2:20 PM · Research-engineering, Research
diego added a comment to T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly Update:

Jun 23 2025, 2:10 PM · Research (FY2024-25-Research-April-June)

Jun 13 2025

diego updated subscribers of T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.
  • Survey support desk:
    • Provided feedback for Codex staff survey
    • @DDeSouza updated the LimeSurvey theme to make font size differences less extreme between text elements
  • AI Re-training: @MunizaA has shared a first version of the retraining Airflow DAG; we are working on refining it.
Jun 13 2025, 6:09 PM · Research (FY2024-25-Research-April-June)

Jun 6 2025

diego added a comment to T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly update:

Jun 6 2025, 4:19 PM · Research (FY2024-25-Research-April-June)

May 30 2025

diego updated subscribers of T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly Update:

May 30 2025, 3:21 PM · Research (FY2024-25-Research-April-June)

May 23 2025

diego added a comment to T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly update:

May 23 2025, 3:00 PM · Research (FY2024-25-Research-April-June)

May 16 2025

diego updated subscribers of T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly Update:

May 16 2025, 5:11 PM · Research (FY2024-25-Research-April-June)

May 14 2025

diego closed T344016: Improvements to Annotool, a subtask of T341820: Evaluate and improve the Revert Risk model for Wikidata., as Resolved.
May 14 2025, 3:36 PM · Research (FY2023-24-Research-April-June)
diego closed T344016: Improvements to Annotool as Resolved.
May 14 2025, 3:36 PM · Research-Freezer

May 13 2025

diego closed T349739: [Annotool] Include additional information on private project exports, a subtask of T344016: Improvements to Annotool, as Resolved.
May 13 2025, 4:25 PM · Research-Freezer
diego closed T349739: [Annotool] Include additional information on private project exports as Resolved.
May 13 2025, 4:25 PM · Research

May 9 2025

diego added a comment to T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly update:

May 9 2025, 2:22 PM · Research (FY2024-25-Research-April-June)

May 5 2025

diego added a comment to T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly Update:

May 5 2025, 1:15 PM · Research (FY2024-25-Research-April-June)

Apr 27 2025

diego added a comment to T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly update:

Apr 27 2025, 11:36 PM · Research (FY2024-25-Research-April-June)

Apr 21 2025

diego added a comment to T392305: [Request] Create a replicable system to determine the optimal retraining frequency for ML models .

Thanks @diego for putting this together! I'll work on prioritizing. A few thoughts / questions in the meantime to consider:

  • What factors should we hold steady? Presumably a uniform number of training examples? Are they randomly sampled from all data before the cut-off though or is there some sort of stratification by time or other approach that should be used?

Given that we are considering a system that we can reuse later, I think we should define these as parameters:

  • Label balance: Balanced, Real (random), or desired balance (eg. 0.80 False)
  • Max data: Undefined or fixed
  • Date: start and end date.
Apr 21 2025, 2:39 PM · Research-engineering, Research
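
The three parameters proposed above could be captured in a small configuration object, sketched below. All class and field names are hypothetical and are not taken from the actual system; the validation rules are assumptions about what sensible values look like.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed spellings for the three label-balance options listed in the comment.
LABEL_BALANCE_MODES = ("balanced", "real", "custom")


@dataclass(frozen=True)
class RetrainingExperimentConfig:
    start_date: date                      # Date: start of the training window
    end_date: date                        # Date: end of the training window
    label_balance: str = "real"           # "balanced", "real" (random), or "custom"
    false_ratio: Optional[float] = None   # desired share of False labels, e.g. 0.80
    max_rows: Optional[int] = None        # Max data: None means undefined

    def __post_init__(self):
        if self.start_date >= self.end_date:
            raise ValueError("start_date must be before end_date")
        if self.label_balance not in LABEL_BALANCE_MODES:
            raise ValueError(f"label_balance must be one of {LABEL_BALANCE_MODES}")
        if self.label_balance == "custom":
            if self.false_ratio is None or not 0 < self.false_ratio < 1:
                raise ValueError("custom balance needs false_ratio between 0 and 1")
        if self.max_rows is not None and self.max_rows <= 0:
            raise ValueError("max_rows must be positive when set")


# Example: a two-year window with 80% False labels and no row limit.
config = RetrainingExperimentConfig(
    start_date=date(2013, 1, 1),
    end_date=date(2015, 1, 1),
    label_balance="custom",
    false_ratio=0.80,
)
```

Validating at construction time means a malformed experiment definition fails before any expensive job is launched.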

Apr 18 2025

diego created T392305: [Request] Create a replicable system to determine the optimal retraining frequency for ML models .
Apr 18 2025, 4:45 PM · Research-engineering, Research
diego added a comment to T391719: [Q4 FY 24-25 Applied Science] Building the Foundations Research.

Weekly update:

Apr 18 2025, 1:34 PM · Research (FY2024-25-Research-April-June)

Apr 1 2025

diego added a comment to T326179: Emit revision revert risk scores as a stream and expose in EventStreams API.
Apr 1 2025, 6:35 PM · Data-Engineering, Event-Platform, Machine-Learning-Team, Research
diego added a comment to T326179: Emit revision revert risk scores as a stream and expose in EventStreams API.

Another Q about revertrisk. Are visibility settings relevant to possible revert risk? E.g. if the comment (or content!) of a revision is not visible, does its revert risk potentially change? Even if it isn't input to the current revertrisk model, could revision visibility settings be a potential signal for a new or different model?

I'm not sure I understand the question. With the same inputs, the results would always be the same. Visibility is not a feature for this model (there is no "is_visible" column in the feature set). Now, if lack of visibility blocks feature extraction, then we have a problem. In LiftWing the features are extracted through the MediaWiki API, so if this is blocked, the model won't be able to run. But if we feed the model with data coming from other sources (ie stream data), and then some visibility configuration is changed on the revision, that, by model design, shouldn't change the revert risk score.

If so, then that would mean that for any given revision, the score might not be totally deterministic just on revision content alone?

In theory yes; in practice there could be some changes depending on when you call the API.
Revert Risk models use user-related features. In the model design we assumed that those were the user's features at the time of the revision being evaluated. In practice, when you call the LiftWing API, the user's features are collected with the current information, meaning that if the user has changed their status (e.g. has new user groups, or has made more edits), the final score can change.

Apr 1 2025, 5:45 PM · Data-Engineering, Event-Platform, Machine-Learning-Team, Research

Mar 24 2025

diego closed T382614: Document the Who Are Moderators work publicly, a subtask of T371865: Who are moderators?, as Resolved.
Mar 24 2025, 4:29 PM · Research, Epic
diego closed T382614: Document the Who Are Moderators work publicly as Resolved.
Mar 24 2025, 4:29 PM · Essential-Work, Research

Mar 11 2025

diego added a subtask for T335799: Review papers and give feedback: Unknown Object (Task).
Mar 11 2025, 6:53 PM · Epic, Research-outreach, Research

Mar 4 2025

diego added a comment to T326179: Emit revision revert risk scores as a stream and expose in EventStreams API.

I'm confused, I think in T374440 they are working just with dumps, nothing like Eventstreams.

Mar 4 2025, 7:50 PM · Data-Engineering, Event-Platform, Machine-Learning-Team, Research

Feb 19 2025

diego added a comment to T386645: Evaluate the existing peacock detection model.

Notice that the template above is visualized in articles as "peacock prose"

Feb 19 2025, 11:47 AM · EditCheck, Machine-Learning-Team

Feb 18 2025

diego added a comment to T386645: Evaluate the existing peacock detection model.

Looking for the template: peacock inline (and redirects) in the mediawiki_wikitext_history table. In enwiki I found ~108K article matches:

Feb 18 2025, 8:30 PM · EditCheck, Machine-Learning-Team

Feb 11 2025

diego added a comment to T382614: Document the Who Are Moderators work publicly.

[Update] I've created a Meta page, and updated based on the internal report.
I'm going to refine and finalize this page within the next 10 days.

Feb 11 2025, 4:24 PM · Essential-Work, Research
diego changed Due Date from Jan 17 2025, 12:00 AM to Feb 21 2025, 12:00 AM on T382614: Document the Who Are Moderators work publicly.
Feb 11 2025, 4:22 PM · Essential-Work, Research

Jan 20 2025

diego closed T377159: [SDS 1.2.1 B] Test existing AI models for internal use-cases as Resolved.
Jan 20 2025, 3:09 PM · Research

Jan 8 2025

diego added a comment to T376684: [SDS 1.2.3] Develop a working definition for moderation activity and moderator.
  • The definition includes "creation, revision, and enforcement" - on the quantitative side of this work it seems like enforcement was the focus here. I don't know if this is something you explored at all, but do you think there's a good way for us to track creation and revision of values, rules, and norms too? My initial reaction is to think about cataloguing policy and guideline pages and looking at substantive edits to those pages, but perhaps there's a better method.

I think it would be possible to design some methods to get signals for these numbers; following your suggestion would be one approach. However, it is difficult to assess the relevance/impact of those edits. Probably a mix of metrics plus some (permanent) qualitative analysis would be required.

  • In the moderation activities dashboard you include upload as a log type indicative of moderation activity - I wondered if you could explain the thought process on this one, because it's not immediately obvious to me that file uploads would be moderation.

@Pablo please can you explain this?

As a preliminary result, we found that moderation-related edits range from less than 1% in some editions, such as German (0.09%) and Polish (0.53%), to nearly 10% in others, like Russian (9.6%).

  • Does this include bot edits? Just curious, as I notice that a substantial % of the Russian Wikipedia moderation activity comes from bots.

Those numbers were on human edits.

From the report:

eswiki % Moderation (considering revert-related): 35.71%

This seems huge! Especially compared to the values for other wikis. Do we have any insight on what this is driven by?

We noticed this, but didn't have time during this work to analyze specific cases. Given that we considered just one month of data, October 2024, results might be affected by some specific (exogenous or endogenous) events or edit wars. It is important to highlight that our goal here was to understand which actions were measurable, and to show how stable or sparse those numbers were. To get actionable insights about specific projects, it would be necessary to apply these methods to more data.

Jan 8 2025, 6:41 PM · Research (FY2024-25-Research-October-December), OKR-Work

Jan 7 2025

diego updated subscribers of T383178: Document results on Reference Quality.
Jan 7 2025, 10:56 PM · Research
diego changed the status of T383178: Document results on Reference Quality from Open to In Progress.
Jan 7 2025, 10:54 PM · Research
diego created T383178: Document results on Reference Quality.
Jan 7 2025, 10:54 PM · Research

Jan 2 2025

diego changed the status of T372707: research code hand-over and resolve requests/comments from research engineers, a subtask of T333892: Develop a new generation of ML models for Wikidata, from Open to Stalled.
Jan 2 2025, 5:50 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity
diego changed the status of T372707: research code hand-over and resolve requests/comments from research engineers from Open to Stalled.
Jan 2 2025, 5:50 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity
diego closed T314384: Develop a ML-based service to predict reverts on Wikipedia(s) as Resolved.
Jan 2 2025, 5:49 PM · Machine-Learning-Team, Research, Epic
diego changed the status of T333892: Develop a new generation of ML models for Wikidata from In Progress to Stalled.
Jan 2 2025, 5:48 PM · Research-Freezer, Epic, Wikidata data quality and trust, Wikidata, address-knowledge-gaps, Knowledge-Integrity
diego closed T321358: Develop a public response research narrative for knowledge integrity research as Declined.
Jan 2 2025, 5:47 PM · Research-Freezer, Knowledge-Integrity
diego closed T357036: References Model: Multilingual Reference Need , a subtask of T357033: Reference model Research and Development work, as Resolved.
Jan 2 2025, 5:46 PM · Research (FY2024-25-Research-July-September), Wikimedia Enterprise
diego closed T357036: References Model: Multilingual Reference Need as Resolved.
Jan 2 2025, 5:46 PM · Research

Dec 20 2024

diego updated the task description for T377159: [SDS 1.2.1 B] Test existing AI models for internal use-cases.
Dec 20 2024, 10:05 PM · Research
diego closed T372479: RS Support for SDS 1.2.2, a subtask of T368791: SDS 1.2.2 Causes behind human administration recruiting, retention, or departure patterns, as Resolved.
Dec 20 2024, 7:08 PM · Research, OKR-Work
diego closed T372479: RS Support for SDS 1.2.2 as Resolved.
Dec 20 2024, 7:08 PM · Research
diego added a comment to T372479: RS Support for SDS 1.2.2.

Report is finished, so I'm closing this task.

Dec 20 2024, 7:08 PM · Research
diego closed T376684: [SDS 1.2.3] Develop a working definition for moderation activity and moderator, a subtask of T371865: Who are moderators?, as Resolved.
Dec 20 2024, 7:07 PM · Research, Epic
diego closed T376684: [SDS 1.2.3] Develop a working definition for moderation activity and moderator as Resolved.
Dec 20 2024, 7:06 PM · Research (FY2024-25-Research-October-December), OKR-Work
diego added a comment to T376684: [SDS 1.2.3] Develop a working definition for moderation activity and moderator.

Briefly describe what was accomplished over the course of the hypothesis work

  • For the first time, we established a formal definition of “moderators”: We define Moderators as the human actors responsible for social, technical and governance work needed to sustain an online community, including the creation, revision and enforcement of community values, rules, and norms.
  • To measure moderator activity, we drew on our prior qualitative knowledge of patrolling and admin work and conducted an extensive review of research literature and internal reports, resulting in a comprehensive list of 81 traceable moderation actions. We classified these actions based on their relevance to moderation, measurability, availability, and other dimensions.
  • We assessed the feasibility of measuring each moderation activity and decided to focus on 12 key actions for this hypothesis, measuring them across 13 different Wikipedia language editions: dewiki, arzwiki, plwiki, nlwiki, itwiki, frwiki, eswiki, svwiki, zhwiki, enwiki, jawiki, and ruwiki.
  • To measure the 12 key actions, we leveraged and expanded our previous work on edit classification (a.k.a edit types) to distinguish between moderator and non-moderator edits.
  • Additionally, we needed to create ad-hoc datasets of HTML article versions (T380871) to capture complex moderation activities that are difficult or impossible to detect with existing data. To achieve this within a short timeframe, we leveraged previous work by research engineers.
  • Based on the work described above, we developed an initial approach to measure moderation activities, focusing on the 12 key actions and 13 language editions previously mentioned:
    • As a preliminary result, we found that moderation-related edits range from less than 1% in some editions, such as German (0.09%) and Polish (0.53%), to nearly 10% in others, like Russian (9.6%).
    • We also developed a prototype dashboard to track logged moderation activities, demonstrating its potential for monitoring moderation efforts within our infrastructure.
    • These results are a proof of concept, demonstrating the potential of measuring and tracking moderation activities. However, they should not be considered final, due to the limited number of actions tracked and the reliance on ad-hoc data, which is not available in our infrastructure.
  • More details can be found in the final report.
Dec 20 2024, 7:05 PM · Research (FY2024-25-Research-October-December), OKR-Work

Dec 17 2024

diego added a comment to T372479: RS Support for SDS 1.2.2.

Hi @Easikingarmager, I've already coordinated with Caroline. Also, I'm going to present in the next group meeting.

Dec 17 2024, 3:29 PM · Research

Dec 14 2024

diego updated the task description for T376684: [SDS 1.2.3] Develop a working definition for moderation activity and moderator.
Dec 14 2024, 7:46 PM · Research (FY2024-25-Research-October-December), OKR-Work
diego added a comment to T380569: Baseline Experiments for SDS 1.2.1 B.

Great job @Aitolkyn ! Can you please share a link to the code you have used to generate these visualizations?

Dec 14 2024, 6:18 PM · Research (FY2024-25-Research-January-March)

Dec 6 2024

diego added a comment to T377324: [SDS 1.2.3] Quantitative lead to support the definition of moderators.

Thanks for the report. I'm not sure how to interpret this:

Dec 6 2024, 5:23 PM · Research (FY2024-25-Research-October-December), OKR-Work

Nov 29 2024

diego added a comment to T376684: [SDS 1.2.3] Develop a working definition for moderation activity and moderator.

Progress update on the hypothesis for the week

Nov 29 2024, 5:15 PM · Research (FY2024-25-Research-October-December), OKR-Work

Nov 26 2024

diego added a comment to T380779: Enable Peacock model to identify positive examples with finer granularity.

The main challenge here is to find data to train and test this model. Currently, the data we have is at the article level. I see two possible ways to work around this problem:

Nov 26 2024, 5:43 PM · Research, EditCheck, VisualEditor
diego added a comment to T380781: Investigate expanding language coverage for Peacock detection model.

We will be able to provide insights in this regard when we finalize T377159. I'll keep this task updated with those results.

Nov 26 2024, 5:30 PM · EditCheck, Editing-team, VisualEditor
diego updated the task description for T377157: Support SDS 1.2.1 B.
Nov 26 2024, 5:28 PM · Research

Nov 22 2024

diego added a comment to T376684: [SDS 1.2.3] Develop a working definition for moderation activity and moderator.

Progress update on the hypothesis for the week

Nov 22 2024, 7:38 PM · Research (FY2024-25-Research-October-December), OKR-Work

Nov 20 2024

diego assigned T378617: Update mwedittypes to handle HTML diffs to fkaelin.
Nov 20 2024, 3:58 PM · Research-Freezer, Research-engineering
diego assigned T378761: HTML diff dataset for SDS 1.2.3 to fkaelin.
Nov 20 2024, 3:58 PM · Research-engineering, Research

Nov 8 2024

diego added a subtask for T371865: Who are moderators?: T360794: Implement stream of HTML content on mw.page_change event.
Nov 8 2024, 5:58 PM · Research, Epic