
Exploration: How might we use data to identify policies/guidelines newcomers commonly break/defy?
Open, Medium, Public

Description

This task involves an exploration to uncover the ways in which we might use data to identify policies/guidelines newcomers commonly break/defy.

The above is an effort to identify Edit Checks that are likely to be impactful, where "impactful" here means these policies are:

  • Relevant to a substantial number of edits that newcomers are making/attempting to make
  • Likely to result in an edit being reverted if not followed

Event Timeline

Per what @MNeisler and I talked about offline on 20 March 2024, we're going to revise the scope of this task to be a more generic exploration of how we might use data to identify policies/guidelines newcomers commonly break/defy.

The above remains in service of identifying Edit Checks that are likely to improve the edits newcomers are intuitively making.

ppelberg renamed this task from Leverage revert risk model to identify policies common among/relevant to edits assigned a high revert risk to Exploration: How might we use data to identify policies/guidelines newcomers commonly break/defy?. Mar 23 2024, 12:30 AM
ppelberg updated the task description.
MNeisler triaged this task as Medium priority.
MNeisler moved this task from Tracking to Upcoming Quarter on the Product-Analytics board.

@ppelberg I'm documenting some of the additional data exploration ideas that were discussed in today's "KR WE1.2 Steering Committee monthly sync" meeting here as they seem related to the effort to identify edit checks that will be impactful. These can be used to refine this task or moved to a separate task as needed.

Where in the editor workflow might we have the most impact?
Review the proportion of volunteers that reach each stage of an editing workflow to identify the most common points at which people abandon their edits. Thinking: if we can identify where and why editors drop out, then we can intervene and support them better at that stage.
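To make that concrete, here's a minimal sketch of the funnel calculation, assuming a table of EditAttemptStep-style events and hypothetical column names (editing_session_id, action):

```python
import pandas as pd

# Workflow stages, in order, following the EditAttemptStep instrumentation.
FUNNEL = ["init", "ready", "saveIntent", "saveAttempt", "saveSuccess"]

def funnel_rates(events: pd.DataFrame) -> pd.Series:
    """Proportion of editing sessions that reach each workflow stage."""
    reached = {
        stage: events.loc[events["action"] == stage, "editing_session_id"].nunique()
        for stage in FUNNEL
    }
    total = reached[FUNNEL[0]] or 1  # sessions that started an edit
    return pd.Series({stage: n / total for stage, n in reached.items()})
```

The largest drop between consecutive stages is the most common abandonment point.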

What are the most frequently reverted edit types among newcomers, and what is the impact of those reverts on their retention? What's the timeline for a user's next successful/un-reverted edit? Thinking: this data might help provide insight into the types of policies editors commonly break/defy and the reasons editors may abandon or persist in editing.
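And a sketch of the "timeline to next successful edit" piece, assuming an edit-history DataFrame with hypothetical columns user_id, timestamp (datetime64), and was_reverted (bool):

```python
import pandas as pd

def time_to_next_good_edit(edits: pd.DataFrame) -> pd.Series:
    """Per user: time from their first reverted edit to their next un-reverted edit."""
    edits = edits.sort_values(["user_id", "timestamp"])
    out = {}
    for user, history in edits.groupby("user_id"):
        reverted = history.loc[history["was_reverted"], "timestamp"]
        good = history.loc[~history["was_reverted"], "timestamp"]
        if len(reverted) and len(good):
            first_revert = reverted.iloc[0]
            later_good = good[good > first_revert]
            if len(later_good):
                out[user] = later_good.iloc[0] - first_revert
    return pd.Series(out)  # users with no later un-reverted edit are omitted
```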

Notes from what @MNeisler and I talked about offline today...

  1. The main things we need to do the kind of analysis this ticket is asking for are:
    • Counting the edits to which a policy we're interested in introducing a Check for is relevant
    • Calculating the rate at which said edits are reverted or produce some other moderation/corrective action
  2. A few ways we can think of doing the "counting" described above (see the sketch after this list):
    • Look at policy-related tags that are explicitly appended to edits
    • Look at how often a policy-related template is used (read: transcluded onto user talk pages)
    • Write custom logic to categorize edits that are relevant to a particular policy. This is an approach we took with the Reference Check; see Edit_check/Tags
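To illustrate the first of these (policy-related change tags), a minimal sketch against the live recentchanges API. The tag name below is an example placeholder; a real analysis would fix a time window, restrict to newcomers, and likely go through the mediawiki_history dataset rather than the API:

```python
import requests

API = "https://en.wikipedia.org/w/api.php"

def tagged_revert_rate(tag: str, limit: int = 500) -> float:
    """Share of recent edits carrying `tag` that were subsequently reverted."""
    params = {
        "action": "query", "list": "recentchanges", "format": "json",
        "rctag": tag, "rcprop": "tags|ids", "rclimit": limit,
    }
    rcs = requests.get(API, params=params).json()["query"]["recentchanges"]
    if not rcs:
        return 0.0
    # mw-reverted is applied by MediaWiki when an edit is detected as reverted.
    reverted = sum("mw-reverted" in rc.get("tags", []) for rc in rcs)
    return reverted / len(rcs)

# Example (placeholder tag name):
print(tagged_revert_rate("editcheck-references"))
```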

We could also do some broader/more general research using the Edit Types framework @Isaac created. E.g. we could look at edits that involved someone adding an image, calculate the revert rate of said edits, think about the policies that are relevant to edits of this sort, hypothesize the specific policies people could be breaking, and then, ultimately, review a sample of edits to validate the extent to which the hypotheses we formulated hold true.
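As a toy illustration of that pipeline (the real analysis would use the Edit Types library itself), here's a crude regex stand-in for the "did this edit add an image?" classification step. It is English-only and approximate, whereas the framework handles localized namespaces and many more edit types:

```python
import re

# Count [[File:...]] / [[Image:...]] links in a wikitext revision.
FILE_LINK = re.compile(r"\[\[\s*(?:File|Image)\s*:", re.IGNORECASE)

def adds_image(parent_wikitext: str, new_wikitext: str) -> bool:
    """True if the edit introduces at least one new image link."""
    return len(FILE_LINK.findall(new_wikitext)) > len(FILE_LINK.findall(parent_wikitext))
```

Joining that flag with revert outcomes gives the revert rate for image-adding edits; hand-reviewing a sample of the reverted ones then tests the policy hypotheses.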

A few notes from a discussion yesterday between @Pablo, @MNeisler, @ppelberg, and myself:

  • Context: research is doing some work adjacent to this space in T392210: [WE1.5.3] Wikipedia Patrolling Measurement and we wanted to be aware of opportunities for overlap/collaboration.
  • The two areas where that work might be able to feed in nicely here:
    • @Pablo has already done some work around detecting policy mentions in edit summaries, and we will try to incorporate that into our patrolling dataset work. It is useful not only for understanding what policies are being broken, but also for understanding how often editors receive that direct feedback via edit summaries when reverted (see the sketch after this list).
    • @Pablo developed code for tracking changes to issue templates (citation needed, etc.) in T384600. If there are good opportunities to expand the edit types we track beyond this, that could also be helpful, per Peter's comment above about the role edit types could play.
  • Scope of interest to Editing:
    • Top-20 largest languages (where the largest moderator burden is likely to exist)
    • Newcomer essentially means <100 edits here.
    • Focus on the main namespace; in this case, also focus on VisualEditor.
    • Focus on edits that elicit a "negative" response -- reverts are the most obvious of these but messages to user/talk pages could be another indication (though harder to measure).
    • This data would likely be useful in Q1 as decisions start to be made about edit checks to consider in Q2. The Editing team is pretty open at the moment, though ideally the Edit Checks that they work on relate to core Wikimedia policies that have salience across many wikis (even if their implementation/interpretation varies).
  • Outcome: at this point, @MNeisler isn't planning on picking up this ticket, and Research isn't intending to answer it directly, but we're hopeful that the dataset that comes out of T392210 can be used for these purposes or easily extended to answer these questions. We'll keep each other in the loop where relevant.
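Re the edit-summary work above, a minimal sketch of what policy-mention detection can look like, with an illustrative English-only shortcut list (@Pablo's actual detector is more sophisticated and multilingual):

```python
import re

# Illustrative English Wikipedia policy shortcuts; a real detector would use
# per-wiki shortcut lists and localized policy names.
POLICY_SHORTCUTS = re.compile(r"\bWP:(V|NPOV|NOR|RS|BLP|COI|PROMO)\b", re.IGNORECASE)

def policies_cited(summary: str) -> set:
    """Policy shortcuts explicitly cited in an edit summary."""
    return {match.upper() for match in POLICY_SHORTCUTS.findall(summary or "")}

# e.g. policies_cited("Reverted: unsourced claims, see WP:V and WP:BLP")
# -> {"V", "BLP"}
```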

All that you described aligns with what I understood us as having discussed...thank you for prompting this conversation and synthesizing all that it surfaced, @Isaac.

Outside of anything @MNeisler might add: in the context of "Focus on edits that elicit a 'negative' response...", I wonder if, in addition to the signals you named (reverts and talk page messages), we might also look for edits that insert a policy/guideline-related template (e.g. T389445).
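For what it's worth, a sketch of that signal, comparing parent and new wikitext against an illustrative (English-only) template list:

```python
import re

# Illustrative template names; per-wiki lists (and redirects such as {{cn}})
# would be needed in practice.
POLICY_TEMPLATES = ("citation needed", "unreferenced", "advert", "notability")

def inserts_policy_template(parent_wikitext: str, new_wikitext: str) -> bool:
    """True if the edit adds at least one policy/guideline-related template."""
    def count(text: str) -> int:
        return sum(
            len(re.findall(r"\{\{\s*" + re.escape(name), text, re.IGNORECASE))
            for name in POLICY_TEMPLATES
        )
    return count(new_wikitext) > count(parent_wikitext)
```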