Add Longest New Repeated Char Credibility Signal
Closed, Resolved | Public | 5 Estimated Story Points

Description

We need to add the version.diff.longest_new_repeated_character credibility signal so that consumers can see how the length of the longest run of repeated characters in the current revision compares with that of its parent revision. The signal is meant to detect button mashing.

The prerequisite for any diff-related signal is the ability to get the wikitext for both the current and the parent revision IDs, so T299164 must be implemented before work starts on this ticket.

Implementation steps:

  • Step 1: Update the version schema to include diff (if not already done). Under diff, add longest_new_repeated_character. (Please refer to the JSON schema in the documentation repo.)
{
  "identifier": 1063955750,
  .
  .
  "diff": {
    .
    "longest_new_repeated_character": 11    // int
    .
    .
  }
}
  • Step 2: The revision utility makes the revision data, including both wikitexts (current revision and parent revision), available in the article update handler. Update the version in the article update handler using the builder pattern, introducing a diffbuilder if one does not already exist. Using the wikitext of the two revisions, compute longest_new_repeated_character as follows:
  1. Get the length of the longest run of a repeated character for each wikitext using this logic
  2. Compare the two lengths. longest_new_repeated_character is the current revision's length if it is larger than the parent revision's length; otherwise it is 1. (code)
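
The comparison in Step 2 could be sketched roughly as below. This is an illustration only, not the actual handler code: the linked logic is not reproduced in this ticket, so the sketch assumes "longest repeated char length" means the longest run of one identical character appearing consecutively, counted per Unicode code point. The names longest_repeated_run, Diff, DiffBuilder, and with_longest_new_repeated_character are hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby


def longest_repeated_run(wikitext: str) -> int:
    """Length of the longest run of one identical, consecutive character.

    Assumption: runs are counted per Unicode code point; empty text gives 0.
    """
    return max((sum(1 for _ in run) for _, run in groupby(wikitext)), default=0)


def longest_new_repeated_character(current: str, parent: str) -> int:
    """Current revision's longest run if it exceeds the parent's, otherwise 1."""
    current_len = longest_repeated_run(current)
    parent_len = longest_repeated_run(parent)
    return current_len if current_len > parent_len else 1


@dataclass
class Diff:
    # Mirrors the "diff" object from Step 1; only this one field is sketched.
    longest_new_repeated_character: int = 1


class DiffBuilder:
    """Hypothetical builder, standing in for the diffbuilder named in Step 2."""

    def __init__(self) -> None:
        self._diff = Diff()

    def with_longest_new_repeated_character(self, current: str, parent: str) -> "DiffBuilder":
        # Compute the signal from the two wikitexts and store it on the diff.
        self._diff.longest_new_repeated_character = longest_new_repeated_character(
            current, parent
        )
        return self

    def build(self) -> Diff:
        return self._diff


if __name__ == "__main__":
    parent_text = "hello"
    current_text = "hello aaaaaaaaaaa"
    diff = (
        DiffBuilder()
        .with_longest_new_repeated_character(current_text, parent_text)
        .build()
    )
    print(diff.longest_new_repeated_character)
```

Note that when the current revision's longest run equals the parent's, the sketch returns 1, following the "larger than" wording in item 2.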

Event Timeline

prabhat updated the task description.
prabhat renamed this task from Add version.diff.meta.longest_new_repeated_char Credibility Signal to Add Longest New Repeated Char Credibility Signal. Jan 7 2022, 3:19 PM
prabhat moved this task from Estimated /Discussed to Incoming on the Wikimedia Enterprise board.
prabhat triaged this task as Medium priority. Jan 7 2022, 8:18 PM
Lena.Milenko changed the task status from Open to In Progress. Feb 7 2022, 1:03 PM
Lena.Milenko changed the task status from In Progress to Open. Feb 17 2022, 4:00 PM