We need to add the version.diff.longest_new_repeated_character credibility signal so that consumers can see how the length of the longest run of a repeated character in the current revision compares with that of its parent revision. This is meant to detect button mashing.
The prerequisite for any diff-related signal is the ability to get the wikitext for both the current and the parent revision ID, so T299164 must be implemented before work on this ticket starts.
Implementation steps:
- Step 1: Update the version schema to include diff (if not already done). Under diff, include longest_new_repeated_character. (Please refer to the JSON schema in the documentation repo.)
{
  "identifier": 1063955750,
  ...
  "diff": {
    ...
    "longest_new_repeated_character": 11 // int
    ...
  }
}
- Step 2: The revision data, including the two wikitexts (current rev and parent rev), is available in the article update handler through the revision utility. Update the version in the article update handler using the builder pattern, introducing a diff builder if one does not already exist. From the wikitext contents of the two revisions, compute longest_new_repeated_character as follows:
  - Get the length of the longest repeated character run for each wikitext using this logic
  - Compare the two lengths. longest_new_repeated_character is the current rev's length if it is larger than the parent rev's length; otherwise longest_new_repeated_character is 1. (code)
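The comparison in Step 2 can be sketched as follows. This is only an illustration in Python, not the handler code: the function names longest_repeated_run and longest_new_repeated_character are hypothetical, and returning 0 for an empty wikitext is an assumption that the ticket does not specify.

```python
def longest_repeated_run(text: str) -> int:
    """Return the length of the longest run of one repeated character.

    Assumption: an empty string yields 0 (not specified in the ticket).
    """
    longest = 0
    current = 0
    previous = None
    for ch in text:
        if ch == previous:
            current += 1
        else:
            # A new character starts a new run of length 1.
            current = 1
            previous = ch
        if current > longest:
            longest = current
    return longest


def longest_new_repeated_character(current_wikitext: str, parent_wikitext: str) -> int:
    """Hypothetical helper: the signal value as described in Step 2.

    Returns the current revision's longest run if it exceeds the parent's,
    otherwise 1.
    """
    current_len = longest_repeated_run(current_wikitext)
    parent_len = longest_repeated_run(parent_wikitext)
    return current_len if current_len > parent_len else 1
```

For example, with a parent wikitext of "hello" (longest run 2, from "ll") and a current wikitext of "hello aaaaaaaaaaa" (longest run 11), the sketch returns 11. If the two revisions are identical, it returns 1.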