We need to add the //version.diff.meta.longest_new_repeated_char// credibility signal so that consumers can see how the longest run of repeated characters in the current revision compares with that of its parent revision. This is meant to detect button mashing.
Steps:
- //Step 1:// First, update the version schema to include `diff` (if not already done). Under `diff`, include a `meta` object; this requires introducing a `meta.go` schema. For now, the `meta` schema has only one element, `longest_new_repeated_char`.
```
{
  "identifier": 1063955750,
  .
  .
  "diff": {
    "meta": {
      "longest_new_repeated_char": 11 // int
      .
      .
      .
    },
    .
    .
  }
}
```
- //Step 2:// To compute the longest new repeated character, we first need the wikitext for the current revision ID and its parent revision ID. This can be fetched with an API call like the one below (try it out in Postman).
```
https://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Green_Island_(Rideau_River)&rvlimit=2&formatversion=2&format=json&rvprop=content|ids|timestamp
```
Here, `rvlimit=2` returns the two most recent revisions, including their content. `ids` and `timestamp` under `rvprop` are included only for illustration; `content` alone is enough for this signal.
To get this API result, you will have to enhance the [[ https://github.com/protsack-stephan/mediawiki-api-client/blob/master/page_revisions.go#L5 | revision utility ]] so that it can return `content`. (This enhancement may already exist as part of [[ https://phabricator.wikimedia.org/T298669 | this ticket ]].)
Once that is done, you will be able to get the revision data with the latest two contents in the handler (refer to the articleupdate handler).
- //Step 3:// Now update the version in the article update handler, using the builder pattern. Introduce a `diffbuilder` (if not already there) and a `metabuilder`. Using the wikitext contents of the two revisions, compute `longest_new_repeated_char` as follows:
# Get the longest repeated char length for each wikitext using [[ https://github.com/wikimedia/revscoring/blob/275302c7b103513b51cf63b89e81ea051fba4786/revscoring/features/wikitext/features/chars.py#L230 | this logic ]]
# Compare the two lengths. `longest_new_repeated_char` is the length for the current revision if it is larger than the length for the parent revision; otherwise it is 1. ([[ https://github.com/wikimedia/editquality/blob/1a4ba8333b7aaa9d5ac67e312b8077827e54bf46/editquality/feature_lists/wikitext.py#L5 | code ]])