Suggestion: API for fetching lint errors for a specific revision
Closed, Resolved · Public

Description

Copy/pasting from https://www.mediawiki.org/wiki/Topic:Tp60an0dvayt5vhu :

Use cases:
User - interested in finding out on average how many lint errors were added to revisions of a specific article (perhaps because there is complicated markup there).
Researcher - interested in looking at the burden (e.g. cleanup efforts) that new or older editors cause other editors.
Tool / script developer - interested in finding out in which revision an error was introduced to revert or to identify the culprit.
Use in extensions - for example, recentchanges could theoretically flag every revision that contains a lint error.
Background
Even in its current state, the extension makes a lot of analysis possible on existing data. In addition to the use cases presented above, one could for example look into historical data, e.g. run extra analysis on the Research:VisualEditor%27s_effect_on_newly_registered_editors/June_2013_study dataset to compare the number of errors introduced on page creation by VisualEditor editors vs. wikitext editors, or to look at wikitext editor errors alone.
This may also be used in the ORES tool (which works on revisions) by giving extra information that can be used to help identify possible revisions containing vandalism (vandals might generally not know wikitext markup).
One possibility would be for an individual to look at their own contributions, and evaluate whether there are patterns of incorrect markup they leave behind that they can improve on. This could also be used by editors to either see if a newbie needs help, or to identify a possible vandal.
Proposed solution
A new API endpoint, e.g.:

api.php?action=query&revids=478198|54872|54894545&prop=linterrors&leprop=count|type|...

Unlike fetching lint errors for arbitrary text (which is useful by itself), this allows for much more flexibility and analysis, without using database dumps or complex scripts.
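
As a rough illustration of how a client might build the query proposed above, here is a minimal Python sketch. Note that `prop=linterrors` and `leprop` are only the proposal from this task, not an existing API module; the helper name `build_linterrors_query` and the base URL are assumptions made for the example, and the elided `...` in `leprop` is left out rather than guessed.

```python
from urllib.parse import urlencode


def build_linterrors_query(revids, leprop=("count", "type"),
                           base="https://en.wikipedia.org/w/api.php"):
    """Build the URL for the *proposed* prop=linterrors query.

    Hypothetical helper: the linterrors module and its leprop
    parameter are only a suggestion in this task.
    """
    params = {
        "action": "query",
        # Multiple values are pipe-separated, as in the proposal.
        "revids": "|".join(str(r) for r in revids),
        "prop": "linterrors",
        "leprop": "|".join(leprop),
    }
    # Keep "|" unescaped so the URL reads like the example above.
    return f"{base}?{urlencode(params, safe='|')}"


url = build_linterrors_query([478198, 54872, 54894545])
print(url)
```

The function only assembles the URL and sends nothing, so it can be tried without any such endpoint existing.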

Details

Related Gerrit Patches:
[mediawiki/services/parsoid@master] Add an API endpoint to get lint errors for wikitext

Event Timeline

Elitre created this task. · Apr 27 2017, 3:43 PM
Restricted Application added a subscriber: Aklapper. · Apr 27 2017, 3:43 PM
Elitre updated the task description. · Apr 27 2017, 3:43 PM

Subbu's comment: "The specific form of this is a bit harder to support since linter is backed by parsoid right now. So, may be a separate endpoint similar to T163091 ... but this will be a lower priority one."

Change 352715 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] Add an API endpoint to get lint errors for wikitext

https://gerrit.wikimedia.org/r/352715

Change 352715 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Add an API endpoint to get lint errors for wikitext

https://gerrit.wikimedia.org/r/352715

ssastry closed this task as Resolved. · May 15 2017, 10:22 PM
ssastry assigned this task to Arlolra.
ssastry triaged this task as Normal priority.
ssastry removed a project: Patch-For-Review.
ssastry reopened this task as Open. · May 15 2017, 10:25 PM
ssastry added a subscriber: ssastry.

Actually, this requires a RESTBase side fix before this can be accessed by clients.

Arlolra closed this task as Resolved. · May 22 2017, 1:58 PM

The endpoint is exposed:
https://en.wikipedia.org/api/rest_v1/#!/Transforms/post_transform_wikitext_to_lint_title_revision

It doesn't, since wikitext is a required parameter.

Arlolra reopened this task as Open. · May 22 2017, 2:03 PM

Whoops :/

Arlolra removed Arlolra as the assignee of this task. · May 22 2017, 2:03 PM
Arlolra added a subscriber: Arlolra.

Services should decline if they don't want to expose this.

Pchelolo edited projects, added Services (doing); removed Services. · Jun 27 2017, 5:50 PM
Pchelolo added a subscriber: Pchelolo.

Parsoid accepts just title/revision in its wikitext/to/lint API, so all we need to do is make wikitext an optional parameter as well, and check that either wikitext or a title is provided. Easy.
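
The "either wikitext or a title" check described above could look roughly like the following Python sketch. This is not the RESTBase or Parsoid implementation; the function name `pick_lint_input` and its return values are hypothetical and serve only to illustrate the rule.

```python
from typing import Optional


def pick_lint_input(wikitext: Optional[str] = None,
                    title: Optional[str] = None) -> str:
    """Decide which input a lint request would be based on.

    Hypothetical helper illustrating the rule from the comment:
    wikitext is optional, but then a title must be given.
    """
    if wikitext is not None:
        # Explicit wikitext was supplied, so lint that text.
        return "wikitext"
    if title:
        # No wikitext: fall back to the stored page content.
        return "title"
    raise ValueError("either wikitext or a title must be provided")
```

With this rule, `pick_lint_input(title="Main_Page")` is accepted, while a call with neither argument is rejected.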

Filed a subtask for Parsoid to figure out correct redirects in case only the title is provided but not the revision. This task is blocked on that subtask for the time being.

If I understand correctly, the idea would be to fetch the list of lint errors for a specific page/revision combo? If so, I don't think making the wikitext parameter optional is a good idea, as that would imply making POST requests with an empty body. A GET endpoint would be much more appropriate for this, IMHO. It should be easy to expose it as /page/linterrors/{title}{/revision} (if revision is not supplied, the latest revision is assumed).
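
To make the suggested route template concrete, here is a small Python sketch that expands /page/linterrors/{title}{/revision} into a request path. The route is only the suggestion from the comment above, not a confirmed endpoint; the helper name `linterrors_path` is hypothetical, and replacing spaces with underscores and percent-encoding the title are assumptions based on the usual MediaWiki title conventions.

```python
from typing import Optional
from urllib.parse import quote


def linterrors_path(title: str, revision: Optional[int] = None) -> str:
    """Expand the suggested /page/linterrors/{title}{/revision} template.

    Hypothetical helper: the route is only a proposal in this task.
    """
    # Assumption: titles use underscores for spaces and are fully
    # percent-encoded, so a "/" in a title is not read as a separator.
    path = "/page/linterrors/" + quote(title.replace(" ", "_"), safe="")
    if revision is not None:
        # The revision segment is optional; omitting it would mean
        # "latest revision" under the proposal.
        path += f"/{revision}"
    return path


print(linterrors_path("Main Page"))
print(linterrors_path("AC/DC", 478198))
```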

@mobrovac I've proposed that option to match the Parsoid API, but providing a GET endpoint works too.

GWicke added a subscriber: GWicke. · Edited Jun 27 2017, 6:41 PM

Would it make sense to return lint errors as part of the pagebundle response from Parsoid? Also, is Parsoid handling storage for lint errors?

Unless you want to store lint errors for all revisions, sending lint errors as part of pagebundle isn't necessary. All lint errors for the active revision are stored in a MySQL database as part of the Linter extension.

Pchelolo moved this task from doing to blocked on the Services board. · Jul 11 2017, 8:20 PM
Pchelolo edited projects, added Services (blocked); removed Services (doing).

OK, since the Parsoid issue was fixed, I can do a quick RESTBase patch for this. Do we still need it?

Yes, it would be useful, especially for https://www.mediawiki.org/wiki/Topic:U99ywo68gg6ufgmg

Pchelolo closed this task as Resolved. · Mar 15 2018, 4:46 PM
Pchelolo claimed this task.
Pchelolo edited projects, added Services (done); removed Services (doing).