In previous research (T305180) we showed that some tools (such as LanguageTool) can surface copyedits; however, automatic evaluation in the context of Wikipedia is difficult due to the lack of ground-truth data.
In this task, we want to generate a short list of copyedits with one of the previously discussed methods (LanguageTool, spellcheckers, etc.) for manual evaluation, in order to assess whether the suggested copyedits are useful (i.e. correspond to genuine copyediting errors). Ideally, the manual evaluation would label each suggested copyedit as good/bad, from which we will calculate the fraction of good suggestions (precision).
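As a minimal sketch of the evaluation metric, assuming the manual labels are stored as a list of booleans (True = good suggestion), precision is just the fraction of good labels:

```python
def precision(labels):
    """Fraction of suggested copyedits labeled as good (True)."""
    if not labels:
        return 0.0
    return sum(labels) / len(labels)

# hypothetical labels from a manual evaluation round
labels = [True, True, False, True]
print(precision(labels))  # 0.75
```

Note that this only measures precision of the suggestions; recall cannot be estimated from this list alone, since we have no ground truth for errors the tool missed.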
Specifically, we want:
- each copyedit in the list should contain: wiki_db, page_title, the sentence of the text containing the error, the word/substring that contains the error, and (if possible) a suggestion for improvement
- the list of copyedits should be short enough to be manageable by a human; this means copyedits for no more than 100 articles (probably fewer)
- start with copyedits for one or all of the four pilot wikis: ar, bn, cs, es
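To make the list format concrete, here is one possible sketch of a record holding the fields above; the class name and sample values are hypothetical, not a fixed schema:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class CopyeditSuggestion:
    wiki_db: str                      # e.g. "eswiki" (one of the pilot wikis)
    page_title: str
    sentence: str                     # sentence of the text containing the error
    substring: str                    # word/substring that contains the error
    suggestion: Optional[str] = None  # suggested improvement, if available

# hypothetical example record for the Spanish pilot wiki
example = CopyeditSuggestion(
    wiki_db="eswiki",
    page_title="Ejemplo",
    sentence="Esta es una oracion de ejemplo.",
    substring="oracion",
    suggestion="oración",
)
print(asdict(example)["substring"])  # oracion
```

Serializing each record with `asdict` makes it straightforward to export the list as a spreadsheet or JSON for the human evaluators.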