In T293035 we identified LanguageTool as a possible candidate to surface copyedits across several languages. Inspecting its copyedits for Wikipedia articles, we found a potential problem: LanguageTool might raise many copyedits that are false positives (see the example described in T284550#7802765). We therefore want to evaluate the copyedits from LanguageTool quantitatively. Specifically, we want to measure the rate of false positives and how different filters might reduce it.
- Evaluate LanguageTool on a benchmark dataset in English, e.g., from Grammatical Error Correction (GEC)
- If possible, evaluate LanguageTool on a benchmark dataset in a language other than English
- Find or create a ground-truth dataset of copyedits in Wikipedia articles in at least one language (e.g., through hand-labeling)
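As a minimal sketch of the evaluation above: once LanguageTool's flagged spans and a benchmark's gold-standard edit spans are both available as character offsets, precision (and thus the false-positive rate) can be computed by span matching. All names and the toy data below are illustrative assumptions, not part of the task; a real evaluation would use a GEC benchmark's annotations and, e.g., exact or overlap-based span matching.

```python
# Hypothetical sketch: score LanguageTool's flagged copyedits against
# gold-standard edit spans from a benchmark. Spans are (start, end)
# character offsets; here a prediction counts as a true positive only
# if it exactly matches a gold span (a simplifying assumption).

def precision_recall(predicted, gold):
    """Return (precision, recall) of predicted spans against gold spans."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # exact-match true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Toy example: LanguageTool flagged three spans; the benchmark annotates two.
flagged = [(0, 4), (10, 15), (20, 24)]
gold = [(0, 4), (20, 24)]

p, r = precision_recall(flagged, gold)
fp_rate = 1 - p  # share of flagged copyedits that are false positives
```

A filter (e.g., dropping matches from noisy LanguageTool rule IDs) can then be evaluated by recomputing precision and recall on the filtered span list and checking whether the false-positive rate drops without losing too much recall.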