Perform a quantitative evaluation of the list-building tools (see API [1]) developed in T266768. The tools are intended to define custom (ad-hoc) topics, in contrast to the pre-defined ORES topics. The open question is how well these approaches work. One relevant task is automatically generating a list of articles that belong to a given topic, such as climate change. We use WikiProject labels as a ground-truth dataset for different (arbitrary) topics. Starting from one or more suitable input articles of a given WikiProject, we compare the output of the list-building tools with the articles contained in that WikiProject.
- Generate a curated dataset of WikiProjects and the articles they contain (overlaps with T238437)
- Identify input articles that characterize the corresponding WikiProject
- Query the different list-building tools and quantify the overlap with the ground truth
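
As a rough illustration of the last step, the overlap between a tool's output and the ground truth could be quantified with standard set-based metrics. This is a minimal sketch, not the evaluation actually used: the function names `overlap_metrics` and `precision_at_k`, the choice of metrics, and the example article titles are all assumptions made for illustration, and no call to the list-building API is shown.

```python
from typing import Dict, Iterable, Sequence


def overlap_metrics(predicted: Iterable[str], ground_truth: Iterable[str]) -> Dict[str, float]:
    """Set-based overlap between a tool's output and a WikiProject's articles.

    Hypothetical helper: treats both inputs as unordered sets of article titles.
    """
    pred = set(predicted)
    truth = set(ground_truth)
    hits = pred & truth
    union = pred | truth

    precision = len(hits) / len(pred) if pred else 0.0
    recall = len(hits) / len(truth) if truth else 0.0
    # F1 is the harmonic mean of precision and recall; defined as 0 when there are no hits.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    jaccard = len(hits) / len(union) if union else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "jaccard": jaccard}


def precision_at_k(ranked: Sequence[str], ground_truth: Iterable[str], k: int) -> float:
    """Fraction of the top-k ranked results that appear in the ground truth.

    Assumes the tool returns its results in ranked order (an assumption here).
    """
    truth = set(ground_truth)
    top_k = list(ranked)[:k]
    if not top_k:
        return 0.0
    return sum(1 for title in top_k if title in truth) / len(top_k)


# Made-up example data, for illustration only.
tool_output = ["Climate change", "Greenhouse gas", "Football"]
wikiproject_articles = {"Climate change", "Greenhouse gas", "Sea level rise", "Carbon tax"}

metrics = overlap_metrics(tool_output, wikiproject_articles)
p_at_2 = precision_at_k(tool_output, wikiproject_articles, k=2)
```

Precision and recall are reported separately because WikiProject membership is likely incomplete as a ground truth: an article returned by a tool but missing from the WikiProject is not necessarily off-topic, so low precision should be interpreted with care.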