
Create labeled data for topic models in ar, cs, kowiki
Closed, Resolved · Public

Event Timeline

Halfak added a subscriber: Isaac.

In T236713: Improve drafttopic training data pipeline, @Isaac has extracted a full dataset of cross-wiki labels as mapped by Wikidata. We can use this to draw stratified samples and build labeled datasets for training the topic models.
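
As a rough illustration of what a stratified sample could look like here, the sketch below draws up to a fixed number of rows per topic label from a list of labeled rows. It is not the pipeline from T236713: the row fields (`wiki`, `page_id`, `topic`), the topic names, and the `stratified_sample` helper are all hypothetical, and it assumes one label per row for simplicity.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(rows, label_key, per_label, seed=0):
    """Draw up to `per_label` rows for each distinct value of `label_key`.

    Hypothetical helper for illustration; assumes each row is a dict
    carrying exactly one label under `label_key`.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    # Group rows by their label (the strata).
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Sample each stratum independently, capped at the stratum's size.
    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        k = min(per_label, len(group))
        sample.extend(rng.sample(group, k))
    return sample


# Toy rows with made-up field names and topic labels (assumptions).
rows = (
    [{"wiki": "cswiki", "page_id": i, "topic": "Culture.Music"} for i in range(5)]
    + [{"wiki": "cswiki", "page_id": 10 + i, "topic": "STEM.Physics"} for i in range(2)]
    + [{"wiki": "cswiki", "page_id": 20, "topic": "Geography.Europe"}]
)

sample = stratified_sample(rows, label_key="topic", per_label=2)
counts = Counter(row["topic"] for row in sample)
```

Capping each label at the same size keeps frequent topics from dominating the sample, while rare topics contribute whatever rows they have. The same helper could be run once per wiki (ar, cs, ko) by filtering `rows` on the `wiki` field first.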