Language-Agnostic Topic Modeling
Closed, Resolved (Public)

Description

Overview

Parent task for language-agnostic topic modeling work being undertaken by the Research team.
Goal: expand our ability to organize and recommend Wikipedia content by high-level topics, specifically by building approaches that are not language-specific and thus should work well for all wikis without additional fine-tuning. Initial work has focused on mapping articles to a pre-defined taxonomy of topics, but later work will expand these techniques to also work with ad-hoc, user-defined topics.

Resources

Meta: https://meta.wikimedia.org/wiki/Research:Language-Agnostic_Topic_Classification
Overview: https://docs.google.com/presentation/d/1YRk2lh0Fe2DE_knj7IoBzIXephMhiJcdjE1sF4lak00/edit?usp=sharing

Event Timeline

I think it is reasonable to close this one out, as the core use case is complete (language-agnostic topic classification that is used by Growth to filter article recommendations). There are still things that I want to do in this space, but I can either open new tasks or put them under other umbrellas (like the list-building work).