In T234272: Newcomer tasks: evaluate topic matching prototypes, ambassadors evaluated several different methods for topic matching: morelike, ORES, and free-text. From those results, we decided to prefer ORES, but to rebuild the ORES models so that they perform better, have a more detailed ontology, and have full coverage in non-English languages.
The new models are ready, and we want to evaluate them as well. We are evaluating two models:
- The "crosswalk model": this model is built and scored on English Wikipedia. Its scores are then applied to local-language articles that also exist on English Wikipedia.
- The "local model": this is the model that is first built in English Wikipedia, then rebuilt in the local languages to ensure full coverage of all articles.
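The coverage difference between the two models can be sketched as follows. This is only an illustration: the article titles, the topic assignments, and the title mapping are made-up placeholders, and the function names are hypothetical rather than part of ORES or the prototype.

```python
from typing import Dict, List, Optional

# Hypothetical enwiki topic predictions, keyed by English article title.
ENWIKI_TOPICS: Dict[str, List[str]] = {
    "Photosynthesis": ["STEM.Biology"],
    "Mount Fuji": ["Geography.Regions.Asia.East Asia"],
}

# Hypothetical mapping from a local-wiki title to its English counterpart.
# None means the article exists only on the local wiki.
LOCAL_TO_ENWIKI: Dict[str, Optional[str]] = {
    "local article A": "Photosynthesis",
    "local article B": "Mount Fuji",
    "local article C": None,
}


def crosswalk_topics(
    local_to_enwiki: Dict[str, Optional[str]],
    enwiki_topics: Dict[str, List[str]],
) -> Dict[str, Optional[List[str]]]:
    """Copy enwiki topics onto local articles that have an English version."""
    result: Dict[str, Optional[List[str]]] = {}
    for local_title, en_title in local_to_enwiki.items():
        if en_title is None:
            # No English counterpart, so the crosswalk has nothing to copy.
            result[local_title] = None
        else:
            result[local_title] = enwiki_topics.get(en_title)
    return result


def coverage(topics_by_article: Dict[str, Optional[List[str]]]) -> float:
    """Fraction of articles that ended up with at least one topic."""
    if not topics_by_article:
        return 0.0
    covered = sum(1 for topics in topics_by_article.values() if topics)
    return covered / len(topics_by_article)


crosswalked = crosswalk_topics(LOCAL_TO_ENWIKI, ENWIKI_TOPICS)
print(crosswalked)
print(coverage(crosswalked))
```

In this sketch the article without an English version gets no topics from the crosswalk. That is the gap the local model is meant to close by scoring every local article directly.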
There are some important details about how these models' ontology works:
- There are now 64 topics, which is 25 more than before. This provides more detail.
- "STEM" refers to "Science, Technology, Engineering, Math".
- There are topics that have a *, such as STEM.STEM*. These are the "catch-all" topics. For instance, STEM.STEM* should list all kinds of science articles, and Geography.Regions.Asia.Asia* should list all kinds of Asia articles, regardless of region.
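One way to read the catch-all naming is sketched below. It assumes, based only on the two examples above, that a catch-all topic ends in `*` and covers every topic that shares the path before its last segment. The helper names are hypothetical, and the non-catch-all topic names in the usage lines are illustrative.

```python
def is_catch_all(topic: str) -> bool:
    """A catch-all topic is marked with a trailing asterisk."""
    return topic.endswith("*")


def catch_all_prefix(topic: str) -> str:
    """Return the path a catch-all topic stands for.

    "STEM.STEM*" stands for "STEM";
    "Geography.Regions.Asia.Asia*" stands for "Geography.Regions.Asia".
    """
    if not is_catch_all(topic):
        raise ValueError(f"not a catch-all topic: {topic!r}")
    # Drop the asterisk, then drop the repeated last path segment.
    return topic[:-1].rsplit(".", 1)[0]


def covers(selected: str, article_topic: str) -> bool:
    """Check whether the selected topic should list an article's topic."""
    if not is_catch_all(selected):
        return selected == article_topic
    prefix = catch_all_prefix(selected)
    return article_topic == selected or article_topic.startswith(prefix + ".")


print(covers("STEM.STEM*", "STEM.Physics"))
print(covers("Geography.Regions.Asia.Asia*", "Geography.Regions.Asia.East Asia"))
print(covers("STEM.STEM*", "Culture.Sports"))
```

Under this reading, a search on a catch-all topic that returns articles from outside its prefix would be worth noting as a problem.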
Here is how to run the evaluation:
- Go to the "ORES (2020)" tab in this spreadsheet.
- Open up the new prototype here.
- Choose your language.
- For each topic in the dropdown, select some of the task types and run two searches:
  - Crosswalk: run one search with the second switch turned on -- the one that says "Only get tasks with topics returned from enwiki ORES, ignore local wiki ORES models."
  - Local: run one search with the second switch turned off.
  - Do not use the first switch at all, the one that says, "Only return article when topic is top-ranked match from ORES (does not do anything if no topics are selected)." That should be left off.
- After each of the two searches, look at the first ten articles that are returned, and count how many are good matches for the selected topic.
  - Record the crosswalk score (switch turned on) in the "crosswalk" column of the spreadsheet.
  - Record the local score (switch turned off) in the "local" column of the spreadsheet.
- If you notice any problems or weird patterns with the models that should be investigated or fixed, please add those as a comment in the spreadsheet. For instance, you may notice that a topic about science is showing a lot of articles about sports, or that a topic doesn't have any articles.
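Once the spreadsheet is filled in, the two columns can be compared with a small tally like the one below. This is a minimal sketch: the `TopicResult` type, the `summarize` function, and the example counts are hypothetical, and it assumes each score is the number of good matches among the first ten articles.

```python
from dataclasses import dataclass
from typing import Dict, List

SAMPLE_SIZE = 10  # articles inspected per search, as described above


@dataclass
class TopicResult:
    topic: str
    crosswalk: int  # good matches out of ten, second switch turned on
    local: int      # good matches out of ten, second switch turned off


def summarize(results: List[TopicResult]) -> Dict[str, float]:
    """Return the overall share of good matches for each model."""
    if not results:
        raise ValueError("no results to summarize")
    for r in results:
        for score in (r.crosswalk, r.local):
            if not 0 <= score <= SAMPLE_SIZE:
                raise ValueError(f"score out of range for {r.topic}: {score}")
    total = len(results) * SAMPLE_SIZE
    return {
        "crosswalk": sum(r.crosswalk for r in results) / total,
        "local": sum(r.local for r in results) / total,
    }


# Made-up counts, for illustration only.
example = [
    TopicResult("STEM.STEM*", crosswalk=8, local=7),
    TopicResult("Geography.Regions.Asia.Asia*", crosswalk=6, local=9),
]
summary = summarize(example)
print(summary)
```

A topic that returns fewer than ten articles would need separate handling, since dividing by ten would understate its match rate. Noting such topics in the spreadsheet comments keeps them visible.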