Explore ways to restrict suggestions to a given knowledge area
Content translation provides users with suggestions of articles to translate. Based on user feedback, it would be useful to give users some general control over the knowledge area these suggestions cover (T113257).

Currently, the suggestion system provides either general suggestions or suggestions based on a given "seed article" used as an example to get similar suggestions.

As an initial step, this ticket explores possible answers to the following questions:

  • How to represent the wide knowledge areas (e.g., science, philosophy...)? Candidates include the vital article structure, categories, the ORES topic model, Wikidata, etc.
  • How to use the current recommendation system, or expand it, to get suggestions on a selected area? (e.g., pick a random article from a category and use it as the seed)
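
To make the "random article from a category as seed" option concrete, here is a minimal sketch, not a proposal for the actual implementation. It assumes category members are fetched through the MediaWiki Action API (`list=categorymembers`) and that the canonical `Category:` namespace prefix is used. The helper names `category_members_url` and `pick_seed` are hypothetical, and the HTTP request itself is left out.

```python
import random
from urllib.parse import urlencode


def category_members_url(wiki_host, category, limit=50):
    """Build an Action API URL listing main-namespace pages in a category.

    `wiki_host` (e.g. "en.wikipedia.org") and `category` are caller-supplied;
    this helper is illustrative only.
    """
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmnamespace": 0,   # articles only
        "cmtype": "page",   # skip subcategories and files
        "cmlimit": limit,
        "format": "json",
    }
    return f"https://{wiki_host}/w/api.php?{urlencode(params)}"


def pick_seed(api_response, rng=random):
    """Return one random article title from a categorymembers response.

    Returns None when the category has no article members.
    """
    members = api_response.get("query", {}).get("categorymembers", [])
    if not members:
        return None
    return rng.choice(members)["title"]
```

The chosen title would then be passed to the existing seed-based suggestions in place of a user-provided seed article. A single random seed may be unrepresentative for very broad categories, so this is best read as a starting point for experiments.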

This ticket is focused on the technical exploration.
The mockups below are just illustrations of how this could be supported in Content translation:

  • Structured set of categories for the user to pick
  • Search for specific categories (with a predefined initial set)

Related and very relevant exploration and experiments from the Growth team: T231506#5487829

diego added a comment. Jul 29 2019, 4:39 PM

About this:

How to represent the wide knowledge areas (e.g., science, philosophy...)?

I've been pushing to start a project on a cross-lingual topic model that would allow us to make topic comparisons across languages. Unfortunately, this has not been prioritized for the current fiscal year, even though we are receiving this requirement from multiple teams within the WMF.

However, I'll be happy to share some ideas on how to tackle this problem, although I won't have enough time to implement and test these solutions.

Another idea worth considering is for users to search/browse for their topic area using Categories. The system can then pick 5 articles from that category, use them as seeds to get recommendations, remove duplicates, and show the results as suggestions.
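
The multi-seed idea could look roughly like the sketch below. It is only an illustration under stated assumptions: `recommend` is a hypothetical callable standing in for the existing seed-based recommendation call (taking one seed title and returning a list of suggested titles), and `category_articles` is assumed to be a list of article titles already fetched for the chosen category.

```python
import random


def suggestions_for_category(category_articles, recommend, n_seeds=5, rng=random):
    """Merge seed-based recommendations for a few random articles of a category.

    category_articles: list of article titles in the chosen category.
    recommend: hypothetical callable, seed title -> list of suggested titles.
    n_seeds: how many articles to sample as seeds (5 in the comment above).
    """
    # Sample without replacement; small categories simply yield fewer seeds.
    seeds = rng.sample(category_articles, min(n_seeds, len(category_articles)))

    seen = set()
    merged = []
    for seed in seeds:
        for title in recommend(seed):
            # Keep the first occurrence of each suggestion, preserving order.
            if title not in seen:
                seen.add(title)
                merged.append(title)
    return merged
```

Because the seeds are sampled at random, repeated calls for the same category would return different suggestion sets, which may or may not be desirable for the user experience.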

The Growth team is investigating ways to approach this very challenge in T231506.

