
Newcomer tasks: update topic task suggestion backend to handle multiple topic search methods
Closed, Resolved (Public)

Description

Soon we'll switch to ORES-based topic filtering. In itself that is not much of a change (just a slight difference in configuration format and in what search string to generate), but we'll probably want to keep the ability to use either search method instead of just replacing the morelike search code with the ORES one. That will allow us to roll out gradually across wikis, keep morelike as a fallback, run an A/B test, etc. So we'll probably want some sort of "search strategy" abstraction in the backend.
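A minimal sketch of what such a "search strategy" abstraction could look like, written in Python for illustration only (the extension itself is not written in Python). All class, field, and config key names here are hypothetical, and the `articletopic:` keyword used for the ORES case is an assumption about the eventual search syntax, not something this task specifies.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """A topic offered to the user (hypothetical shape)."""
    topic_id: str
    morelike_titles: tuple = ()  # reference articles for morelike search
    ores_topics: tuple = ()      # ORES topic names for ORES-based search


class SearchStrategy(ABC):
    """Turns a topic into the search-string fragment for one search method."""

    @abstractmethod
    def topic_search_term(self, topic: Topic) -> str:
        ...


class MorelikeStrategy(SearchStrategy):
    def topic_search_term(self, topic: Topic) -> str:
        # CirrusSearch "morelike:" takes a pipe-separated list of page titles.
        return "morelike:" + "|".join(topic.morelike_titles)


class OresStrategy(SearchStrategy):
    def topic_search_term(self, topic: Topic) -> str:
        # Assumed keyword; the real ORES search syntax may differ.
        return "articletopic:" + "|".join(topic.ores_topics)


def pick_strategy(wiki_config: dict) -> SearchStrategy:
    """Choose a strategy per wiki, falling back to morelike by default."""
    strategies = {"morelike": MorelikeStrategy(), "ores": OresStrategy()}
    name = wiki_config.get("topic_search", "morelike")
    return strategies.get(name, strategies["morelike"])
```

Keeping the choice behind a single per-wiki setting is what would make a gradual rollout or an A/B test cheap: the rest of the backend only ever asks the strategy for a search term.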

Event Timeline

@MMiller_WMF one relevant question is, will we end up with the same topic list we use currently? If we do, we can just reuse the same on-wiki configuration pages and add the ORES config as another field next to the morelike config. If it's going to be a different set of topics, we'll probably want to use a new configuration page, and then we'll need some changes to the configuration loading logic too.

@Tgr -- we will have different topic lists than the ones we are using for morelike. The ORES models are built with a different ontology that we think is better than the old morelike one. Two points to think about as you work on this:

  • The scores from the new ontology will need to be combined or rolled up in certain ways that we are still determining. For instance, for an article to be "Science" in the UI, it has to have a high score for "Chemistry", "Physics", or "Biology".
  • The ontology will likely evolve in the future. This won't be frequent, but we expect it to happen. Topics may get reorganized, added, or subtracted. Will we be able to handle that gracefully?

I created T244192: Newcomer tasks: ORES ontology mapping and score thresholds to sort out exactly how we will roll up and use the ontology. To what extent are you blocked on the decisions in that task?
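As a rough illustration of the roll-up in the "Science" example above, here is a short Python sketch. The mapping structure, the function name, and the threshold value are all assumptions made for the example; the actual combination rules and thresholds are what T244192 is meant to decide.

```python
# Hypothetical mapping from a UI topic to the ORES topics it rolls up.
ROLLUP = {
    "Science": ["Chemistry", "Physics", "Biology"],
}

# Placeholder threshold; the real values are still being decided in T244192.
THRESHOLD = 0.5


def ui_topics(ores_scores, rollup=ROLLUP, threshold=THRESHOLD):
    """Return UI topics for which at least one member ORES topic scores high enough."""
    return sorted(
        ui_topic
        for ui_topic, members in rollup.items()
        if any(ores_scores.get(member, 0.0) >= threshold for member in members)
    )
```

With this "any member above the threshold" rule, an article scoring high only on "Physics" would still count as "Science". Other rules (a maximum, an average, per-topic thresholds) would fit the same shape.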

Will we be able to handle that gracefully?

The straightforward approach would be to define the roll-up in the per-wiki JSON config page, so all of those pages would have to be edited when such a change happens. Does that still fit into "gracefully"? Or do we want a single cross-wiki location for defining which ORES topics combine into a given suggested edit topic?
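To make the two options concrete, here is a small Python sketch in which a per-wiki JSON page defines the roll-up and an optional shared definition fills in anything the wiki does not override. The JSON layout, the `ores_topics` key, and the topic names are hypothetical; the real config format had not been decided at this point.

```python
import json

# Hypothetical shared (cross-wiki) roll-up definition.
SHARED_ROLLUP_JSON = """
{
  "science": {"ores_topics": ["Chemistry", "Physics", "Biology"]}
}
"""

# Hypothetical per-wiki config page that defines its own roll-up.
WIKI_CONFIG_JSON = """
{
  "science": {"ores_topics": ["Chemistry", "Physics"]},
  "art": {"ores_topics": ["Visual arts"]}
}
"""


def load_rollup(wiki_json, shared_json=None):
    """Merge topic definitions; per-wiki entries win over shared ones."""
    rollup = json.loads(shared_json) if shared_json else {}
    rollup.update(json.loads(wiki_json))
    return {topic: entry["ores_topics"] for topic, entry in rollup.items()}
```

With only per-wiki pages, an ontology change means editing every page. With a shared definition, most wikis would pick up the change from one place, at the cost of a slightly more involved loading step.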

To what extent are you blocked on the decisions in that task?

The question in T244192#5858481 affects configuration file format and search code a little bit, but it's a relatively trivial change that can be done separately. So not really blocked.

Change 571403 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] [WIP] Refactor task suggestion backend to support multiple strategies

https://gerrit.wikimedia.org/r/571403

Change 571624 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Remove state from PageLoader

https://gerrit.wikimedia.org/r/571624

In hindsight the code changes here were probably not strictly necessary; they did make the code nicer though.

Will still need another patch for the configuration loading changes once we have the config format figured out.

Change 571403 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Refactor task suggestion backend to support multiple strategies

https://gerrit.wikimedia.org/r/571403

Change 571624 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Remove state from PageLoader

https://gerrit.wikimedia.org/r/571624

Change 572135 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] [WIP] Add backend support for ORES topics

https://gerrit.wikimedia.org/r/572135

Now ready for review for reals. Sorry about the half-baked earlier version.

Change 572135 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Add backend support for ORES topics

https://gerrit.wikimedia.org/r/572135

The commit summary of the last patch says "Sorting the topics (including making use of the 'groups' field) is left to another patch." which I then forgot about... Filed now as T246061: Newcomer tasks: Sort topics alphabetically.