Add a link engineering: backend product specifications
Closed, Resolved (Public)

Description

This task describes the product specifications that span both the link recommendation service and how it is accessed and used in MediaWiki. Specifications for the frontend user experience are tasked separately. The various subtasks of this epic should implement the functionality described here.

The original source of these specifications is this section of the structured tasks planning document. As of the creation of this task, however, the task is the current source of the specifications, and the document is deprecated.


Specification: populating the queue (details below).
Implementation: T261408: Add a link engineering: Maintenance script for retrieving, caching, and updating search index

  • There should always be at least 500 link recommendation articles in the queue for each of the 64 article topics on each wiki where the feature is implemented. Though the UX displays only 39 topics, which are agglomerations of the original 64, having 500 for each of the 64 topics may give us more flexibility in the future and on other platforms.
  • We should not recommend red links. This could happen if a link target article is deleted and the recommendation has not been refreshed since.
  • There is a set of parameters we want to use to tune which articles make it into the queue:
| Attribute | Initial setting |
| --- | --- |
| Suggested links quantity and quality | Must have at least 4 suggestions per article over X probability score (X should be configurable per wiki). We will only display a maximum of 10 suggestions per article. |
| Existing links | We may want to filter out articles that are already populated with many links. |
| Protection status | Exclude articles with any protection |
| Categories to include/exclude | None |
| Templates to include/exclude | None |
| Time since last edit | 1 day |
| Time since last suggested links edits | Do not include if article's previous edit was link recommendations edit or if previous edit was a revert of a link recommendations edit [1] |
| Article word count (max/min) | None |

[1] We want to avoid these two scenarios:

  • User A gets 10 link suggestions on a long article. Adds most of them. Then the article goes back in the queue, gets its suggestions regenerated, and User B chooses it from the queue. User B adds 10 more suggestions. If that keeps happening, the article could get overlinked.
  • User A gets 10 link suggestions and adds them all. Then they get reverted. Then the article’s suggestions are regenerated and it goes back in the queue, where User B adds those same 10 suggestions again. Then it gets reverted again.
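
The queue rules above can be sketched as a single eligibility check. This is only an illustration, not the implementation from T261408: the class and field names are hypothetical, the 0.5 threshold is a placeholder for the per-wiki "X", and "Time since last edit: 1 day" is assumed to mean that the article has not been edited within the last day.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class QueueSettings:
    """Hypothetical per-wiki tuning parameters (names are illustrative)."""
    min_suggestions: int = 4      # "at least 4 suggestions per article"
    min_score: float = 0.5        # the "X" threshold; 0.5 is a placeholder
    max_displayed: int = 10       # "maximum of 10 suggestions per article"
    min_time_since_edit: timedelta = timedelta(days=1)


@dataclass
class Suggestion:
    score: float          # probability score from the recommendation service
    target_exists: bool   # False means the link would be a red link


@dataclass
class Candidate:
    """Hypothetical snapshot of an article considered for the queue."""
    suggestions: List[Suggestion]
    is_protected: bool
    last_edit_time: datetime
    # True if the latest edit was a link recommendation edit or a revert of one.
    last_edit_was_link_rec_or_revert: bool


def usable_suggestions(c: Candidate, s: QueueSettings) -> List[Suggestion]:
    """Drop red links and suggestions that are not over the score threshold."""
    return [sg for sg in c.suggestions if sg.target_exists and sg.score > s.min_score]


def is_eligible(c: Candidate, s: QueueSettings, now: datetime) -> bool:
    if c.is_protected:
        return False
    if now - c.last_edit_time < s.min_time_since_edit:
        return False
    if c.last_edit_was_link_rec_or_revert:
        # Avoids both scenarios in [1]: overlinking and repeated reverts.
        return False
    return len(usable_suggestions(c, s)) >= s.min_suggestions


def suggestions_to_display(c: Candidate, s: QueueSettings) -> List[Suggestion]:
    """Highest-scoring usable suggestions, capped at the display maximum."""
    ranked = sorted(usable_suggestions(c, s), key=lambda sg: sg.score, reverse=True)
    return ranked[: s.max_displayed]
```

The attributes with an initial setting of "None" (categories, templates, word count) are left out of the sketch, as is the still undecided "existing links" filter.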

Specification: the above tuning parameters should be configurable on-wiki so that each wiki can have different settings. When we change parameters, we should have the ability to purge the queue and repopulate it with all new articles satisfying the new parameters, but that does not need to happen automatically.
Implementation: T266443: Add Link engineering: On-wiki configuration
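
As a rough sketch of the configuration behavior described here, assuming settings are stored as a simple key/value mapping per wiki: the default values, key names, and the `RecommendationQueue` class below are hypothetical and are not taken from T266443.

```python
from typing import Any, Dict, List

# Hypothetical defaults; the keys mirror the tuning table, the values are placeholders.
DEFAULTS: Dict[str, Any] = {
    "min_suggestions": 4,
    "min_score": 0.5,
    "exclude_protected": True,
    "min_days_since_edit": 1,
}


def resolve_settings(wiki_overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Merge one wiki's on-wiki overrides over the shared defaults."""
    unknown = set(wiki_overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")
    return {**DEFAULTS, **wiki_overrides}


class RecommendationQueue:
    """Hypothetical in-memory stand-in for one wiki's queue."""

    def __init__(self, settings: Dict[str, Any]) -> None:
        self.settings = settings
        self.articles: List[str] = []

    def update_settings(self, settings: Dict[str, Any]) -> None:
        # Changing parameters does not purge automatically.
        self.settings = settings

    def purge(self) -> int:
        """Explicit, operator-triggered purge; returns how many entries were dropped."""
        removed = len(self.articles)
        self.articles.clear()
        return removed
```

Keeping `purge()` separate from `update_settings()` reflects the requirement that repopulation is available but does not happen automatically when parameters change.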


Specification: at launch, the service should support Czech, Vietnamese, Arabic, and Bangla Wikipedias. It would also be useful to support English, which would help us test, potentially in Test Wikipedia.
Implementation: TBD


Specification: we need to retain a record of which specific link suggestions were accepted, rejected, and skipped. This has two main purposes:

  • Gives us the option to remove rejected suggestions from the list so that future users don't encounter suggestions that have already been rejected once or more. If we continue to display the same rejected suggestion over and over to users, one of them will eventually accept it, even though it is possible for us to know that the majority of users rejected it.
  • Generates data that Research can use to evaluate and improve the algorithm.

Implementation: T266446: Add Link engineering: Provide a mechanism for storing data about which link recommendations were rejected by the user
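
One way to picture how the stored decisions could drive the first purpose is sketched below. The record shape (suggestion ID plus decision) and the suppression rule are assumptions for illustration only; the actual storage mechanism is the subject of T266446.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

DECISIONS = {"accepted", "rejected", "skipped"}


def tally(records: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count decisions per suggestion from (suggestion_id, decision) records."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for suggestion_id, decision in records:
        if decision not in DECISIONS:
            raise ValueError(f"unknown decision: {decision}")
        counts[suggestion_id][decision] += 1
    return counts


def should_suppress(c: Counter, min_rejections: int = 1) -> bool:
    """Hypothetical rule: hide a suggestion once rejections outnumber acceptances."""
    return c["rejected"] >= min_rejections and c["rejected"] > c["accepted"]


def visible(suggestion_ids: List[str], counts: Dict[str, Counter]) -> List[str]:
    """Filter out suggestions that earlier users have mostly rejected."""
    return [s for s in suggestion_ids if not should_suppress(counts.get(s, Counter()))]
```

Skips are recorded but do not count toward suppression in this sketch; whether they should is a product decision the specification leaves open.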


Specification: we want to be able to provide an entry point to "add a link" suggestions from the reading or editing of an arbitrary article. For a reader, we might say, "Did you know you can edit this article now by easily adding links?". For someone in visual editor, we might say, "There are available AI suggestions for this article."
Implementation: TBD, as this requires more resources from SRE and the database.


Specification: all edits through "add a link" should have an edit tag called "Suggested: add links". They should not also get the "Newcomer task" tag.
Implementation: T266474: Add Link engineering: Edit tag for link recommendations tasks


Specification: record when users complete reviewing an article, but decline to accept any of the suggestions, i.e. they either say "skip" or "no" to all of them. We will want to provide the users (and ourselves) with counts and history that are inclusive of these no-edit reviews, so that we can give credit for them through positive reinforcement. We don't want users to think that they are expected to add a link every time, lest they over-link.
Implementation: T266473: Add Link engineering: Provide a mechanism for recording credit to a user if they review all link recommendations with "no" or "skip"
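
A minimal sketch of how such reviews could be counted, assuming each completed review is available as the list of answers ("yes", "no", "skip") the user gave to every suggestion in the article. The function names and result labels are hypothetical and do not describe the mechanism built in T266473.

```python
from collections import Counter
from typing import Iterable, List

ANSWERS = {"yes", "no", "skip"}


def classify_review(answers: List[str]) -> str:
    """Label a completed review as an edit or as a no-edit review."""
    if not answers:
        return "incomplete"  # nothing was reviewed, so no credit is given
    unknown = set(answers) - ANSWERS
    if unknown:
        raise ValueError(f"unknown answers: {sorted(unknown)}")
    # Any accepted suggestion produces an edit; otherwise it is a no-edit review.
    return "edit" if "yes" in answers else "no_edit_review"


def contribution_counts(reviews: Iterable[List[str]]) -> Counter:
    """Counts that include no-edit reviews, so users get credit for them."""
    return Counter(classify_review(answers) for answers in reviews)
```

For example, `contribution_counts([["yes", "no"], ["no", "skip"]])` counts one edit and one no-edit review, so the second review still contributes to the user's history.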

Event Timeline

kostajh updated the task description.
MMiller_WMF renamed this task from Add Link engineering: Product requirements to Add a link engineering: backend product requirements. Oct 26 2020, 10:29 PM
MMiller_WMF renamed this task from Add a link engineering: backend product requirements to Add a link engineering: backend product specifications. Oct 26 2020, 11:58 PM
MMiller_WMF updated the task description.

Change 672157 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Filter link recommendation links on demand

https://gerrit.wikimedia.org/r/672157

kostajh moved this task from Backlog to Epics on the Add-Link board.

Change 672157 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Filter link recommendation links on demand

https://gerrit.wikimedia.org/r/672157

kostajh triaged this task as Medium priority. May 4 2021, 1:46 PM
kostajh claimed this task.

We've done the initial release; we can keep the subtasks open but I'm closing this epic. cc @Rileych @MMiller_WMF