This task describes the product specifications that span both the link recommendation service and how it is accessed and used in MediaWiki. Specifications for the frontend user experience are tracked in separate tasks. The subtasks of this epic should implement the functionality described here.
The original source of these specifications is this section of the structured tasks planning document. As of the creation of this task, the task is the authoritative source of the specifications, and the document is deprecated.
Specification: populating the queue (details below).
Implementation: T261408: Add a link engineering: Maintenance script for retrieving, caching, and updating search index
- There should always be at least 500 link recommendation articles in the queue for each of the 64 article topics on each wiki where the feature is implemented. Though the UX displays only 39 topics, which are agglomerations of the original 64, having 500 for each of the 64 topics may give us more flexibility in the future and on other platforms.
- We should not recommend red links. This could happen if a link target article is deleted after a recommendation was generated and the recommendation has not been refreshed since.
- There is a set of parameters we want to use to tune which articles make it into the queue:
Attribute | Initial setting |
---|---|
Suggested links quantity and quality | Must have at least 4 suggestions per article over X probability score (X should be configurable per wiki). We will only display a maximum of 10 suggestions per article. |
Existing links | We may want to filter out articles that are already populated with many links. |
Protection status | Exclude articles with any protection |
Categories to include/exclude | None |
Templates to include/exclude | None |
Time since last edit | 1 day |
Time since last suggested links edits | Do not include if the article's previous edit was a link recommendations edit or a revert of a link recommendations edit [1] |
Article word count (max/min) | None |
[1] We want to avoid these two scenarios:
- User A gets 10 link suggestions on a long article and adds most of them. Then the article goes back in the queue, gets its suggestions regenerated, and User B chooses it from the queue. User B adds 10 more suggestions. If that keeps happening, the article could get overlinked.
- User A gets 10 link suggestions and adds them all. Then they get reverted. Then the article’s suggestions are regenerated and it goes back in the queue, where User B adds those same 10 suggestions again. Then it gets reverted again.
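The initial settings in the table above could be expressed as a simple eligibility check. This is only a minimal sketch: every class, field, and function name is hypothetical, the `min_score` default of 0.5 is a placeholder for the unspecified X, and "Time since last edit: 1 day" is assumed to mean that articles edited within the last day are excluded.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class QueueSettings:
    """Hypothetical container for the tuning parameters in the table."""
    min_suggestions: int = 4      # "at least 4 suggestions per article"
    max_displayed: int = 10       # "maximum of 10 suggestions per article"
    min_score: float = 0.5        # placeholder for X; not a value from the spec
    exclude_protected: bool = True
    min_time_since_last_edit: timedelta = timedelta(days=1)


@dataclass
class Candidate:
    """Hypothetical view of an article being considered for the queue."""
    title: str
    suggestion_scores: List[float]
    is_protected: bool
    last_edit_time: datetime
    last_edit_was_link_recommendation: bool
    last_edit_reverted_link_recommendation: bool


def is_eligible(c: Candidate, s: QueueSettings, now: datetime) -> bool:
    """Return True if the candidate passes every filter in the settings."""
    strong = [x for x in c.suggestion_scores if x > s.min_score]
    if len(strong) < s.min_suggestions:
        return False
    if s.exclude_protected and c.is_protected:
        return False
    if now - c.last_edit_time < s.min_time_since_last_edit:
        return False
    # Footnote [1]: avoid overlinking and accept/revert loops.
    if c.last_edit_was_link_recommendation:
        return False
    if c.last_edit_reverted_link_recommendation:
        return False
    return True


def suggestions_to_display(c: Candidate, s: QueueSettings) -> List[float]:
    """Keep the highest-scoring suggestions over the threshold, capped."""
    strong = sorted((x for x in c.suggestion_scores if x > s.min_score),
                    reverse=True)
    return strong[:s.max_displayed]
```

The categories, templates, and word count filters are omitted because their initial setting is "None"; the existing-links filter is omitted because the specification does not yet define a threshold for it.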
Specification: the above tuning parameters should be configurable on-wiki so that each wiki can have different settings. When we change parameters, we should have the ability to purge the queue and repopulate it with all new articles satisfying the new parameters, but that does not need to happen automatically.
Implementation: T266443: Add Link engineering: On-wiki configuration
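One way to think about per-wiki configuration is a set of shared defaults with per-wiki overrides layered on top. The sketch below is illustrative only: the setting keys, the 0.5 placeholder for the X threshold, and the wiki identifiers used as dictionary keys are assumptions, and it does not describe how the on-wiki configuration is actually stored.

```python
from typing import Any, Dict

# Hypothetical defaults mirroring the initial settings in the table.
DEFAULT_SETTINGS: Dict[str, Any] = {
    "min_suggestions": 4,
    "max_displayed": 10,
    "min_score": 0.5,              # placeholder for X; not from the spec
    "exclude_protected": True,
    "min_days_since_last_edit": 1,
}


def settings_for_wiki(wiki: str,
                      overrides: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Merge a wiki's overrides over the defaults, rejecting unknown keys."""
    wiki_overrides = overrides.get(wiki, {})
    unknown = set(wiki_overrides) - set(DEFAULT_SETTINGS)
    if unknown:
        raise ValueError(f"Unknown settings for {wiki}: {sorted(unknown)}")
    return {**DEFAULT_SETTINGS, **wiki_overrides}
```

For example, `settings_for_wiki("cswiki", {"cswiki": {"min_score": 0.7}})` would return the defaults with only the score threshold changed. Purging and repopulating the queue after such a change would remain a separate, manually triggered step, as the specification says.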
Specification: at launch, the service should support Czech, Vietnamese, Arabic, and Bangla Wikipedias. It would also be useful to support English, which would help us test, potentially in Test Wikipedia.
Implementation: TBD
Specification: we need to retain a record of which specific link suggestions were accepted, rejected, and skipped. This has two main purposes:
- Gives us the option to remove rejected suggestions from the list so that future users don't encounter suggestions that have already been rejected once or more. If we continue to display the same rejected suggestion over and over to users, one of them will eventually accept it, even though it is possible for us to know that the majority of users rejected it.
- Generates data that Research can use to evaluate and improve the algorithm.
Implementation: T266446: Add Link engineering: Provide a mechanism for storing data about which link recommendations were rejected by the user
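As a rough illustration of the first purpose, the record could be a per-suggestion tally of outcomes that is consulted before suggestions are shown again. The names below are hypothetical, a suggestion is assumed to be identified by article title, anchor text, and link target, and the storage is an in-memory dictionary purely for the sake of the example.

```python
from collections import Counter
from enum import Enum
from typing import Dict, List, Tuple


class Outcome(Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    SKIPPED = "skipped"


# Hypothetical identity of one suggestion:
# (article title, anchor text, link target).
SuggestionKey = Tuple[str, str, str]


class OutcomeLog:
    """In-memory tally of how users responded to each suggestion."""

    def __init__(self) -> None:
        self._counts: Dict[SuggestionKey, Counter] = {}

    def record(self, key: SuggestionKey, outcome: Outcome) -> None:
        self._counts.setdefault(key, Counter())[outcome] += 1

    def count(self, key: SuggestionKey, outcome: Outcome) -> int:
        # Counter returns 0 for outcomes that were never recorded.
        return self._counts.get(key, Counter())[outcome]


def filter_rejected(suggestions: List[SuggestionKey],
                    log: OutcomeLog,
                    max_rejections: int = 1) -> List[SuggestionKey]:
    """Drop suggestions already rejected `max_rejections` times or more."""
    return [s for s in suggestions
            if log.count(s, Outcome.REJECTED) < max_rejections]
```

With the default `max_rejections=1`, a suggestion disappears after a single rejection; a higher value would require several users to reject it first. Skipped suggestions are recorded but not filtered, since a skip does not indicate that the suggestion is wrong.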
Specification: we want to be able to provide an entry point to "add a link" suggestions while a user is reading or editing an arbitrary article. For a reader, we might say, "Did you know you can edit this article now by easily adding links?". For someone in visual editor, we might say, "AI suggestions are available for this article."
Implementation: TBD, as this requires more resources from SRE and the database.
Specification: all edits through "add a link" should have an edit tag called "Suggested: add links". They should not also get the "Newcomer task" tag.
Implementation: T266474: Add Link engineering: Edit tag for link recommendations tasks
Specification: record when users complete reviewing an article, but decline to accept any of the suggestions, i.e. they either say "skip" or "no" to all of them. We will want to provide the users (and ourselves) with counts and history that are inclusive of these no-edit reviews, so that we can give credit for them through positive reinforcement. We don't want users to think that they are expected to add a link every time, lest they over-link.
Implementation: T266473: Add Link engineering: Provide a mechanism for recording credit to a user if they review all link recommendations with "no" or "skip"
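The distinction between a review that produced an edit and a completed review with no edit could be sketched as follows. All names are hypothetical, and each `Review` is assumed to represent a completed review in which the user responded "yes", "no", or "skip" to every suggestion shown.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Review:
    """Hypothetical record of one completed review of an article."""
    user: str
    article: str
    decisions: List[str]  # one of "yes", "no", "skip" per suggestion


def is_no_edit_review(review: Review) -> bool:
    """True if the user answered every suggestion with "no" or "skip"."""
    return bool(review.decisions) and all(
        d in ("no", "skip") for d in review.decisions
    )


def credit_counts(reviews: Iterable[Review]) -> Dict[str, Dict[str, int]]:
    """Count edits and no-edit reviews per user so both earn credit."""
    counts: Dict[str, Dict[str, int]] = {}
    for r in reviews:
        per_user = counts.setdefault(
            r.user, {"edits": 0, "no_edit_reviews": 0}
        )
        if is_no_edit_review(r):
            per_user["no_edit_reviews"] += 1
        elif "yes" in r.decisions:
            per_user["edits"] += 1
    return counts
```

Keeping the two counters separate lets the interface show a total that includes no-edit reviews while still distinguishing them from actual edits in analysis.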