Bump threshold for confidence score on link recommendation service suggestions
Closed, Resolved (Public)

Description

The default threshold for generating a link suggestion is 0.5. We can consider raising this to 0.6 or 0.7. That would have the following effects:

The suggestions presented to the end user will have a higher likelihood of being good quality links, and will be less likely to be reverted.
- Indirectly, that might impact T301096: Add a link: prioritize suggestions of underlinked articles, because theoretically the higher quality links will be less likely to be unlinked phrases in heavily-linked articles.
For each article, the link recommendation service will identify fewer phrases as link suggestions (e.g. instead of 5 phrases, it might find 1 or 2).
- It's hard to say how many fewer suggestions we would get per article. If we wanted to find out, we could write a fairly straightforward script to iterate over cached link recommendations in the database and gather statistics about the confidence score for each suggestion.
Because we have a minimum threshold of two suggestions for an article to be considered as a candidate link recommendation task, there will be fewer articles in the task queue, and/or it will take longer to repopulate the task queue for each wiki.
- Task pool sizes seem to stay fairly constant (https://grafana.wikimedia.org/d/vGq7hbnMz/special-homepage-and-suggested-edits?orgId=1&from=now-7d&to=now), but smaller wikis may be more impacted by this change. It's hard to say in advance. On medium-sized and larger wikis, you'd be unlikely to have a scenario where there aren't enough tasks in the queue.
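To make the trade-off above concrete, here is a minimal sketch of how the number of candidate articles could shrink as the threshold rises. It assumes a link counts when its score is greater than or equal to the threshold, and that an article needs at least two qualifying links (the minimum mentioned above). The function name `candidate_counts` and the sample scores are hypothetical and are not taken from GrowthExperiments or from real data.

```python
from typing import Dict, Iterable, List


def candidate_counts(
    scores_by_article: Dict[str, List[float]],
    thresholds: Iterable[float],
    min_links: int = 2,
) -> Dict[float, int]:
    """Count the articles that stay task candidates at each threshold."""
    counts = {}
    for threshold in thresholds:
        counts[threshold] = sum(
            1
            for scores in scores_by_article.values()
            # An article qualifies if enough of its links reach the threshold.
            if sum(score >= threshold for score in scores) >= min_links
        )
    return counts


# Made-up scores for illustration only; not real link recommendation data.
sample = {
    "Article A": [0.55, 0.62, 0.71, 0.90],
    "Article B": [0.51, 0.58, 0.65],
    "Article C": [0.52, 0.53],
}
counts = candidate_counts(sample, [0.5, 0.6, 0.7])
print(counts)  # {0.5: 3, 0.6: 1, 0.7: 1}
```

In this toy data, two of the three articles drop out of the pool at 0.6 because fewer than two of their links survive, which is the effect smaller wikis might feel most.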

Acceptance Criteria

The threshold for link suggestions is set at a higher value: 0.6
Run revalidateLinkRecommendations.php on the affected wikis

Details

Subject	Repo	Branch	Lines +/-
revalidateLinkRecommendations: Load scoreLessThan correctly	mediawiki/extensions/GrowthExperiments	wmf/1.41.0-wmf.22	+1 -1
revalidateLinkRecommendations: Load scoreLessThan correctly	mediawiki/extensions/GrowthExperiments	master	+1 -1
Revert "Growth: Temporarily disable link-recommendation FE on arwiki"	operations/mediawiki-config	master	+1 -1
Growth: Temporarily disable link-recommendation FE on arwiki	operations/mediawiki-config	master	+1 -1
revalidateLinkRecommendations: Make it possible to revalidate based on score	mediawiki/extensions/GrowthExperiments	wmf/1.41.0-wmf.22	+13 -0
revalidateLinkRecommendations: Make it possible to revalidate based on score	mediawiki/extensions/GrowthExperiments	wmf/1.41.0-wmf.20	+13 -0
revalidateLinkRecommendations: Make it possible to revalidate based on score	mediawiki/extensions/GrowthExperiments	master	+13 -0
LinkRecommendationTaskType: Raise score threshold to 0.6	mediawiki/extensions/GrowthExperiments	master	+1 -1


Related Objects

Status	Assigned	Task
Open	KStoller-WMF	T276517 [EPIC] Growth: "add a link" structured task 3.0
Resolved	KStoller-WMF	T315732 [EPIC] Structured Tasks: Patroller Focus
Resolved	Urbanecm_WMF	T316079 Bump threshold for confidence score on link recommendation service suggestions

Event Timeline


We would need to run revalidateLinkRecommendations.php on the affected wikis; otherwise it will take a very long time for the change to take effect. We should probably add a validation option to the script that checks the "cheap" task type properties (link score, minimum links per task), and maybe even updates the recommendation by filtering out below-threshold links if there are enough links left to do that.
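The "cheap" check described above could look roughly like the following sketch. It is not the logic of revalidateLinkRecommendations.php; the `Link` shape, the function name `prune_recommendation`, and the default values are assumptions made for illustration.

```python
from typing import List, Optional, TypedDict


class Link(TypedDict):
    """Hypothetical shape of one suggested link in a cached recommendation."""
    phrase: str
    score: float


def prune_recommendation(
    links: List[Link],
    min_score: float = 0.6,
    min_links: int = 2,
) -> Optional[List[Link]]:
    """Drop below-threshold links; return None if too few links remain."""
    kept = [link for link in links if link["score"] >= min_score]
    # Keep the pruned recommendation only if it is still a viable task.
    return kept if len(kept) >= min_links else None


cached: List[Link] = [
    {"phrase": "alpha", "score": 0.55},
    {"phrase": "beta", "score": 0.64},
    {"phrase": "gamma", "score": 0.81},
]
pruned = prune_recommendation(cached)
print(pruned)
```

Returning `None` signals that the recommendation would have to be discarded and regenerated, while a non-empty list could be written back to the cache without calling the service again.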

KStoller-WMF updated the task description. Aug 26 2022, 11:41 PM

See also T317290: Provide a mechanism to regenerate link recommendation task pool after configuration changes for some potential overlap with the refresh work needed for this task.

@KStoller-WMF is this task something that we should prioritize doing in the next week or two?

It's not urgent, but I agree this is a task that we should work on soon.

KStoller-WMF triaged this task as Medium priority. Sep 9 2022, 4:56 PM

KStoller-WMF moved this task from Incoming to Ready for Development on the Growth-Team (Sprint 0 (Growth Team)) board.

Change 832639 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/GrowthExperiments@master] LinkRecommendationTaskType: Raise score threshold to 0.6

https://gerrit.wikimedia.org/r/832639

gerritbot added a project: Patch-For-Review. Sep 16 2022, 12:50 PM

kostajh claimed this task. Sep 16 2022, 12:50 PM

kostajh moved this task from Ready for Development to Code Review on the Growth-Team (Sprint 0 (Growth Team)) board.

Change 832639 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] LinkRecommendationTaskType: Raise score threshold to 0.6

https://gerrit.wikimedia.org/r/832639

ReleaseTaggerBot added a project: MW-1.40-notes (1.40.0-wmf.2; 2022-09-19). Sep 18 2022, 7:00 PM

Maintenance_bot removed a project: Patch-For-Review. Sep 18 2022, 7:30 PM

Trizek-WMF updated the task description. Sep 21 2022, 12:39 PM

Trizek-WMF subscribed. Sep 21 2022, 1:04 PM

kostajh moved this task from Code Review to QA on the Growth-Team (Sprint 0 (Growth Team)) board. Oct 14 2022, 12:48 PM

For quite a few deployments now, no regression has been noticed in the pool size of suggested links, nor any other regression. I looked at several wikis at e.g. https://grafana.wikimedia.org/d/vGq7hbnMz/special-homepage-and-suggested-edits?orgId=1&from=now-30d&to=now&viewPanel=31 - some wikis have a declining pool size, but their task counts are still sufficiently high, and several wikis have recovered from their drop in the number of tasks.

Don't we want to run the revalidate script, though? Some (maybe most) tasks probably still have the old confidence score.

In T316079#8525210, @Tgr wrote:

Don't we want to run the revalidate script, though? Some (maybe most) tasks probably still have the old confidence score.

Yeah, that was the second checkmark in the task description.

I would also be interested in a maintenance script that could pull the cached entries and provide some aggregate data to us about the metadata for the cache entries, like the confidence score, the dataset used to generate the recommendation, number of links, etc.

kostajh removed kostajh as the assignee of this task. Mar 7 2023, 11:45 AM

Run revalidateLinkRecommendations.php on the affected wikis

@kostajh - Should we still complete this last task?

This is the last lingering task for an Epic we should officially resolve: T315732: [EPIC] Structured Tasks: Patroller Focus.

kostajh mentioned this in T278083: Define SLIs/SLOs for link recommendation service. Apr 25 2023, 11:12 AM

In T316079#8534093, @kostajh wrote:

In T316079#8525210, @Tgr wrote:

Don't we want to run the revalidate script, though? Some (maybe most) tasks probably still have the old confidence score.

Yeah, that was the second checkmark in the task description.

I would also be interested in a maintenance script that could pull the cached entries and provide some aggregate data to us about the metadata for the cache entries, like the confidence score, the dataset used to generate the recommendation, number of links, etc.

For the SLI discussion (T278083: Define SLIs/SLOs for link recommendation service) we'd like to have a maintenance script that can:

collect statistics on age of cached entries and emit this to Grafana
track which dataset IDs are used and emit to Grafana
track which newcomertasks.json revision ID was used for config and emit to Grafana

That would provide additional useful data to SRE in determining if the service is not working quickly enough.
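As a rough sketch of the aggregation such a script might do before emitting anything to Grafana, the example below summarizes the age of cache entries and the dataset IDs in use. The entry fields `created_at` and `dataset_id`, the function name `summarize_cache`, and the sample values are all hypothetical; the real cache schema and the metrics pipeline are not described in this task.

```python
from collections import Counter
from datetime import datetime, timezone
from statistics import median
from typing import Any, Dict, List


def summarize_cache(entries: List[Dict[str, Any]], now: datetime) -> Dict[str, Any]:
    """Aggregate the age (in days) and dataset usage of cache entries."""
    ages = [(now - entry["created_at"]).total_seconds() / 86400 for entry in entries]
    return {
        "count": len(entries),
        "median_age_days": median(ages) if ages else None,
        "max_age_days": max(ages) if ages else None,
        "datasets": Counter(entry["dataset_id"] for entry in entries),
    }


# Made-up entries for illustration only.
now = datetime(2023, 8, 1, tzinfo=timezone.utc)
entries = [
    {"dataset_id": "ds-1", "created_at": datetime(2023, 7, 30, tzinfo=timezone.utc)},
    {"dataset_id": "ds-1", "created_at": datetime(2023, 7, 22, tzinfo=timezone.utc)},
    {"dataset_id": "ds-2", "created_at": datetime(2023, 7, 2, tzinfo=timezone.utc)},
]
summary = summarize_cache(entries, now)
print(summary)
```

A growing median age or a long-lived old dataset ID would be the kind of signal that the service is not refreshing recommendations quickly enough.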

We could also consider tracking the rate of link recommendation task completion / task pool size, with the idea that this line should be fairly constant across each wiki, but that probably deserves a separate task.

In T316079#8794782, @KStoller-WMF wrote:

Run revalidateLinkRecommendations.php on the affected wikis

@kostajh - Should we still complete this last task?

Yes, I think so, but IMHO we should first write a maintenance script to analyze the cached link recommendation contents, which we can then re-use for improved monitoring in T278083: Define SLIs/SLOs for link recommendation service.

In T316079#8811882, @kostajh wrote:

In T316079#8794782, @KStoller-WMF wrote:

Run revalidateLinkRecommendations.php on the affected wikis

@kostajh - Should we still complete this last task?

Yes, I think so, but IMHO we should first write a maintenance script to analyze the cached link recommendation contents, which we can then re-use for improved monitoring in T278083: Define SLIs/SLOs for link recommendation service.

Do we need to create a Phab task for writing a maintenance script?

If possible we should try to wrap up this task soon, or admit we can't fit it in and move it out of the current sprint.

KStoller-WMF mentioned this in T332089: Community discussion about "add a link" with Arabic Wikipedia. Jul 25 2023, 4:52 PM

KStoller-WMF mentioned this in T342679: Link Recommendations: write a maintenance script to analyze the cached link recommendation contents. Jul 25 2023, 5:06 PM

I'm breaking out the follow up work into a separate task: T342679: Link Recommendations: write a maintenance script to analyze the cached link recommendation contents

Acceptance Criteria has been met:

The threshold for link suggestions is set at a higher value: 0.6

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/832639/

Run revalidateLinkRecommendations.php on the affected wikis

https://sal.toolforge.org/production?p=0&q=%22revalidateLinkRecommendations.php%22&d=

KStoller-WMF assigned this task to kostajh. Jul 25 2023, 5:09 PM

KStoller-WMF updated the task description.

Change 948671 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@master] revalidateLinkRecommendations: Make it possible to revalidate based on score

https://gerrit.wikimedia.org/r/948671

gerritbot added a project: Patch-For-Review. Aug 14 2023, 10:49 PM

As suggested by @Tgr in Slack, I did a one-off analysis on stat1005 of the arwiki task pool (and how many of the tasks meet the current 0.6 criterion):

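import json

from wmfdata import mariadb

# Fetch every cached link recommendation on arwiki, together with the
# scores of all links it contains.
df = mariadb.run('''
SELECT
    gelr_revision,
    JSON_EXTRACT(gelr_data, '$**.score') AS scores
FROM growthexperiments_link_recommendations
''', 'arwiki', use_x1=True)

# JSON_EXTRACT returns a JSON array as bytes; decode it into a list of scores.
df['scores'] = df.scores.apply(lambda x: json.loads(x.decode('utf-8')))
# Take the lowest score, so a recommendation passes only if all its links pass.
df['min_score'] = df.scores.apply(min)
df['meets_0.6'] = df['min_score'] >= 0.6

df[['meets_0.6', 'gelr_revision']].groupby('meets_0.6').count()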

The results (for arwiki):

meets_0.6	gelr_revision
False	814
True	326

This means only ~29% of cached recommendations (326 of 1,140) consist entirely of acceptable links. The analysis used the minimum link score per recommendation, so it counts only recommendations in which every link meets the 0.6 threshold; more recommendations may contain at least some acceptable links. FTR, on other pilots, the numbers are much more favourable (more suggestions meet the threshold).
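The distinction between "every link passes" and "enough links pass" can be sketched as follows. This is an illustration with made-up scores, not a rerun of the arwiki query; the function name `classify` and the bucket labels are hypothetical, and the minimum of two links is taken from the task description.

```python
from collections import Counter
from typing import Sequence


def classify(scores: Sequence[float], threshold: float = 0.6, min_links: int = 2) -> str:
    """Bucket one recommendation by how many of its links meet the threshold."""
    if not scores:
        return "too_few"
    passing = sum(score >= threshold for score in scores)
    if passing == len(scores):
        return "all"      # what the minimum-score analysis counts as True
    if passing >= min_links:
        return "enough"   # still usable if below-threshold links are dropped
    return "too_few"


# Made-up score lists for illustration only.
sample = [
    [0.61, 0.75, 0.90],
    [0.55, 0.62, 0.80],
    [0.52, 0.58, 0.70],
]
breakdown = Counter(classify(scores) for scores in sample)

# Share of arwiki recommendations where all links pass, from the table above.
share_all = 326 / (326 + 814)
print(breakdown, f"{share_all:.1%}")
```

Only the "too_few" bucket would need to be regenerated; the "enough" bucket is where a targeted revalidation could save work.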

Let's do the revalidation then. I uploaded a patch that makes it possible to do this in a targeted way (revalidating everything is possible, but time-consuming).

Change 948671 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] revalidateLinkRecommendations: Make it possible to revalidate based on score

https://gerrit.wikimedia.org/r/948671

Maintenance_bot removed a project: Patch-For-Review. Aug 16 2023, 6:32 PM

Change 949576 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@wmf/1.41.0-wmf.22] revalidateLinkRecommendations: Make it possible to revalidate based on score

https://gerrit.wikimedia.org/r/949576

Change 949577 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@wmf/1.41.0-wmf.20] revalidateLinkRecommendations: Make it possible to revalidate based on score

https://gerrit.wikimedia.org/r/949577

Change 949577 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.41.0-wmf.20] revalidateLinkRecommendations: Make it possible to revalidate based on score

https://gerrit.wikimedia.org/r/949577

Change 949576 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.41.0-wmf.22] revalidateLinkRecommendations: Make it possible to revalidate based on score

https://gerrit.wikimedia.org/r/949576

Mentioned in SAL (#wikimedia-operations) [2023-08-17T13:23:05Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:949582|cross-wiki userrights: Add SpecialUserRights::getDisplayUsername (T344391 T255309)]], [[gerrit:949577|revalidateLinkRecommendations: Make it possible to revalidate based on score (T316079)]], [[gerrit:949576|revalidateLinkRecommendations: Make it possible to revalidate based on score (T316079)]]

Mentioned in SAL (#wikimedia-operations) [2023-08-17T13:23:55Z] <urbanecm@deploy1002> urbanecm: Backport for [[gerrit:949582|cross-wiki userrights: Add SpecialUserRights::getDisplayUsername (T344391 T255309)]], [[gerrit:949577|revalidateLinkRecommendations: Make it possible to revalidate based on score (T316079)]], [[gerrit:949576|revalidateLinkRecommendations: Make it possible to revalidate based on score (T316079)]] synced to the testservers mwdebug2001.codfw.wmnet, mwdebug2002.c

Mentioned in SAL (#wikimedia-operations) [2023-08-17T13:28:52Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:949582|cross-wiki userrights: Add SpecialUserRights::getDisplayUsername (T344391 T255309)]], [[gerrit:949577|revalidateLinkRecommendations: Make it possible to revalidate based on score (T316079)]], [[gerrit:949576|revalidateLinkRecommendations: Make it possible to revalidate based on score (T316079)]] (duration: 05m 46s)

Maintenance_bot removed a project: Patch-For-Review. Aug 17 2023, 1:31 PM

Change 949988 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Growth: Temporarily disable link-recommendation FE on arwiki

https://gerrit.wikimedia.org/r/949988

Change 949988 merged by jenkins-bot:

[operations/mediawiki-config@master] Growth: Temporarily disable link-recommendation FE on arwiki

https://gerrit.wikimedia.org/r/949988

Mentioned in SAL (#wikimedia-operations) [2023-08-17T15:02:47Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:949568|suwikisource remove NamespaceAliases and ExtraNamespaces for Page and Index namespace (T344314)]], [[gerrit:949988|Growth: Temporarily disable link-recommendation FE on arwiki (T316079)]]

Mentioned in SAL (#wikimedia-operations) [2023-08-17T15:04:39Z] <urbanecm@deploy1002> urbanecm and anzx: Backport for [[gerrit:949568|suwikisource remove NamespaceAliases and ExtraNamespaces for Page and Index namespace (T344314)]], [[gerrit:949988|Growth: Temporarily disable link-recommendation FE on arwiki (T316079)]] synced to the testservers mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, and mw-debug kubernetes deployment (ac

Maintenance_bot removed a project: Patch-For-Review. Aug 17 2023, 3:10 PM

Mentioned in SAL (#wikimedia-operations) [2023-08-17T15:17:43Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:949568|suwikisource remove NamespaceAliases and ExtraNamespaces for Page and Index namespace (T344314)]], [[gerrit:949988|Growth: Temporarily disable link-recommendation FE on arwiki (T316079)]] (duration: 14m 56s)

Change 949990 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@master] revalidateLinkRecommendations: Load scoreLessThan correctly

https://gerrit.wikimedia.org/r/949990

gerritbot added a project: Patch-For-Review. Aug 17 2023, 3:25 PM

revalidateLinkRecommendations has now completed on arwiki, and all tasks there meet the new 0.6 threshold. I left a follow-up patch that can be merged at any time; after that, we can consider this resolved.

Change 949585 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Revert "Growth: Temporarily disable link-recommendation FE on arwiki"

https://gerrit.wikimedia.org/r/949585

Change 949585 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "Growth: Temporarily disable link-recommendation FE on arwiki"

https://gerrit.wikimedia.org/r/949585

Mentioned in SAL (#wikimedia-operations) [2023-08-21T11:23:42Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:949585|Revert "Growth: Temporarily disable link-recommendation FE on arwiki" (T316079)]]

Mentioned in SAL (#wikimedia-operations) [2023-08-21T11:25:15Z] <urbanecm@deploy1002> urbanecm: Backport for [[gerrit:949585|Revert "Growth: Temporarily disable link-recommendation FE on arwiki" (T316079)]] synced to the testservers mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option)

Mentioned in SAL (#wikimedia-operations) [2023-08-21T11:32:25Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:949585|Revert "Growth: Temporarily disable link-recommendation FE on arwiki" (T316079)]] (duration: 08m 42s)

This should finally be done.

Change 949990 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] revalidateLinkRecommendations: Load scoreLessThan correctly

https://gerrit.wikimedia.org/r/949990

Maintenance_bot removed a project: Patch-For-Review. Aug 21 2023, 12:11 PM

Change 950812 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@wmf/1.41.0-wmf.22] revalidateLinkRecommendations: Load scoreLessThan correctly

https://gerrit.wikimedia.org/r/950812

gerritbot added a project: Patch-For-Review. Aug 21 2023, 12:41 PM

Change 950812 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.41.0-wmf.22] revalidateLinkRecommendations: Load scoreLessThan correctly

https://gerrit.wikimedia.org/r/950812

Mentioned in SAL (#wikimedia-operations) [2023-08-21T20:21:27Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:951151|Growth: Remove wgWelcomeSurveyEnableWithHomepage (T342353 T344619)]], [[gerrit:950812|revalidateLinkRecommendations: Load scoreLessThan correctly (T316079)]], [[gerrit:950813|LinkRecommendationUpdater: Load link-recommendation even if disabled (T344343)]]

Mentioned in SAL (#wikimedia-operations) [2023-08-21T20:23:01Z] <urbanecm@deploy1002> urbanecm: Backport for [[gerrit:951151|Growth: Remove wgWelcomeSurveyEnableWithHomepage (T342353 T344619)]], [[gerrit:950812|revalidateLinkRecommendations: Load scoreLessThan correctly (T316079)]], [[gerrit:950813|LinkRecommendationUpdater: Load link-recommendation even if disabled (T344343)]] synced to the testservers mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwde

Maintenance_bot removed a project: Patch-For-Review. Aug 21 2023, 8:30 PM

Mentioned in SAL (#wikimedia-operations) [2023-08-21T20:32:29Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:951151|Growth: Remove wgWelcomeSurveyEnableWithHomepage (T342353 T344619)]], [[gerrit:950812|revalidateLinkRecommendations: Load scoreLessThan correctly (T316079)]], [[gerrit:950813|LinkRecommendationUpdater: Load link-recommendation even if disabled (T344343)]] (duration: 11m 02s)

Mentioned in SAL (#wikimedia-operations) [2023-08-22T12:46:40Z] <urbanecm> mwmaint1002: foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/revalidateLinkRecommendations.php --scoreLessThan=0.6 --verbose | tee growth-T316079-revalidate-0.6.log # T316079

In T316079#9109164, @Stashbot wrote:

Mentioned in SAL (#wikimedia-operations) [2023-08-22T12:46:40Z] <urbanecm> mwmaint1002: foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/revalidateLinkRecommendations.php --scoreLessThan=0.6 --verbose | tee growth-T316079-revalidate-0.6.log # T316079

I've started the revalidation for all the wikis, to ensure the 0.6 threshold is met everywhere. I already ran it manually on most of the bigger wikis, but this should be done on the remainder as well, just in case.

Urbanecm_WMF mentioned this in T344686: linkrecommendation-internal-load-datasets pod is failing. Aug 22 2023, 12:56 PM

ReleaseTaggerBot added a project: MW-1.41-notes (1.41.0-wmf.25; 2023-09-05). Aug 29 2023, 7:02 PM

ReleaseTaggerBot edited projects, added MW-1.41-notes (1.41.0-wmf.22; 2023-08-15); removed MW-1.41-notes (1.41.0-wmf.25; 2023-09-05). Aug 29 2023, 8:01 PM

ReleaseTaggerBot edited projects, added MW-1.41-notes (1.41.0-wmf.25; 2023-09-05); removed MW-1.41-notes (1.41.0-wmf.22; 2023-08-15). Aug 29 2023, 11:01 PM

Etonkovidova closed this task as Resolved. Oct 3 2023, 12:20 AM

KStoller-WMF mentioned this in T367201: Community Configuration: Improve explanation for "underlinkedWeight" and "minimumLinkScore" in Suggested Edits form. Tue, Jun 11, 5:46 PM