fixLinkRecommendationData.php does not run when link-recommendation task type is disabled
Closed, Resolved (Public)

Description

As part of following up on T372333: de.wikipedia: Add Link unavailable due to a high-number of dangling records, I wanted to check the number of dangling records at de.wikipedia. However, this does not seem to be possible, because the fixLinkRecommendationData.php script requires the link-recommendation task type to be enabled:

[urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki=dewiki --dry-run --search-index --verbose | grep 'Would fix' | wc -l
'link-recommendation' is not a link recommendation task type
0
[urbanecm@mwmaint1002 ~]$

We should remove this requirement, similar to how we changed refreshLinkRecommendations.php in T371316: refreshLinkRecommendations.php does not run when link-recommendation task type is disabled.
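
The transcript above also shows why a bare `grep | wc -l` count can mislead: the script refused to run, yet the pipeline still printed 0, which looks the same as "no dangling records". A minimal sketch of a wrapper that tells these two cases apart follows. It assumes the script exits with a non-zero status when it refuses to run, which this report does not confirm, and the helper name `count_would_fix` is hypothetical, not part of GrowthExperiments.

```shell
# Hypothetical helper: count "Would fix" lines from a dry run, but fail
# loudly when the underlying command itself fails.
# ASSUMPTION: the maintenance script exits non-zero when it refuses to run.
count_would_fix() {
    # "$@" is the full command to run, e.g. the mwscript invocation above.
    local output
    # Only stdout is captured; error messages on stderr stay visible.
    if ! output=$("$@"); then
        echo "command failed; the count would not be meaningful" >&2
        return 1
    fi
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so "|| true" keeps a genuine 0 from being
    # reported as a failure.
    printf '%s\n' "$output" | grep -c 'Would fix' || true
}
```

It could be invoked as `count_would_fix mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki=dewiki --dry-run --search-index --verbose`. Under the stated assumption, a refusal to run would then produce an error and no count, rather than a 0.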

Event Timeline

Restricted Application added a subscriber: Aklapper.

We should probably fix this and determine the counts prior to enabling Add Link again, so that we can be sure the task would work. This would also (re)add those numbers to the Grafana board.

This is a good task to fix this properly, but see T372333#10085138 🙂

Oh, very well then :). Thanks for the info! We should do this regardless, so the numbers are in Grafana et al, but not necessarily before Add Link gets re-enabled at de.

Change #1063841 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/extensions/GrowthExperiments@master] Run fixLinkRecommendationData even when disabled in CC

https://gerrit.wikimedia.org/r/1063841

Michael moved this task from Inbox to Current Sprint on the Growth-Team board.
Michael edited projects, added Growth-Team (Current Sprint); removed Growth-Team.
Michael moved this task from Incoming to Code Review on the Growth-Team (Current Sprint) board.

Change #1063841 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Run fixLinkRecommendationData even when disabled in CC

https://gerrit.wikimedia.org/r/1063841

Change #1079925 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/extensions/GrowthExperiments@wmf/1.43.0-wmf.26] Run fixLinkRecommendationData even when disabled in CC

https://gerrit.wikimedia.org/r/1079925

Change #1079925 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.43.0-wmf.26] Run fixLinkRecommendationData even when disabled in CC

https://gerrit.wikimedia.org/r/1079925

Mentioned in SAL (#wikimedia-operations) [2024-10-14T14:02:42Z] <lucaswerkmeister-wmde@deploy2002> Started scap sync-world: Backport for [[gerrit:1079923|refactor(tests): don't use per-method coverage annotation]], [[gerrit:1079894|refactor(HomepageHooks): extract method for simpler modifyability]], [[gerrit:1079915|Clear LinkRecommendation suggestions on page save (T364341 T372337)]], [[gerrit:1079925|Run fixLinkRecommendationData even when disabled in CC (T373176)]]

Mentioned in SAL (#wikimedia-operations) [2024-10-14T14:04:49Z] <lucaswerkmeister-wmde@deploy2002> migr, lucaswerkmeister-wmde: Backport for [[gerrit:1079923|refactor(tests): don't use per-method coverage annotation]], [[gerrit:1079894|refactor(HomepageHooks): extract method for simpler modifyability]], [[gerrit:1079915|Clear LinkRecommendation suggestions on page save (T364341 T372337)]], [[gerrit:1079925|Run fixLinkRecommendationData even when disabled in CC (T373176)]] synced to

Mentioned in SAL (#wikimedia-operations) [2024-10-14T14:09:31Z] <lucaswerkmeister-wmde@deploy2002> Finished scap sync-world: Backport for [[gerrit:1079923|refactor(tests): don't use per-method coverage annotation]], [[gerrit:1079894|refactor(HomepageHooks): extract method for simpler modifyability]], [[gerrit:1079915|Clear LinkRecommendation suggestions on page save (T364341 T372337)]], [[gerrit:1079925|Run fixLinkRecommendationData even when disabled in CC (T373176)]] (duration: 0

QA Note: I'm not sure how to QA this task, but one way to see its effects is to look at the panels for the dangling records and check that they now also show dots for the wikis that have this task disabled in CommunityConfiguration.

(Note that I have adjusted the Grafana panel to always show a dot for each data point, not only the connecting lines. This highlights where we are missing data. It is still unclear why that happens.)

(Note 2: there might be more dots than usual for eswiki and frwiki; that comes from the work for T372337.)

Etonkovidova subscribed.

Thanks, @Michael! I looked at the Grafana dashboards, and the numbers of dangling DB records and dangling search index records are still high.

Dangling DB records: enwiki (195K), arwiki (49.8K), and dewiki (60.4K).
Dangling search index records: eswiki (54K) and frwiki (49K).
I checked for link-recommendation on eswiki wmf.26 and frwiki wmf.26: the link-recommendation suggestions are not available on Special:Homepage.

Questions:

[...]
Dangling DB records: enwiki (195K), arwiki (49.8K), and dewiki (60.4K).
Dangling search index records: eswiki (54K) and frwiki (49K).

For eswiki, when looking at the last few days, you should see things (slowly) improving, especially when compared to frwiki, for which we have not yet changed the behavior and have only increased the frequency with which we record the number of dangling records.

[...]
Questions:

  • Should the number of dangling records on the Grafana boards decrease as a result of the script run? Or is that out of scope for this task?

Not via what this task is about in particular. We also use this script in dry-run mode to report the numbers of dangling records (without doing anything about them). And that is what can be verified by looking at enwiki or ruwiki or arwiki, all of which have the structured add-a-link task currently disabled.

No, I don't think so. This task here is just about the script working on wikis where the task is available but disabled. That it does. Fixing the dangling records is a different problem that we can address in T372337.

Thanks, @Michael - that clarifies a lot. The scope of the task is done - Resolved.