Add Link recommendations are not being processed by CirrusSearch (November 2024)
Closed, Resolved · Public · 3 Estimated Story Points

Description

Over the past few days, Growth's maintenance job (refreshLinkRecommendations) generated several recommendations for es.wikipedia articles; see the following log snippet:

Nov  3 07:28:49 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s7[31579]: eswiki:      checking candidate Polinomios_de_Macdonald... success, updating index
Nov  3 07:28:52 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s7[31579]: eswiki:      checking candidate David_Cox_(estadístico)... success, updating index
Nov  3 07:29:00 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s7[31579]: eswiki:      checking candidate Josip_Plemelj... success, updating index
Nov  3 07:29:02 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s7[31579]: eswiki:      checking candidate Zebedee... success, updating index
Nov  3 07:29:04 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s7[31579]: eswiki:      checking candidate Walter_A._Shewhart... success, updating index
Nov  3 07:29:08 mwmaint2002 mediawiki_job_growthexperiments-refreshLinkRecommendations-s7[31579]: eswiki:      checking candidate Conjetura_de_Thurston... success, updating index

Those recommendations were not ingested into the Search index. I spot-checked a couple of articles by searching for pageid:9876419 hasrecommendation:link (replacing the number with the actual page ID).

Articles I spot checked:

  • Josip_Plemelj (9876419)
  • Walter_A._Shewhart (965303)
  • Conjetura_de_Thurston (11017047)

None of them are visible in search, despite the recommendations being generated more than 24 hours ago. Considering the code didn't error out on Growth's side, there appears to be a problem in the Search infrastructure.
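
The spot check described above can also be scripted against the search API rather than typed into the search box. The following is a minimal sketch, assuming `curl` and `jq` are installed; the function names are made up for illustration, and the request uses the standard `action=query` search generator.

```shell
# Hypothetical helpers; the names are illustrative, not part of any existing tool.

# Build the search query used for the spot check of one page ID.
recommendation_query() {
    printf 'hasrecommendation:link pageid:%s' "$1"
}

# Print the number of search hits (0 or 1) for one page.
# $1 = wiki host (e.g. es.wikipedia.org), $2 = page ID
count_recommendation_hits() {
    curl -s --get "https://$1/w/api.php" \
        --data 'action=query&format=json&formatversion=2&generator=search' \
        --data-urlencode "gsrsearch=$(recommendation_query "$2")" \
        | jq '.query.pages | length'
}
```

For example, `count_recommendation_hits es.wikipedia.org 9876419` would print 1 once the recommendation for Josip_Plemelj is indexed, and 0 otherwise.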

Event Timeline

I'm seeing over 10,000 errors stemming from this in MediaWiki Logstash too, in the EventBus channel. The first event is from Oct 30, 2024 @ 14:19:03.375 (UTC, I think)

The stacktrace:

from /srv/mediawiki/php-1.43.0-wmf.28/extensions/EventBus/includes/EventBus.php(433)
#0 /srv/mediawiki/php-1.43.0-wmf.28/extensions/CirrusSearch/includes/EventBusWeightedTagsUpdater.php(68): MediaWiki\Extension\EventBus\EventBus->send(array)
#1 /srv/mediawiki/php-1.43.0-wmf.28/extensions/GrowthExperiments/includes/NewcomerTasks/AddLink/LinkRecommendationUpdater.php(145): CirrusSearch\EventBusWeightedTagsUpdater->updateWeightedTags(MediaWiki\Page\PageIdentityValue, string)
#2 /srv/mediawiki/php-1.43.0-wmf.28/extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php(291): GrowthExperiments\NewcomerTasks\AddLink\LinkRecommendationUpdater->processCandidate(MediaWiki\Title\Title, bool)
#3 /srv/mediawiki/php-1.43.0-wmf.28/extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php(160): GrowthExperiments\Maintenance\RefreshLinkRecommendations->processCandidate(MediaWiki\Title\Title, bool)
#4 /srv/mediawiki/php-1.43.0-wmf.28/maintenance/includes/MaintenanceRunner.php(703): GrowthExperiments\Maintenance\RefreshLinkRecommendations->execute()
#5 /srv/mediawiki/php-1.43.0-wmf.28/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#6 /srv/mediawiki/multiversion/MWScript.php(158): require_once(string)
#7 {main}
Gehel set the point value for this task to 3. (Mon, Nov 4, 4:40 PM)

I'm wondering... is it reasonable to expect those events to still be processed somehow? Or, given they never really got through, would they need to be re-produced once the validation issue is fixed?

> I'm seeing over 10,000 errors stemming from this in MediaWiki Logstash too, in the EventBus channel. The first event is from Oct 30, 2024 @ 14:19:03.375 (UTC, I think)
>
> [...]

@Etonkovidova already created a task for those last Wednesday: T378664: [wmf.1] refreshLinkRecommendations.php - Unable to deliver all events: 400: Bad Request

Playing around with the dashboard, I'm noticing that there seem to be no set-events for recommendation_image and recommendation_image_section either.
So maybe those are affected as well? But maybe I'm misunderstanding how they work.

> I'm wondering... is it reasonable to expect those events to still be processed somehow? Or, given they never really got through, would they need to be re-produced once the validation issue is fixed?

Unfortunately, while we could reprocess them if they had landed in Kafka, they were rejected by validation before that, so I'm not certain we have anything we could rerun them from.

> Playing around with the dashboard, I'm noticing that there seem to be no set-events for recommendation_image and recommendation_image_section either.
> So maybe those are affected as well? But maybe I'm misunderstanding how they work.

I believe these are handled by a Spark job of the SD team. They do not yet use the Search Update Pipeline and are thus not covered by this metric. @pfischer is working on unifying all these sources, so hopefully we should see them in this graph at some point.

Change #1087407 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] CirrusSearch: Disable updating weighted tags via EventBus

https://gerrit.wikimedia.org/r/1087407

Change #1087432 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[mediawiki/extensions/CirrusSearch@master] Fix WeightedTagsUpdater

https://gerrit.wikimedia.org/r/1087432

Change #1087407 merged by jenkins-bot:

[operations/mediawiki-config@master] CirrusSearch: Disable updating weighted tags via EventBus

https://gerrit.wikimedia.org/r/1087407

Mentioned in SAL (#wikimedia-operations) [2024-11-05T12:16:09Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1087407|CirrusSearch: Disable updating weighted tags via EventBus (T378983 T377150)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-05T12:23:49Z] <urbanecm@deploy2002> Finished scap sync-world: Backport for [[gerrit:1087407|CirrusSearch: Disable updating weighted tags via EventBus (T378983 T377150)]] (duration: 07m 39s)

Mentioned in SAL (#wikimedia-operations) [2024-11-05T12:33:02Z] <urbanecm> mwmaint2002: kill all instances of refreshLinkRecommendation (T378983)

Mentioned in SAL (#wikimedia-operations) [2024-11-05T12:33:27Z] <urbanecm> eswiki,x1: delete from growthexperiments_link_recommendations where gelr_page=10598298; (to verify updates are flowing in; T378983)

>> Playing around with the dashboard, I'm noticing that there seem to be no set-events for recommendation_image and recommendation_image_section either.
>> So maybe those are affected as well? But maybe I'm misunderstanding how they work.
>
> I believe these are handled by a Spark job of the SD team. They do not yet use the Search Update Pipeline and are thus not covered by this metric. @pfischer is working on unifying all these sources, so hopefully we should see them in this graph at some point.

But ... they have data showing in that graph, at least for the delete-action, and that data went back to 0 when the switch was flipped with the changes above. Though it might be the case that adding and deleting these weighted tags is handled in different ways, and that in practice T377150 only affected the delete-action for now.

image.png (708×590 px, 57 KB)

Hi @pfischer (and fyi @Gehel), as far as we can see, the suggestions are now flowing again (yay!). The dangling records Grafana chart stopped growing as well, which is good too.

Thanks to the stopgap (reverting T377150), the original task description is no longer accurate. Where should the remainder of the work be tracked (asking so that we know what task(s) to watch for updates)? As far as I know, the following tasks exist:

> But ... they have data showing in that graph, at least for the delete-action, and that data went back to 0 when the switch was flipped with the changes above. Though it might be the case that adding and deleting these weighted tags is handled in different ways, and that in practice T377150 only affected the delete-action for now.

Correct, adding image recommendation flags is handled by a Spark job that does not yet interact with the Search Update Pipeline and is thus not visible in this dashboard. Clearing these image recommendation tags, on the other hand, is handled by the GrowthExperiments extension (e.g. when invalidating them), and stopped being visible in this graph when T377150 was reverted.

Change #1087432 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Fix WeightedTagsUpdater

https://gerrit.wikimedia.org/r/1087432

Change #1089230 had a related patch set uploaded (by Urbanecm; author: Peter Fischer):

[mediawiki/extensions/CirrusSearch@wmf/1.44.0-wmf.2] Fix WeightedTagsUpdater

https://gerrit.wikimedia.org/r/1089230

Change #1089826 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[operations/mediawiki-config@master] CirrusSearch: re-enable offloading weighted tags via EventBus

https://gerrit.wikimedia.org/r/1089826

Change #1089230 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@wmf/1.44.0-wmf.2] Fix WeightedTagsUpdater

https://gerrit.wikimedia.org/r/1089230

Mentioned in SAL (#wikimedia-operations) [2024-11-12T08:21:00Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1089230|Fix WeightedTagsUpdater (T378664 T378983)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-12T08:28:00Z] <urbanecm@deploy2002> Finished scap sync-world: Backport for [[gerrit:1089230|Fix WeightedTagsUpdater (T378664 T378983)]] (duration: 06m 59s)

Change #1089826 merged by jenkins-bot:

[operations/mediawiki-config@master] CirrusSearch: re-enable offloading weighted tags via EventBus

https://gerrit.wikimedia.org/r/1089826

Mentioned in SAL (#wikimedia-operations) [2024-11-12T08:36:29Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1089826|CirrusSearch: re-enable offloading weighted tags via EventBus (T378983)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-12T08:38:36Z] <urbanecm@deploy2002> pfischer, urbanecm: Backport for [[gerrit:1089826|CirrusSearch: re-enable offloading weighted tags via EventBus (T378983)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

@pfischer and I attempted to re-deploy the EventBus approach, but were unsuccessful at doing so. Details are at T377150#10311378.

Change #1090455 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] [CirrusSearch] testwiki: enable offloading weighted tags via EventBus

https://gerrit.wikimedia.org/r/1090455

Change #1090462 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[operations/mediawiki-config@master] CirrusSearch: enable offloading weighted tags via EventBus for testwiki

https://gerrit.wikimedia.org/r/1090462

Change #1090462 abandoned by Peter Fischer:

[operations/mediawiki-config@master] CirrusSearch: enable offloading weighted tags via EventBus for testwiki

Reason:

duplicate

https://gerrit.wikimedia.org/r/1090462

Change #1090455 merged by jenkins-bot:

[operations/mediawiki-config@master] [CirrusSearch] testwiki: enable offloading weighted tags via EventBus

https://gerrit.wikimedia.org/r/1090455

Mentioned in SAL (#wikimedia-operations) [2024-11-12T13:58:53Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1090455|[CirrusSearch] testwiki: enable offloading weighted tags via EventBus (T378983)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-12T14:02:56Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1090455|[CirrusSearch] testwiki: enable offloading weighted tags via EventBus (T378983)]]

Change #1090480 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Revert "[CirrusSearch] testwiki: enable offloading weighted tags via EventBus"

https://gerrit.wikimedia.org/r/1090480

Change #1090480 merged by Urbanecm:

[operations/mediawiki-config@master] Revert "[CirrusSearch] testwiki: enable offloading weighted tags via EventBus"

https://gerrit.wikimedia.org/r/1090480

Change #1090550 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[operations/mediawiki-config@master] Revert^2 "[CirrusSearch] testwiki: enable offloading weighted tags via EventBus"

https://gerrit.wikimedia.org/r/1090550

Change #1090550 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert^2 "[CirrusSearch] testwiki: enable offloading weighted tags via EventBus"

https://gerrit.wikimedia.org/r/1090550

Mentioned in SAL (#wikimedia-operations) [2024-11-12T21:02:06Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1090550|Revert^2 "[CirrusSearch] testwiki: enable offloading weighted tags via EventBus" (T378983)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-12T21:09:24Z] <urbanecm@deploy2002> Finished scap sync-world: Backport for [[gerrit:1090550|Revert^2 "[CirrusSearch] testwiki: enable offloading weighted tags via EventBus" (T378983)]] (duration: 07m 18s)

Per @pfischer's request, I enabled EventBus at testwiki only. After syncing that change to production, I also triggered two updates (one set, one clear) for the following pages:

  • clear: Titanium, page ID 119415
    • Triggered by a plain edit
  • set: Nobelium, page ID 119493
    • Triggered via mwscript extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php --wiki=testwiki --page=Nobelium --force --verbose
    • NOTE: I used the same page during the (successful) test in the morning, when I issued a clear on that page

After issuing both updates, I checked the following Search queries to see the effect:

  • pageid:119415 hasrecommendation:link (expected: no result)
    • I double checked this query produced one result before starting with the test
  • pageid:119493 hasrecommendation:link (expected: single result)

After several minutes, the first query produced the expected result. The second one still does not have the recommendation, despite it being generated.

CLI logs
[urbanecm@mwmaint2002 ~]$ mwscript extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php --wiki=testwiki --page=Nobelium --force --verbose
DEPRECATION WARNING: Maintenance scripts are moving to Kubernetes. See
https://wikitech.wikimedia.org/wiki/Maintenance_scripts for the new process.
Maintenance hosts will be going away; please submit feedback promptly if
maintenance scripts on Kubernetes don't work for you. (T341553)
Refreshing link recommendations...
    checking candidate Nobelium... success, updating index
[urbanecm@mwmaint2002 ~]$

Screencast

While testing the clear, I recorded my screen. The recording is available at https://people.wikimedia.org/~urbanecm/onetime/jVQA.webm.

> After several minutes, the first query produced the expected result. The second one still does not have the recommendation, despite it being generated.

Now it does... Maybe something needed time (and/or the code to be fully in prod, rather than just at mwdebug)?

The config change only affects the hosts it has been rolled out to. So as long as the host you run your script on has the change, events are produced. From there, 1) our pipeline picks them up and produces new intermediate update events, which are 2) ingested by the second part of our pipeline, which 3) requests updates from elastic. The only major delay, of ~10 minutes, may be introduced by step 1). On the way to step 2), Kafka MirrorMaker might introduce another delay, but that should be minimal.
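
Because the pipeline described above is asynchronous, a test script has to poll rather than check once. A minimal sketch of such a polling helper follows; `wait_until` is a hypothetical name, and the timeout and interval are whatever the caller considers reasonable for the expected delay.

```shell
# Poll a check command until it succeeds or a deadline passes.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if the deadline passes first.
wait_until() {
    _wu_timeout=$1
    _wu_interval=$2
    shift 2
    _wu_deadline=$(( $(date +%s) + _wu_timeout ))
    until "$@"; do
        if [ "$(date +%s)" -ge "$_wu_deadline" ]; then
            return 1   # gave up: the condition never became true
        fi
        sleep "$_wu_interval"
    done
}
```

A call such as `wait_until 900 30 page_has_recommendation 119493` would retry every 30 seconds for up to 15 minutes, where `page_has_recommendation` stands for any check command (not defined here) that exits 0 once the search query returns the page.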

I'll check the update times of the documents in question…

So far, all tests seem good. Let's verify with a bunch of new articles (as that is what was failing previously) in a systematic way. I followed this:

  1. Generated a set of random simplewiki articles with link recommendation that do not exist at testwiki (used the growthtasks API generator, query); results are at P71062
  2. Exported those pages from simplewiki via Special:Export and imported via Special:Import to testwiki
  3. For each page, I ran refreshLinkRecommendations.php (commands are at P71064)
  4. I waited ~10 minutes
  5. For all pages, a link recommendation should be present.
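
Step 3 can be prepared by generating one command per page title. The sketch below is an assumption about how such a list could be built, not a copy of P71064; it reuses the flags from the manual Nobelium run, and the helper names are hypothetical.

```shell
# Wrap a string in single quotes so it can be pasted safely into a shell,
# even if the title contains parentheses, commas or apostrophes.
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Read page titles (one per line) on stdin and print one refresh command each.
# $1 = wiki database name, e.g. testwiki
refresh_commands() {
    while IFS= read -r title; do
        [ -n "$title" ] || continue   # skip empty lines
        printf 'mwscript extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php --wiki=%s --page=%s --force --verbose\n' \
            "$1" "$(shell_quote "$title")"
    done
}
```

Feeding it the title column of the page list (for example `cut -f 1 < pageids.txt | sed 1d | refresh_commands testwiki`) prints the commands for review before anything is executed.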

Mentioned in SAL (#wikimedia-operations) [2024-11-18T11:41:00Z] <urbanecm> mwmaint2002: Run extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php at testwiki for a bunch of pages (P71064 is list of commands executed; T378983)

Page IDs of the pages in question:

page_title page_id
Affine_arithmetic 160619
And_That_Is_Why_._._._Manipuri_Myths_Retold 160623
Artemis_Fowl_(novel) 160631
Carrier-sense_multiple_access_with_collision_avoidance 160610
Complementary_currency 160621
Control_panel_(engineering) 160608
Derecho_and_tornado_outbreak_of_April_4–5,_2011 160598
Drag_racing 160604
El_Gato_Negro 160630
Ever_Ace 160625
Flash_flood 160596
Gillidanda 160629
Gliese_581_c 160614
Gosei_(fifth-generation_Nikkei) 160607
Grammatical_gender 160617
Gross_domestic_product 160585
Homosociality 160593
Houmets 160600
Infiorata_di_Genzano 160615
Inherent_vowel 160626
Interlingua_grammar 160603
John_McDouall_Stuart 160597
Lake_Ülemiste 160602
Lightning_rod 160591
List_of_Death_in_Paradise_characters 160628
List_of_rulers_of_Belarus 160622
Longboat 160587
Manyogana 160613
Mary_Somerville 160612
Mbuyisa_Makhubo 160627
Mihailo_Đurić 160609
Mochi 160589
Ordered_pair 160632
Pandita_Ramabai 160594
Pentagon_Force_Protection_Agency 160601
Perfect_competition 160586
Persian_Gulf_naming_dispute 160618
Productive_efficiency 160611
Rommel_Roberts 160616
SK_Admira_Wien 160599
Square-free_integer 160624
Standing_start 160605
Storage_area_network 160592
Structuring 160620
Supply-side_economics 160595
The_Tell-Tale_Heart 160588
The_Terminator 160606
Thin_film_transistor_liquid_crystal_display 160590

Starting from the P71065 paste above, I wrote a bash one-liner to verify that the link recommendations are in Search as expected:

[urbanecm@mwmaint2002 ~]$ cut -f 2 < pageids.txt | sed 1d | while read id; do data=$(curl --data-urlencode "gsrsearch=hasrecommendation:link pageid:$id" -s 'https://test.wikipedia.org/w/api.php?action=query&format=json&prop=info&generator=search&formatversion=2' | jq '.query.pages | length'); echo -e "$id\t$data"; done
160619  0
160623  0
160631  0
160610  0
160621  0
160608  0
160598  0
160604  0
160630  0
160625  0
160596  0
160629  0
160614  0
160607  0
160617  0
160585  0
160593  0
160600  0
160615  0
160626  0
160603  0
160597  0
160602  0
160591  0
160628  0
160622  0
160587  0
160613  0
160612  0
160627  0
160609  0
160589  0
160632  0
160594  0
160601  0
160586  0
160618  0
160611  0
160616  0
160599  0
160624  0
160605  0
160592  0
160620  0
160595  0
160588  0
160606  0
160590  0
[urbanecm@mwmaint2002 ~]$

On each line, it prints the page ID followed by the number of search results with a link recommendation for that page (either 1 or 0). Right now, none of the pages have any, which is expected, as it's not been 10+ minutes.
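
With 48 pages, reading the output line by line is error-prone, so a short summary helps. The following is a sketch only, assuming `awk` is available; `summarize_hits` is a hypothetical name, and it expects the "page ID, count" lines printed by the one-liner on stdin.

```shell
# Read "page_id count" lines (tab- or space-separated) on stdin and print
# how many pages have a recommendation, how many do not, and the total.
summarize_hits() {
    awk '
        # Ignore anything that is not two numeric fields, e.g. shell prompts.
        !($1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/) { next }
        { total++ }
        $2 > 0 { hits++ }
        END { printf "with=%d without=%d total=%d\n", hits, total - hits, total }
    '
}
```

Appending `| summarize_hits` to the one-liner would reduce the 48 lines to a single line of counts.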

> [...] Right now, none of the pages have any, which is expected, as it's not been 10+ minutes.

I think that's false. For that 10min delay to happen, the fourth parameter for \CirrusSearch\WeightedTagsUpdater::updateWeightedTags needs to be set to 'revision' (the literal string), but that is not the case in LinkRecommendationUpdater::processCandidate (and should not be, as it was not triggered by creating a new revision). I guess this delay has a different cause, or it is there due to a bug.

That being said, I can now see something happening on https://grafana.wikimedia.org/d/fe251f4f-f6cf-4010-8d78-5f482255b16f/cirrussearch-update-pipeline-weighted-tags?orgId=1&var-tag_prefix=recommendation_link&var-search_cluster_site=codfw&var-search_cluster=consumer-search

Change #1092235 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@master] LinkRecommendationUpdater: Log failures to update weighted tags

https://gerrit.wikimedia.org/r/1092235

Good catch @Michael! That makes this even weirder. However, I can't get the pages to show even when searching for pageid:160593 (w/o any hasrecommendation: condition). I think this is because pages created via Special:Import are ignored by CirrusSearch... (@pfischer @dcausse is that the case?).

Let's make an edit to each page, to ensure it gets registered normally. Then, I can run the script again (DB row would be dropped on the edit) and hopefully actually see some results.

Mentioned in SAL (#wikimedia-operations) [2024-11-18T13:16:02Z] <urbanecm> mwmaint2002: Run extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php at testwiki for a bunch of pages (P71064 is list of commands executed; T378983)

Okay, after making a manual edit via PAWS, all pages are recorded in Search:

[urbanecm@mwmaint2002 ~]$ cut -f 2 < pageids.txt | sed 1d | while read id; do data=$(curl --data-urlencode "gsrsearch=pageid:$id" -s 'https://test.wikipedia.org/w/api.php?action=query&format=json&prop=info&generator=search&formatversion=2' | jq '.query.pages | length'); echo -e "$id\t$data"; done
160619  1
160623  1
160631  1
160610  1
160621  1
160608  1
160598  1
160604  1
160630  1
160625  1
160596  1
160629  1
160614  1
160607  1
160617  1
160585  1
160593  1
160600  1
160615  1
160626  1
160603  1
160597  1
160602  1
160591  1
160628  1
160622  1
160587  1
160613  1
160612  1
160627  1
160609  1
160589  1
160632  1
160594  1
160601  1
160586  1
160618  1
160611  1
160616  1
160599  1
160624  1
160605  1
160592  1
160620  1
160595  1
160588  1
160606  1
160590  1
[urbanecm@mwmaint2002 ~]$

Let's try again

Okay, so far, no recommendations:

[urbanecm@mwmaint2002 ~]$ cut -f 2 < pageids.txt | sed 1d | while read id; do data=$(curl --data-urlencode "gsrsearch=hasrecommendation:link pageid:$id" -s 'https://test.wikipedia.org/w/api.php?action=query&format=json&prop=info&generator=search&formatversion=2' | jq '.query.pages | length'); echo -e "$id\t$data"; done
160619  0
160623  0
160631  0
160610  0
160621  0
160608  0
160598  0
160604  0
160630  0
160625  0
160596  0
160629  0
160614  0
160607  0
160617  0
160585  0
160593  0
160600  0
160615  0
160626  0
160603  0
160597  0
160602  0
160591  0
160628  0
160622  0
160587  0
160613  0
160612  0
160627  0
160609  0
160589  0
160632  0
160594  0
160601  0
160586  0
160618  0
160611  0
160616  0
160599  0
160624  0
160605  0
160592  0
160620  0
160595  0
160588  0
160606  0
160590  0
[urbanecm@mwmaint2002 ~]$

Let's give it a few minutes and run the test again.

Okay, now all recommendations are ingested:

[urbanecm@mwmaint2002 ~]$ cut -f 2 < pageids.txt | sed 1d | while read id; do data=$(curl --data-urlencode "gsrsearch=hasrecommendation:link pageid:$id" -s 'https://test.wikipedia.org/w/api.php?action=query&format=json&prop=info&generator=search&formatversion=2' | jq '.query.pages | length'); echo -e "$id\t$data"; done
160619  1
160623  1
160631  1
160610  1
160621  1
160608  1
160598  1
160604  1
160630  1
160625  1
160596  1
160629  1
160614  1
160607  1
160617  1
160585  1
160593  1
160600  1
160615  1
160626  1
160603  1
160597  1
160602  1
160591  1
160628  1
160622  1
160587  1
160613  1
160612  1
160627  1
160609  1
160589  1
160632  1
160594  1
160601  1
160586  1
160618  1
160611  1
160616  1
160599  1
160624  1
160605  1
160592  1
160620  1
160595  1
160588  1
160606  1
160590  1
[urbanecm@mwmaint2002 ~]$

Let's save edits on all those pages again. That should clear the recommendations. If that happens, then the EventBus pipeline appears to be working.
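
One way to confirm the clear is to compare a snapshot taken before the edits with one taken afterwards. This is a sketch under the assumption that both snapshots were saved to files in the "page ID, count" format printed by the one-liner; `not_cleared` and the file names are hypothetical.

```shell
# List page IDs that had a recommendation in the first snapshot
# and still have one in the second snapshot.
# $1 = snapshot file taken before the edits, $2 = snapshot file taken after
not_cleared() {
    awk '
        NR == FNR { before[$1] = $2; next }        # first file: remember counts
        before[$1] > 0 && $2 > 0 { print $1 }      # second file: still set
    ' "$1" "$2"
}
```

An empty result from `not_cleared before.txt after.txt` means every recommendation that was present before the edits has been dropped from the index.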

Okay, all suggestions were now dropped:

[urbanecm@mwmaint2002 ~]$ cut -f 2 < pageids.txt | sed 1d | while read id; do data=$(curl --data-urlencode "gsrsearch=hasrecommendation:link pageid:$id" -s 'https://test.wikipedia.org/w/api.php?action=query&format=json&prop=info&generator=search&formatversion=2' | jq '.query.pages | length'); echo -e "$id\t$data"; done
160619  0
160623  0
160631  0
160610  0
160621  0
160608  0
160598  0
160604  0
160630  0
160625  0
160596  0
160629  0
160614  0
160607  0
160617  0
160585  0
160593  0
160600  0
160615  0
160626  0
160603  0
160597  0
160602  0
160591  0
160628  0
160622  0
160587  0
160613  0
160612  0
160627  0
160609  0
160589  0
160632  0
160594  0
160601  0
160586  0
160618  0
160611  0
160616  0
160599  0
160624  0
160605  0
160592  0
160620  0
160595  0
160588  0
160606  0
160590  0
[urbanecm@mwmaint2002 ~]$

Seems like EventBus is working after all. Great! With that, I think we can resolve this task. Re-enabling the config flag is tracked in T377150: Config: enable CirrusSearchEnableEventBusWeightedTags.

Change #1092258 had a related patch set uploaded (by Peter Fischer; author: Peter Fischer):

[operations/mediawiki-config@master] CirrusSearch: enable offloading weighted tags via EventBus for testwiki

https://gerrit.wikimedia.org/r/1092258

Change #1092235 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] LinkRecommendationUpdater: Log failures to update weighted tags

https://gerrit.wikimedia.org/r/1092235

Change #1092258 merged by jenkins-bot:

[operations/mediawiki-config@master] CirrusSearch: enable offloading weighted tags via EventBus

https://gerrit.wikimedia.org/r/1092258

Mentioned in SAL (#wikimedia-operations) [2024-11-19T08:04:34Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1082726|Translate Event Logging: Enable using $wgTranslateEnableEventLogging (T364460)]], [[gerrit:1092258|CirrusSearch: enable offloading weighted tags via EventBus (T378983 T377150)]], [[gerrit:1091197|[GrowthExperiments] Add virtual domain config (T354939)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-19T08:12:12Z] <urbanecm@deploy2002> urbanecm, wangombe, pfischer: Backport for [[gerrit:1082726|Translate Event Logging: Enable using $wgTranslateEnableEventLogging (T364460)]], [[gerrit:1092258|CirrusSearch: enable offloading weighted tags via EventBus (T378983 T377150)]], [[gerrit:1091197|[GrowthExperiments] Add virtual domain config (T354939)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-11-19T08:29:16Z] <urbanecm@deploy2002> Finished scap sync-world: Backport for [[gerrit:1082726|Translate Event Logging: Enable using $wgTranslateEnableEventLogging (T364460)]], [[gerrit:1092258|CirrusSearch: enable offloading weighted tags via EventBus (T378983 T377150)]], [[gerrit:1091197|[GrowthExperiments] Add virtual domain config (T354939)]] (duration: 24m 42s)