To stabilize the Wikidata Query Service, we are looking into splitting the graph inside Blazegraph into two (or potentially more) subgraphs. This ticket tracks the investigation into what a sensible split would be and what its consequences are, and then the work of making it happen.
Event Timeline
Mentioned in SAL (#wikimedia-operations) [2024-03-05T22:37:10Z] <bking@cumin2002> START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on wdqs[1022-1024].eqiad.wmnet with reason: T337013
Mentioned in SAL (#wikimedia-operations) [2024-03-05T22:37:14Z] <bking@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on wdqs[1022-1024].eqiad.wmnet with reason: T337013
Mentioned in SAL (#wikimedia-operations) [2024-03-07T18:22:18Z] <bking@cumin2002> START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on wdqs[1022-1025].eqiad.wmnet with reason: T337013
Mentioned in SAL (#wikimedia-operations) [2024-03-07T18:22:38Z] <bking@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on wdqs[1022-1025].eqiad.wmnet with reason: T337013
Will this also solve T261764, or is the graph used by the Query Service different from the one used by the API?
I don't believe T261764 involves the Query Service, so it would not be affected by this split.
Mentioned in SAL (#wikimedia-operations) [2024-08-23T15:52:57Z] <bking@cumin2002> START - Cookbook sre.hosts.downtime for 17:00:00 on wdqs[1023-1024].eqiad.wmnet with reason: noisy alerts related to graph split T337013
Mentioned in SAL (#wikimedia-operations) [2024-08-23T15:53:13Z] <bking@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 17:00:00 on wdqs[1023-1024].eqiad.wmnet with reason: noisy alerts related to graph split T337013
Change #1122151 had a related patch set uploaded (by Bking; author: Bking):
[operations/mediawiki-config@master] wdqs-categories: use new split graph hosts (wdqs-main) for categories
Change #1124535 had a related patch set uploaded (by Ryan Kemper; author: Ryan Kemper):
[operations/mediawiki-config@master] wdqs categories: switch to internal-main
Change #1122151 merged by Ryan Kemper:
[operations/mediawiki-config@master] wdqs-categories: remove extraneous wgCirrusSearchCategoryEndpoint value
Change #1124535 merged by jenkins-bot:
[operations/mediawiki-config@master] wdqs categories: switch to internal-main
Mentioned in SAL (#wikimedia-operations) [2025-03-25T20:26:26Z] <ryankemper@deploy1003> Started scap sync-world: Backport for [[gerrit:1124535|wdqs categories: switch to internal-main (T375520 T385896 T337013)]]
Mentioned in SAL (#wikimedia-operations) [2025-03-25T20:33:11Z] <ryankemper@deploy1003> ryankemper: Backport for [[gerrit:1124535|wdqs categories: switch to internal-main (T375520 T385896 T337013)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Mentioned in SAL (#wikimedia-operations) [2025-03-25T20:48:06Z] <ryankemper@deploy1003> Finished scap sync-world: Backport for [[gerrit:1124535|wdqs categories: switch to internal-main (T375520 T385896 T337013)]] (duration: 21m 40s)