Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin2002 for host db1229.eqiad.wmnet with OS bullseye executed with errors:
- db1229 (FAIL)
- The reimage failed, see the cookbook logs for the details
I ran a load test on the enwiki-goodfaith deployment in staging before and after the upgrade, and the results are almost the same.
with kserve 0.10
isaranto@deploy2002:~/load_testing$ wrk -c 1 -t 1 --timeout 2s -s revscoring.lua https://inference-staging.svc.codfw.wmnet:30443/v1/models/enwiki-goodfaith:predict --latency -d 60 -- enwiki.input
thread 1 created logfile wrk_1.log created
Running 1m test @ https://inference-staging.svc.codfw.wmnet:30443/v1/models/enwiki-goodfaith:predict
  1 threads and 1 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   298.94ms  182.86ms  973.92ms   84.85%
    Req/Sec     3.26      2.05     10.00     81.13%
  Latency Distribution
     50%  226.40ms
     75%  344.60ms
     90%  537.70ms
     99%  973.92ms
  106 requests in 1.00m, 37.23KB read
  Socket errors: connect 0, read 0, write 0, timeout 7
Requests/sec:      1.76
Transfer/sec:    634.45B
thread 1 made 108 requests and got 106 responses
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host db1229.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host db1228.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host db1227.eqiad.wmnet with OS bullseye
Thanks!
And yeah I now see that we already have it in the task description. @Arian_Bozorg Let's look into it.
In T336042#9215049, @BTullis wrote: What about running the sre.druid.roll-restart-workers cookbook on this cluster, so that it restarts the processes? I think that this is more likely to make sure that the new host joins the cluster than adding it to LVS.
In fact I'm pretty sure that adding it to LVS would be detrimental until druid1009 has got access to the same data as the rest of the cluster.
Change 959978 merged by jenkins-bot:
[mediawiki/extensions/ReplaceText@master] Add unit test for Search class
Seems like there was a deployment to production 5 days ago, probably by mistake while working on T345857. I will revert it straight away.
Mentioned in SAL (#wikimedia-operations) [2023-10-02T13:30:36Z] <taavi@deploy2002> taavi and dreamyjazz: Backport for [[gerrit:962612|Add 'testwikis' DB list to MWMultiVersion::DB_LISTS (T341110)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Re-opening this as the place to track other databases ProQuest offers that communities are looking for.
Mentioned in SAL (#wikimedia-operations) [2023-10-02T13:29:17Z] <taavi@deploy2002> Started scap: Backport for [[gerrit:962612|Add 'testwikis' DB list to MWMultiVersion::DB_LISTS (T341110)]]
Change 962612 merged by jenkins-bot:
[operations/mediawiki-config@master] Add 'testwikis' DB list to MWMultiVersion::DB_LISTS
Change 962612 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):
[operations/mediawiki-config@master] Add 'testwikis' DB list to MWMultiVersion::DB_LISTS
Thanks for the update.
Change 962611 had a related patch set uploaded (by Fabfur; author: Fabfur):
[operations/puppet@production] purged: parametrize purged frontend and backend address
@achou @MunizaA thanks a lot! One nit - the paste outlined in the task's description is editable, so in theory anybody can tamper with it (everything is logged, but it may not be straightforward to check for ML etc.). I would personally suggest adding the sha512 in a separate phab comment, which (in theory) can only be edited by its author.
@Anoop @Ammarpad We already discussed this as part of the edit-a-thon. So this should be ready to go. Thanks
This seems to be live already, and it seems to have broken @Mike_Peel's latest upload :( https://mismatch-finder.toolforge.org/store/imports
And it looks like we just added the type to the end of the upload CSV judging from the error message? That seems suboptimal. I would expect it in second place.
Mentioned in SAL (#wikimedia-operations) [2023-10-02T13:19:54Z] <taavi@deploy2002> taavi and dreamyjazz: Backport for [[gerrit:962591|clienthints: Enable display on testwikis and four production wikis (T341110)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Thanks a lot @Trizek-WMF !
All disruptive switchover-related work is finished and things are stable. The switchover went smoothly and had minimal user impact, while also uncovering issues that we need to know about in order to improve our infrastructure and processes. The read-only period lasted 2min 22s.
Change 932374 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Update revision details requests to formatversion=2
Change 962610 had a related patch set uploaded (by Fomafix; author: Fomafix):
[mediawiki/extensions/PageLanguage@master] Replace deprecated wfGetDB( DB_REPLICA )
Change 962607 had a related patch set uploaded (by Pwangai; author: Pwangai):
[integration/config@master] Zuul: [mediawiki/extensions/TwoColConflict] Enable Sonar Codehealth
Change 961818 merged by Ssingh:
[operations/puppet@production] dnsbox: add ntp.anycast.wmnet as the anycasted NTP address
Change 962606 had a related patch set uploaded (by Pwangai; author: Pwangai):
[integration/config@master] Zuul: [mediawiki/extensions/UrlShortener] Enable Sonar Codehealth
Mentioned in SAL (#wikimedia-operations) [2023-10-02T13:11:05Z] <taavi@deploy2002> Started scap: Backport for [[gerrit:962591|clienthints: Enable display on testwikis and four production wikis (T341110)]]
Change 962605 had a related patch set uploaded (by Pwangai; author: Pwangai):
[integration/config@master] Zuul: [mediawiki/extensions/ProofreadPage] Enable Sonar Codehealth
Thanks @MunizaA for adding the sha512 checksum for the new model binary in the task description. I have verified it and confirmed the integrity of the file that we uploaded to Swift. In the future, we will do this step before uploading to make sure the file wasn't tampered with or miscopied. :)
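For reference, the pre-upload check described above could look roughly like the following. This is only a minimal sketch using Python's standard `hashlib`; the function names `sha512_of_file` and `matches_published_checksum` are made up for illustration and are not part of any existing tooling, and the file path and published checksum are whatever applies to the model binary in question.

```python
import hashlib
import hmac
from pathlib import Path


def sha512_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with Path(path).open("rb") as handle:
        # Read in fixed-size chunks so large model binaries are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Compare the file's digest with a published one, ignoring case and surrounding whitespace."""
    actual = sha512_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Running `matches_published_checksum` against the local file before uploading, and again against a copy downloaded afterwards, would catch both tampering and a miscopied file, provided the published checksum itself lives somewhere that cannot be silently edited.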
Change 962591 merged by jenkins-bot:
[operations/mediawiki-config@master] clienthints: Enable display on testwikis and four production wikis
Change 962604 had a related patch set uploaded (by Pwangai; author: Pwangai):
[integration/config@master] Zuul: [mediawiki/skins/Nostalgia] Enable Sonar Codehealth
This has now been corrected