Page MenuHomePhabricator

"Expectation not met" warnings during SUL autologin autocreation
Closed, ResolvedPublic

Description

The logs have a bunch of warnings like

Expectation (masterConns <= 0) by MediaWiki\MediaWikiEntryPoint::restInPeace not met (actual: 1):

and

Expectation (writes <= 0) by MediaWiki\MediaWikiEntryPoint::restInPeace not met (actual: 1):

for various code paths (watchlist, site stats, Echo); about 20k / day. (logstash)

Screenshot Capture - 2025-03-06 - 21-36-29.png (578×1 px, 62 KB)

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

I thought rOPUP3a729e42cc58: Update CentralAuth multi-DC rules for SUL3, attempt 2 would fix this but seems like it doesn't, even though I can verify that the URLs seen in Logstash are forced to the primary DC now, e.g.

$ curl -vo/dev/null 'https://es.wikipedia.org/wiki/?do-not-cache=123' |& ack 'x-cache|server'
* ALPN: server accepted h2
< server: mw-web.eqiad.main-58fdbf89d-d4jpv
< x-cache: cp3070 miss, cp3070 miss
< x-cache-status: miss
< server-timing: cache;desc="miss", host;desc="cp3070"

tgr@tgr-ThinkPad:~$ curl -vo/dev/null 'https://es.wikipedia.org/wiki/Special:CentralAutoLogin/setCookies?returnUrlToken=<redacted>&type=redirect&usesul3=0' |& ack 'x-cache|server'
* ALPN: server accepted h2
< server: mw-web.codfw.main-7dffb485d6-n8qjf
< x-cache: cp3070 miss, cp3070 pass
< x-cache-status: pass
< server-timing: cache;desc="pass", host;desc="cp3070"

Per IRC discussion with @Krinkle, we should just apply the same warning silencing workaround here that we do for e.g. for returning from remote authentication. It's not perfect (ideally we wouldn't want to suppress warnings that are not autocreation-related), but avoids overcomplicating things and it is good enough (autocreations on GET are essentially random, so it won't deterministically suppress a problem in unrelated code).

Change #1126006 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/core@master] Silence TRX profiler in deferreds after autocreation

https://gerrit.wikimedia.org/r/1126006

Change #1126006 merged by jenkins-bot:

[mediawiki/core@master] Silence TRX profiler in deferreds after autocreation

https://gerrit.wikimedia.org/r/1126006

Change #1126628 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/core@wmf/1.44.0-wmf.19] Silence TRX profiler in deferreds after autocreation

https://gerrit.wikimedia.org/r/1126628

Change #1126630 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/core@wmf/1.44.0-wmf.20] Silence TRX profiler in deferreds after autocreation

https://gerrit.wikimedia.org/r/1126630

Change #1126628 merged by jenkins-bot:

[mediawiki/core@wmf/1.44.0-wmf.19] Silence TRX profiler in deferreds after autocreation

https://gerrit.wikimedia.org/r/1126628

Change #1126630 merged by jenkins-bot:

[mediawiki/core@wmf/1.44.0-wmf.20] Silence TRX profiler in deferreds after autocreation

https://gerrit.wikimedia.org/r/1126630

Mentioned in SAL (#wikimedia-operations) [2025-03-11T20:35:56Z] <jhuneidi@deploy2002> Started scap sync-world: Backport for [[gerrit:1126628|Silence TRX profiler in deferreds after autocreation (T388165)]], [[gerrit:1126630|Silence TRX profiler in deferreds after autocreation (T388165)]]

Mentioned in SAL (#wikimedia-operations) [2025-03-11T20:39:03Z] <jhuneidi@deploy2002> jhuneidi, tgr: Backport for [[gerrit:1126628|Silence TRX profiler in deferreds after autocreation (T388165)]], [[gerrit:1126630|Silence TRX profiler in deferreds after autocreation (T388165)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-03-11T20:49:02Z] <jhuneidi@deploy2002> Finished scap sync-world: Backport for [[gerrit:1126628|Silence TRX profiler in deferreds after autocreation (T388165)]], [[gerrit:1126630|Silence TRX profiler in deferreds after autocreation (T388165)]] (duration: 13m 05s)

Tgr reopened this task as Open.EditedMar 11 2025, 11:31 PM

Log volume is down by 10% or so since the patch was deployed. The rest is mostly due to CheckUser using a transaction callback, we might as well as fix that too.

There are a handful of warnings logged from DeferredUpdates, no idea what's up with those.

Screenshot Capture - 2025-03-16 - 21-18-12.png (562×1 px, 55 KB)

Change #1130321 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):

[mediawiki/core@master] Fully silence TRX profiler after autocreation

https://gerrit.wikimedia.org/r/1130321

Change #1130321 merged by jenkins-bot:

[mediawiki/core@master] Fully silence TRX profiler after autocreation

https://gerrit.wikimedia.org/r/1130321

Change #1130648 had a related patch set uploaded (by Bartosz Dziewoński; author: Gergő Tisza):

[mediawiki/core@wmf/1.44.0-wmf.21] Fully silence TRX profiler after autocreation

https://gerrit.wikimedia.org/r/1130648

Change #1130648 merged by jenkins-bot:

[mediawiki/core@wmf/1.44.0-wmf.21] Fully silence TRX profiler after autocreation

https://gerrit.wikimedia.org/r/1130648

Mentioned in SAL (#wikimedia-operations) [2025-03-25T14:06:10Z] <phuedx@deploy1003> Started scap sync-world: Backport for [[gerrit:1130648|Fully silence TRX profiler after autocreation (T388165)]]

Mentioned in SAL (#wikimedia-operations) [2025-03-25T14:12:46Z] <phuedx@deploy1003> phuedx, matmarex: Backport for [[gerrit:1130648|Fully silence TRX profiler after autocreation (T388165)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-03-25T14:22:33Z] <phuedx@deploy1003> Finished scap sync-world: Backport for [[gerrit:1130648|Fully silence TRX profiler after autocreation (T388165)]] (duration: 16m 22s)