Database maintenance map not working
Closed, Resolved (Public)

Description

It looks like the map is not logging anything (even the days are not up to date); it stopped on the 31st of March:

Captura de pantalla 2025-04-08 a las 15.03.10.png (486×848 px, 71 KB)

There were definitely events today, as @FCeratto-WMF is running a schema change for T391056, which is producing !log entries like any other schema change.
I also ran this in the morning, and it never appeared either:

06:45 marostegui: Upgrade ms2 to MariaDB 10.11 codfw eqiad dbmaint T391317

Event Timeline

Marostegui moved this task from Triage to In progress on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2025-04-08T13:08:29Z] <marostegui> TEST maintenance s1 eqiad dbmaint T391346

wikitech-maint-map |         continuous          |      Specified command fails to run

User "None" does not have required user right "edit" on site wikitech:en.

I updated pywikibot. Maybe that fixes it. It's probably something to do with SUL3.

That didn't fix the issue. When I try the script locally, I get a 429 from Varnish :/

ERROR: 172.16.3.12 != Dexbot after APISite.login() and successful ClientLoginManager.login()

Something broke in login on wikitech around March 31. @Tgr, do you know what could have been deployed around that time? The bot logs in via bot password and that works just fine, but afterwards it doesn't stay logged in. This affects only wikitech; all other wikis are fine.

This is not broken on our side, and I have no way to fix it unless someone on the MediaWiki side fixes it. Basically every Pywikibot-based bot on wikitech is broken.

Sorry I missed the ping earlier.

I tested bot login via Special:ApiSandbox and it seems to work fine. Someone will have to check pywikibot and see what exactly is happening to it. We don't have much logging for non-login API requests, so I can't even tell whether the bot is trying to make any requests.

Wikitech is using the global bot password table now, so bots need to reset their passwords, but I guess the bot in question must have done that already, otherwise the login would not have succeeded?

In general I'd recommend using OAuth 2 owner-only consumers over bot passwords, it's simpler and much less fragile. But without knowing what's happening, no idea if it would help.
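For illustration, a minimal sketch of what an owner-only OAuth 2 request could look like, using only the Python standard library. It assumes the access token issued when the owner-only consumer was registered is sent as a bearer token; the endpoint URL and the user agent string are assumptions for the example, not values taken from this task.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; adjust for the wiki the bot edits.
API_URL = "https://wikitech.wikimedia.org/w/api.php"


def build_userinfo_request(access_token, api_url=API_URL):
    """Build (but do not send) a userinfo query authenticated with a bearer token."""
    query = urllib.parse.urlencode(
        {"action": "query", "meta": "userinfo", "format": "json", "formatversion": "2"}
    )
    return urllib.request.Request(
        f"{api_url}?{query}",
        headers={
            "Authorization": f"Bearer {access_token}",
            # Hypothetical user agent; use one that identifies your bot.
            "User-Agent": "example-maint-map-bot/0.1",
        },
    )


def fetch_userinfo(access_token, api_url=API_URL):
    """Send the request and return the userinfo dict (needs network access)."""
    request = build_userinfo_request(access_token, api_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["query"]["userinfo"]
```

Because every request carries the token, there is no login step and no session cookie that could be lost between requests.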

> Something got broken in login on wikitech. It got broken around March 31. @Tgr Do you know what could have been deployed around that time?

The cookie domain changed a week before (1129215 on the 19th, 1129845 on the 20th, 1130593 and 1130607 on the 24th). In theory it shouldn't affect bot login.

SUL3 rollout for logins was also on the 24th (1130121). API logins are categorically excluded from SUL3 (and in any case the login seems to be working).

There was a config change (1131481) on the 27th that in theory only affected temp user creation.

Nothing else comes to mind.

Logins start spiking on April 1, exactly at UTC midnight. Bot password cookies are valid until the client discards the cookie or the session store discards the session; the bot (IIRC) never clears cookies, and the session store has 24h retention, so that would implicate one of the software changes on the 31st. None of the changes that day stand out: the SUL3 change was for group 2, and the EmailAuth changes don't affect bot logins.

From what I'm seeing in the pywikibot logs, it goes like this: the bot logs in, the login works just fine, and pywikibot records that it has logged in and keeps the session. The next API request is answered as an anonymous session, and pywikibot errors out.
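The check in that sequence can be sketched as a small helper that inspects a `meta=userinfo` response. This is an illustration of the idea, not Pywikibot's actual code; the function names are made up for the example, and the response shapes assume the usual `anon` marker and `id` of 0 for anonymous sessions.

```python
def session_user(userinfo_response):
    """Return the logged-in user name, or None for an anonymous session."""
    info = userinfo_response.get("query", {}).get("userinfo", {})
    anon_flag = info.get("anon")
    # The anon marker is "" in the legacy JSON format and True with formatversion=2.
    is_anon = (anon_flag is not None and anon_flag is not False) or info.get("id") == 0
    return None if is_anon else info.get("name")


def check_login(userinfo_response, expected_user):
    """Raise if the session does not belong to expected_user; otherwise return the name."""
    actual = session_user(userinfo_response)
    if actual != expected_user:
        shown = userinfo_response.get("query", {}).get("userinfo", {}).get("name")
        raise RuntimeError(f"{shown} != {expected_user}: session is not authenticated")
    return actual
```

Fed the kind of response seen here (an IP address as the name), `check_login` raises even though the preceding login call reported success.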

> Wikitech is using the global bot password table now, so bots need to reset their passwords, but I guess the bot in question must have done that already, otherwise the login would not have succeeded?

Yeah, it uses the shared bot password.

Nothing specific in pywikibot (upstream) wikitech config either.

> From what I'm seeing in the pywikibot logs, it goes like this: the bot logs in, the login works just fine, and pywikibot records that it has logged in and keeps the session. The next API request is answered as an anonymous session, and pywikibot errors out.

Yeah, that much is obvious from the error message - Pywikibot makes a userinfo query as part of its login, to assert success, and that shows the user as anonymous.

Does that request contain a session cookie though? Are there Set-Cookie headers in the response? Do you have a request ID for it? Can you add an X-Wikimedia-Debug header to trigger debug logging?
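One way to answer the first two questions is to capture the headers of the failing exchange and reduce them to cookie names, so nothing secret ends up in a paste. A rough sketch follows, using the standard library's cookie parser; the helper names and the cookie names in the usage note are hypothetical, and how the headers are captured depends on the HTTP client in use.

```python
from http.cookies import SimpleCookie


def cookie_names(header_value):
    """Return the sorted cookie names in a Cookie or Set-Cookie header value."""
    jar = SimpleCookie()
    jar.load(header_value or "")
    # Attributes such as Path or HttpOnly are stored on the morsel, not as keys.
    return sorted(jar.keys())


def summarize_exchange(request_headers, set_cookie_values):
    """Summarize which cookies were sent, which were set, and the debug header.

    request_headers: dict of the outgoing request headers.
    set_cookie_values: list with one entry per Set-Cookie response header.
    """
    return {
        "sent_cookies": cookie_names(request_headers.get("Cookie", "")),
        "set_cookies": sorted(
            name for value in set_cookie_values for name in cookie_names(value)
        ),
        "debug_header": request_headers.get("X-Wikimedia-Debug"),
    }
```

For example, `summarize_exchange({"Cookie": "exampleSession=abc"}, ["exampleSession=xyz; Path=/; HttpOnly"])` reports `exampleSession` as both sent and set, with no debug header on the request.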

I fully emptied the cookie jar. Let's see if that could make a difference.

Still failing:

ERROR: 172.16.6.30 != Dexbot after APISite.login() and successful ClientLoginManager.login()

Can reproduce it but PWB logs aren't helpful. I guess I will have to set up a Python debugger to understand what's going on.
(Normal password-based login works BTW.)

I switched it to a normal password, which is not great but fixes the problem for now. Thanks for the tip. I'm keeping this open to see what's going on with pywikibot and bot passwords on wikitech only.

You can use OAuth, which is better (security-wise at least) and isn't really affected by domains.

Ladsgroup lowered the priority of this task from High to Medium. (May 30 2025, 11:56 AM)

The immediate issue is fixed

Ladsgroup raised the priority of this task from Medium to High.
Ladsgroup moved this task from In progress to Done on the DBA board.

Actually, I just filed T395670: pywikibot can't login in wikitech with bot password, and I am closing this.

> Can reproduce it but PWB logs aren't helpful. I guess I will have to set up a Python debugger to understand what's going on.
> (Normal password-based login works BTW.)

What happens is that the bot forces an API:Userinfo call, and the userinfo name is an IP, which means the current user is not logged in. The log file(s) would be helpful.
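As a quick way to spot this condition in logs, the name on the left side of the error message can be tested for being an IP address. This is a small sketch with a made-up helper name, not something Pywikibot provides:

```python
import ipaddress


def is_ip_name(name):
    """Return True if a userinfo name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name.strip())
    except ValueError:
        return False
    return True
```

Both names from the errors above, `172.16.3.12` and `172.16.6.30`, are classified as IPs, while `Dexbot` is not.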