Page MenuHomePhabricator

Remove SUL3 opt-in code from CentralAuth
Closed, ResolvedPublic

Description

There is a lot of code in CentralAuth that's specific to how we rolled out SUL3. Once the rollout is done and a couple weeks passed (so we are confident it won't be rolled back), that code will be a maintenance burden with no benefit, and we should remove it.

Maybe we want to keep isSul3Enabled() and simplify it to a simple per-wiki yes/no flag, for third-party wikis, but I'm not sure we want to do even that much, rather than just assuming SUL3 is always enabled. Although we have to keep in it at least a few places in the code for at least a year because of API backwards compatibility (e.g. T364829: Update Wikimedia apps to use central login domain).

(We shouldn't require users of CentralAuth to have a shared domain, that's a very Wikimedia-specific complex hack, but it should be possible to just point the shared domain prefix to a real wiki.)

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Should include this:

That also means we should probably revert rECAU3b628d0b69b0: Point autologin resourceloader module URL to `/start` endpoint - we can just use checkLoggedIn, always use it in SUL2 mode (we can flip that once SUL3 is deployed to most users) and have it fall back to the other mode.

RhinosF1 subscribed.

Maybe we want to keep isSul3Enabled() and simplify it to a simple per-wiki yes/no flag, for third-party wikis, but I'm not sure we want to do even that much, rather than just assuming SUL3 is always enabled. Although we have to keep in it at least a few places in the code for at least a year because of API backwards compatibility (e.g. T364829: Update Wikimedia apps to use central login domain).

I would quite like this. SUL3 is going to be a huge effort and if we can avoid having to do it at the same time as the 1.44 upgrade then that would be helpful. This would give us the flexibility of doing the migration any point between 1.44 and 1.45.

I know it's a Wikimedia centric extension but we're now a farm with 14,000 wikis and ~3 billion monthly requests (including automated & blocked traffic). It's not an easy feat.

SUL3 is three migrations in parallel:

The second and third shouldn't be interesting to smaller wiki farms. The one feature of the shared domain we rely on code-wise is that you can see which wiki the user came from, so we can redirect back there. There are also some baked-in assumptions around the shared domain using the same language, which should be trivial to fix.

So I think we can just have smaller wiki farms configure the shared domain to point to an actual wiki instead, and when that is done, add the source wiki as an extra URL parameter. That leaves B/C support for SUL2, which we can leave around for one more release (that's just a couple weeks away from the end of the Wikimedia deployment, anyway). After that, we drop SUL2 support, which in theory should only require a small configuration change from reusers, but there will be a bunch of small user-visible changes, so leaving them some time for the transition certainly seems nicer.

The third isn't of interested to me, we can change all wikis at once. I'd just like us to be able to plan it properly. The second kind of is because I'd love to use the SUL3 opportunity to get rid of the historical abomination that is using loginwiki for GlobalUserPage

Some of the opt-in code will need to stay until we stop doing the loginwiki fallback (T391284: Swap order of central autologin lookup for loginwiki and shared domain).

I think we should keep [support for SUL2 mode] for a few weeks at least for ease of testing of changes. To some extent we'll need to keep it as long as central autologin falls back to loginwiki. Plus we need to keep B/C for the login/signup APIs for quite a long time; right now we are doing that via the SUL2 / SUL3 opt-in mechanism (there might be a better way, not sure).

There's also the question of what to keep supporting for local development setups and third-party wikis. I think we should support SUL2 and SUL3 (i.e. login locally vs. login on the central domain) for the next release (which is nowish) only, with the ability to switch via configuration between them, and support both an actual wiki and a shared domain as the central domain indefinitely (as setting up a shared domain is more complex).

matmarex subscribed.

I think we can remove support for:

  • CentralAuthEnableSul3 'cookie' value (the cookie opt-in)
  • CentralAuthEnableSul3 'global-pref' value (the global preference opt-in)
  • Sul3RolloutUserPercentage (staged rollout for logins)
  • Sul3RolloutAnonSignupPercentage (staged rollout for signups)

We need to keep support for:

  • CentralAuthSharedDomainCallback false value, obviously (which disables SUL3 completely) – probably indefinitely, for sites like Miraheze, and much simpler setup for local testing
  • CentralAuthEnableSul3 'query-flag' value (usesul3 URL parameter) – although maybe we could drop the config setting, and keep the feature always enabled, I'm pretty sure we've been using that query param in many places as if it was always available. Needed for at least: WebAuthn migration T376021, small-scale testing of SUL2 things like T390514, and autologin fallback to loginwiki T391284.
  • CentralAuthEnableSul3 'always' value (enabling SUL3 by default, or disabling it by default so that you can test the configuration with the URL parameter)
  • Special cases that disable SUL3 in the API (for things like T390751)

Correct me if any of these are wrong. I'll start working on the patches in the meantime.

  • CentralAuthEnableSul3 'query-flag' value (usesul3 URL parameter) – although maybe we could drop the config setting, and keep the feature always enabled

Yeah I think the config setting could just become a boolean, with the query flag always overriding it. Originally I thought it's best if we can be confident that SUL3 code paths are definitely not invoked in production to limit the impact of e.g. security issues, which might or might not have made sense but definitely isn't a relevant consideration now.

I think the query flag needs to stay as long as CentralAuthEnableSul3 stays; migration and migration testing aside, it's needed to keep track of whether a given workflow is SUL2 or SUL3 (given that CentralAuthEnableSul3 is a per-wiki flag, and these are usually cross-wiki flows).

What could at some point be removed is the assumption that SUL2 and SUL3 use a different central domain. In SUL2, the central domain is only used for central login / central autologin; those don't really depend on whether SUL3 is enabled. Using https://auth.wikimedia.org/<wikiid>/wiki/Special:CentralAutoLogin for SUL2 should be fine. The question there is how long we want to keep the loginwiki fallback.
Maybe we should just rip that bandaid off as soon as we are confident we aren't going to switch back at large scale. I never added a proper migration step for the domain change - falling back to loginwiki won't result in an auth.wm.o session being created. So we don't win much by keeping it around for long.

Change #1135952 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/CentralAuth@master] Remove support for CentralAuthEnableSul3 'global-pref' option

https://gerrit.wikimedia.org/r/1135952

Change #1135953 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/CentralAuth@master] Remove support for Sul3RolloutAnonSignupPercentage option

https://gerrit.wikimedia.org/r/1135953

Change #1135954 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/CentralAuth@master] Remove support for Sul3RolloutUserPercentage option

https://gerrit.wikimedia.org/r/1135954

Change #1135955 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/CentralAuth@master] Remove support for CentralAuthEnableSul3 'cookie' option

https://gerrit.wikimedia.org/r/1135955

Change #1135956 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):

[mediawiki/extensions/CentralAuth@master] Remove last bits of SUL3 opt-in rollout code

https://gerrit.wikimedia.org/r/1135956

Change #1135952 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] Remove support for CentralAuthEnableSul3 'global-pref' option

https://gerrit.wikimedia.org/r/1135952

Change #1135953 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] Remove support for Sul3RolloutAnonSignupPercentage option

https://gerrit.wikimedia.org/r/1135953

Change #1135954 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] Remove support for Sul3RolloutUserPercentage option

https://gerrit.wikimedia.org/r/1135954

Change #1135955 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] Remove support for CentralAuthEnableSul3 'cookie' option

https://gerrit.wikimedia.org/r/1135955

Change #1135956 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] Remove last bits of SUL3 opt-in rollout code

https://gerrit.wikimedia.org/r/1135956

  • CentralAuthEnableSul3 'query-flag' value (usesul3 URL parameter) – although maybe we could drop the config setting, and keep the feature always enabled

Yeah I think the config setting could just become a boolean, with the query flag always overriding it.

Done in https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/1135964: Simplify remaining CentralAuthEnableSul3 options (I tagged it with the wrong task).

What could at some point be removed is the assumption that SUL2 and SUL3 use a different central domain. In SUL2, the central domain is only used for central login / central autologin; those don't really depend on whether SUL3 is enabled. Using https://auth.wikimedia.org/<wikiid>/wiki/Special:CentralAutoLogin for SUL2 should be fine. The question there is how long we want to keep the loginwiki fallback.
Maybe we should just rip that bandaid off as soon as we are confident we aren't going to switch back at large scale. I never added a proper migration step for the domain change - falling back to loginwiki won't result in an auth.wm.o session being created. So we don't win much by keeping it around for long.

Do you want to have a task for this, or is it just an idea?

I guess we started discussing that in T391270#10734915, so let's continue there.

Change #1137291 had a related patch set uploaded (by Krinkle; author: Derick Alangi):

[mediawiki/extensions/CentralAuth@master] SharedDomainUtils: Remove unused constructor params

https://gerrit.wikimedia.org/r/1137291

Change #1137291 merged by jenkins-bot:

[mediawiki/extensions/CentralAuth@master] SharedDomainUtils: Remove unused constructor params

https://gerrit.wikimedia.org/r/1137291