
Use central login wiki for login (SUL3)
Open, Needs TriagePublic

Description

CentralAuth relies on the concept of a central login wiki (e.g. login.wikimedia.org for Wikimedia wikis), but (despite what the terminology might suggest) it's not where login actually happens. Users enter their credentials on their home wiki and get logged in there, and the central login wiki is used, through an elaborate series of redirects, to create a central session (essentially, a cookie stored on the central login wiki) and look it up to authenticate the user on other wikis.

This has the benefit that the user never leaves the wiki they are familiar with and the wiki community can customize the login and signup page (e.g. indicate username requirements specific to that wiki), and administrative tools such as blocks and AbuseFilter filters also behave more naturally. But it causes a lot of problems:

  • Since the user expects they might need to enter their password on any wiki (even more so with central login not working reliably), an XSS vulnerability on any wiki can easily be escalated into password theft. It's hard to meaningfully monitor hundreds of wikis, and these wikis are often by design less secure than the central login wiki (more people can deploy sitewide JS, there are more gadgets and so potentially more vulnerabilities, etc.). A separate wiki that's only used for entering login credentials could easily be locked down to the extent where XSS on another wiki in the same farm cannot be used to interfere with login. (T190019)
  • The redirects used for central login are indistinguishable from user tracking, from a browser's perspective; and browsers increasingly break it to prevent tracking. (T345249: Mitigate phase-out of third-party cookies in CentralAuth)
  • Related to the previous point, the authentication flows that browser vendors expect all involve a single SSO domain (without the historical baggage of CentralAuth, it's just a much more natural way of doing things) so that's what they build their heuristics on (e.g. they consider it suspicious if a domain wants to set cookies but the user has never interacted with it) and that assumption is built into web APIs (e.g. the concept of a service domain in T345589: Investigate the First-Party Sets / Related Website Sets browser API or the login flow in T335851: Investigate the Federated Credential Management browser API).
  • Various authentication-related data is stored per domain by the browser (e.g. cookies, passwords, WebAuthn credentials; see T248339: Decide how to deal with WebAuthn login/registration flow on Wikimedia wikis in future), so having many wikis severely degrades the user experience of authentication, and degrades privacy (e.g. there is no practical way to delete your authn cookies on all Wikimedia projects) and security (it's easier to notice you are being phished if you always log in on the same domain, e.g. because you can expect the password to be prefilled).

We need to disable local login and use a central, single location where the user interacts with credentials. This includes at least login, signup, password change, password reset; arguably also bot passwords and OAuth.


There are a couple ways to do this:

  • Minimal-effort version: Have Special:UserLogin etc. redirect to loginwiki, do the user interaction there (if the user is already logged in, probably just show a button to confirm identity, to fulfill interaction requirements for bounce tracking prevention heuristics), redirect back on success or failure and prove to the other wiki what happened. Everything else is kept as-is. (Note: the MediaWiki / CentralAuth code change needed for this is pretty small, but many bots and apps would have to be updated, so on the whole not a small project nevertheless.) Or maybe instead of redirection, embed the relevant loginwiki form as an iframe.
  • Use FedCM (T335851: Investigate the Federated Credential Management browser API) if it becomes available, which is vaguely similar (and requires a non-FedCM fallback in any case, for older browsers and non-browsers) but requires a couple extra APIs and a "popup" layout for the login form.
  • As above, but also rethink session handling (as the half dozen auth cookies used by MediaWiki don't make that much sense when login is happening elsewhere) and use something like JWT cookies (e.g. OAuth 2 access tokens) instead.
  • The same but (taking advantage of using an industry-standard format for sessions) replace loginwiki entirely with a small dedicated app to reduce attack surface (cf T120484: Create password-authentication service for use by CentralAuth). Note that this would affect a number of important steward workflows so it needs careful investigation.

These options could also be done as successive stages (at the cost of some overhead).


Event Timeline

Having one login domain will also help with things like passkeys and webauthn, which are domain bound.

That does also mean that moving login domains could lock out some WebAuthn users if we are not careful. We will have to take some extra steps to ensure these people can move over to the new situation.
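To illustrate why a domain move can lock people out: a WebAuthn credential is scoped to a relying party ID (RP ID), and a page can only use an RP ID that equals its own host or is a parent domain of it. The Python sketch below shows a simplified version of that rule. The function name is made up for the example, the check omits the public-suffix restriction that real browsers apply, and which RP IDs existing credentials on Wikimedia wikis were registered under is an assumption here, not something stated in this task.

```python
def rp_id_usable_on(rp_id: str, host: str) -> bool:
    """Simplified WebAuthn scoping rule: the RP ID must equal the host or be
    a parent domain of it. (Real browsers also reject public suffixes such
    as "org"; that check is left out of this sketch.)"""
    rp_id = rp_id.lower().rstrip(".")
    host = host.lower().rstrip(".")
    return host == rp_id or host.endswith("." + rp_id)


# A credential registered under a per-wiki RP ID cannot be used on a login
# domain in a different domain tree (hypothetical RP IDs, for illustration).
print(rp_id_usable_on("en.wikipedia.org", "login.wikimedia.org"))
print(rp_id_usable_on("wikipedia.org", "en.wikipedia.org"))
```

Under this rule, any credential whose RP ID is not the new login domain or one of its parent domains would have to be re-registered.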

FedCM is indeed Chrome-only at this time, and insufficient as the only solution. I suppose we could invest in it after completing one of the others. That is, if it would significantly improve the user experience (simpler or more reliable), and if it seems worth the added complexity and ongoing maintenance of having multiple ways to log in that vary by browser.

At a glance, based on reviewing Google's article on FedCM, it's not obvious to me that FedCM actually offers an improved user experience. The reliance on client-side JS calls, and seemingly uncontrolled proactive native prompts by the browser, could arguably be seen as annoying. Plus, it seems to be optimised for a JS-rendered application where you become logged in without a page reload. I suspect, if you turn the proactive prompt off and bind it to a click on "Log in" (is that possible?), and plug the JS callback to perform a page reload instead, you end up with essentially the same workflow as without FedCM, except with quite a bit more code to maintain, and with more ways for it to fail, and more differences between local/beta/prod.

Personally, I quite like the agency of deciding to click "Log in", and have that single click log you in (over redirects) and/or show your account status on loginwiki clearly. It offers people more agency, more transparency, and also looks more legitimate in that it builds on using the domain as trust authority. The sense I get is that FedCM is mainly intended for a longer tail of use cases when the "local" site is considered a third-party to the login provider, as a more user-friendly alternative to an OAuth popup, and heavily influenced by "content" sites that "want" you to be logged-in in order to let you "pay" with your personal information and withholding content unless you're logged in. That explains the proactive nag, and that nag is relatively useful in the local maximum of a content site withholding content unless you're logged in. There's not much to lose in that case, and even when viewing "free" content, such sites tend to still prefer you be logged-in to improve profiling, tracking, and personalised recommendations. None of that applies to Wikipedia, though.

There are a couple ways to do this:

  • Minimal-effort version: Have Special:UserLogin etc. redirect to loginwiki, […]
  • Use FedCM (T335851) […]
  • As above, but also rethink session handling […] and use something like JWT cookies […]
  • The same but […] replace loginwiki entirely with a small dedicated app […]

Moving logins to loginwiki might be a good first step here, even if we later try one of the others. It would allow us to decom a bunch of old code, and simplify things iteratively. Once things are sufficiently simple, we could then consider e.g. a standalone endpoint, or possibly a more stripped-down endpoint (e.g. akin to how api.php/Special:ApiHelp is stripped down as well, that has the benefit of keeping the app within the CentralAuth repository, and all the development/testing/deployment benefits that brings).

I think this approach also benefits from solving the immediate problem of third-party cookies first, and allowing most clean up to continue after the fact instead of being blocked by it, thus keeping us on-schedule, and without requiring additional risks as part of the first deliverable.

FedCM is indeed Chrome-only at this time

FWIW it's supported by Chrome, Edge and Opera (and the Firefox nightlies).

At a glance, based on reviewing Google's article on FedCM, it's not obvious to me that FedCM actually offers an improved user experience. The reliance on client-side JS calls, and seemingly uncontrolled proactive native prompts by the browser, could arguably be seen as annoying.

It provides functionality similar to the current (non-top-level) autologin: if the browser has used FedCM on the given domain in the past, and the user checked the "stay logged in" option, JS code can request a non-interactive login. It's a mild improvement in terms of generic login experience; where it could be more useful is XSS prevention. There are three main ways a hijacked gadget can be used for privilege escalation: have a privileged user directly do something via XSS AJAX, steal a session cookie, or fake a login form via JavaScript and steal a password. The first and second can be prevented by reauthentication before sensitive operations; the third will be prevented by doing logins on a separate wiki with non-customizable JS (given sufficient user awareness, I suppose). Interactive reauthentication is kind of disruptive, but requesting a fresh central session via FedCM without user interaction isn't. (This would basically be T208823: Support asynchronous reauthentication.) I haven't really thought this through yet but it seems doable.
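As a rough illustration of "reauthentication before sensitive operations", the gate could look something like the Python sketch below. It is not MediaWiki's implementation: the Session class, the 300-second threshold and the function names are assumptions made for the example. The point is only that a stolen session cookie or an XSS-driven request is not enough on its own once the last credential check is too old.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical threshold: how recently the user must have proven their identity.
REAUTH_MAX_AGE_SECONDS = 300


@dataclass
class Session:
    user: str
    last_auth_time: float  # Unix timestamp of the most recent credential check


def requires_reauthentication(session: Session, now: float,
                              max_age: int = REAUTH_MAX_AGE_SECONDS) -> bool:
    """True if the session's last credential check is older than max_age."""
    return now - session.last_auth_time > max_age


def perform_sensitive_action(session: Session, now: float,
                             action: Callable[[], str]) -> str:
    """Run the action only for a freshly authenticated session; otherwise
    signal that the caller must send the user through reauthentication."""
    if requires_reauthentication(session, now):
        return "reauthenticate"
    return action()
```

In the asynchronous variant described above, the "reauthenticate" branch would request a fresh central session in the background instead of interrupting the user with a login form.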

I suspect, if you turn the proactive prompt off and bind it to a click on "Log in" (is that possible?), and plug the JS callback to perform a page reload instead, you end up with essentially the same workflow as without FedCM, except with quite a bit more code to maintain, and with more ways for it to fail, and more differences between local/beta/prod.

The prompt isn't really proactive, you need to call a JS method. Some websites just initiate it aggressively. More importantly, you can attempt a login without a prompt, if the user gave permission in the past. See mediation options in the CM API (which FedCM is an extension of).

Personally, I quite like the agency of deciding to click "Log in", and have that single click log you in (over redirects) and/or show your account status on loginwiki clearly. It offers people more agency, more transparency, and also looks more legitimate in that it builds on using the domain as trust authority.

On the other hand, loginwiki is a wiki (it could be a standalone app, but as you point out, that's unwise as an immediate goal). That's potentially confusing to the user, who might believe they are still on the origin wiki, might leave comments on loginwiki etc.

The sense I get is that FedCM is mainly intended for a longer tail of use cases when the "local" site is considered a third-party to the login provider, as a more user-friendly alternative to an OAuth popup

Yes in the sense that everything is mainly intended for that, because that is the dominant SSO login scenario on the web. The reason we have problems in the first place is that our login flow is unusual and not something browsers actively try to accommodate.

and heavily influenced by "content" sites that "want" you to be logged-in in order to let you "pay" with your personal information and withholding content unless you're logged in.

That's how many sites use FedCM. I don't think there is anything in the API that's particularly adapted to that, though. It's constrained a lot by compatibility with the CM API, which was designed for sites doing their own login, so the site has a high level of control over how and when login is initiated. (Also it's a W3C spec and I think faring better than the more commercially-minded Chrome proposals. E.g. Mozilla is mostly supportive.)

Moving logins to loginwiki might be a good first step here, even if we later try one of the others. [...] I think this approach also benefits from solving the immediate problem of third-party cookies first, and allowing most clean up to continue after the fact instead of being blocked by it, thus keeping us on-schedule, and without requiring additional risks as part of the first deliverable.

Agreed - it is the simplest option, and mostly involves changes that we would need for the other options as well (except FedCM, but that needs a fallback). I think there are strong security reasons to not use MediaWiki for login in the long term; but they are not relevant to third-party cookie deprecations and those happen on a tight timeline so it would be a bad idea to try to do that right now.

Another thing to consider is the account creation (CreateAccount) process. Currently accounts cannot be created directly on loginwiki, and this task proposes that loginwiki handle that process as well.
However:

  • This would let global account creation bypass any local blocks (IP/IP range blocks, account blocks and autoblocks) and AbuseFilter.
  • If someone who could not create an account locally then tries to view a local wiki (if autologin works) or log in there, they will not be logged in on that wiki. Note that if the issue is a username disallowed by AbuseFilter, they can still edit the local wiki without logging in.

A simple solution is to make CentralAuth autocreation completely non-blocking, i.e. not affected by blocks or AbuseFilter, but that has risks. See also another task (to be created soon).

Login/out data from this wiki still needs to be available for a check user on the actual "user" wiki where the login/logout was initiated.

Login/out data from this wiki still needs to be available for a check user on the actual "user" wiki where the login/logout was initiated.

Then it is bypassable by abusers. An alternative solution is to make loginwiki CheckUser data available to all checkusers on all wikis.

Then it is bypassable by abusers. An alternative solution is to make loginwiki CheckUser data available to all checkusers on all wikis.

This would make local checkusers able to view data of any user globally, no?

loginwiki only contains information about user logins.

  • This would let global account creation bypass any local blocks (IP/IP range blocks, account blocks and autoblocks) and AbuseFilter.

More generally, do we treat the user as created on loginwiki and autocreated on the origin wiki, or as created on the origin wiki? This has all kinds of consequences - logged differently to Special:Log, different hooks get called, LocalUserCreated gets an $autocreated parameter which many extensions rely on to differentiate between "new user" and "existing user visiting a new wiki" behavior.

The technical implementation is very different, too - in the first case, we just need to change how we generate signup links; in the second case, we need to use a redirect-based authentication provider on the origin wiki, and (more problematically) do a signup flow without actually (in the MediaWiki sense) signing the user up on loginwiki. Or change CentralAuth to use the normal account creation flow if the account already exists on the login wiki but nowhere else.

I think the second would be much preferable, but I'm not sure how feasible it is - this is going to be one of the first things we'll need to investigate. (The same applies to login, too - should we use a redirect provider or some non-AuthManager mechanism? And for things like password change, do we want to add a redirect mechanism to AuthManager?)

To your narrower point, I would see ignoring blocks as more of a feature than a bug (although per above, I'd rather avoid it for architectural reasons). Blocks that prevent account creation are very problematic in terms of fallout - it's harder to communicate with the blocked person, there is no way to request an exemption, etc. And the benefit over normal blocks is minimal - it mostly just prevents adding abusive usernames to the registration log and ListUsers, and someone sophisticated enough to troll that way probably won't be stopped by narrow blocks.

Ignoring abuse filters, especially username filters (and other similar mechanisms, like the local title blacklist) would be more problematic though. I think the best approach there is to make loginwiki check if the account is registrable on the origin wiki and block account creation otherwise.

  • If someone who could not create an account locally then tries to view a local wiki (if autologin works) or log in there, they will not be logged in on that wiki. Note that if the issue is a username disallowed by AbuseFilter, they can still edit the local wiki without logging in.

I think we should 1) generally prevent central account creation if local account creation would not pass (it's hard to be 100% accurate about this but the common cases can be handled easily), 2) as you say, prevent wikis from blocking autocreation.

generally prevent central account creation if local account creation would not pass

That seems like a really silly idea if you cannot create an account from wiki A but can from wiki B (or by visiting loginwiki directly).

Meh, a simple alternative is to extend blocks from local wikis to loginwiki. But if that applies to all IP/range blocks, any wiki could prevent account creation from a specific range globally. Maybe extend autoblocks only, but they only last for one day, though there is a proposal to extend that: T43479: [Spam/vandalism] Raise $wgAutoblockExpiry noticeably. Note that, if I read the task correctly, the current blocker for increasing $wgAutoblockExpiry is not technical but social: we need to measure how effective increasing it would be.

More generally, do we treat the user as created on loginwiki and autocreated on the origin wiki, or as created on the origin wiki?

Note that if we choose the first one, there are many things to fix. One is NewUserMessage, which on many wikis is configured not to welcome autocreated accounts. Another is Growth, which provides a new user homepage that users are redirected to upon account creation.

and other similar mechanisms, like the local title blacklist

See also $wgTitleBlacklistBlockAutoAccountCreation (T35429), which is currently disabled in SUL wikis.

That seems like a really silly idea if you cannot create an account from wiki A but can from wiki B (or by visiting loginwiki directly).

Why? The overwhelming majority of users only use a single wiki (plus maybe Commons and Wikidata which have lax username policies) and don't care about the rest. So they need to meet one set of username policies but the rest of the username policies are completely irrelevant to them.

And the benefit over normal blocks is minimal - it mostly just prevents adding abusive usernames to the registration log and ListUsers

This is a pretty unnuanced statement even so far as to be blatantly wrong. Admins and CheckUsers regularly issue account creation blocks and they do stop disruption of the socking kind rather than the 'abusive user name' kind. And actually:

it's harder to communicate with the blocked person, there is no way to request an exemption

I don't really find this true either, local wikis have the workflows necessary, including that the block reason appears to the user who has been blocked, which explain how to deal with such blocks.

We should talk to anti-harassment team and stewards (from T354487)

Yes, I think you should. You should add the significant sized functionary teams on specific wikis as well (at a minimum English Wikipedia given your statements here and there), as steward practices do not reflect local wiki practices. I didn't CC Dreamy Jazz but did consider earlier -- they are a good start to that discussion given their maintenance of Ext:CU.

More generally, do we treat the user as created on loginwiki and autocreated on the origin wiki, or as created on the origin wiki? This has all kinds of consequences - logged differently to Special:Log, different hooks get called, LocalUserCreated gets an $autocreated parameter which many extensions rely on to differentiate between "new user" and "existing user visiting a new wiki" behavior.

Creation also needs to appear on a local wiki for a relevant CU. (NB, both this statement and my initial statement are current requirements and functionality of Ext:CU.) We aren't going to go chasing stewards for this information.

Admins and CheckUsers regularly issue account creation blocks and they do stop disruption of the socking kind rather than the 'abusive user name' kind.

You can't do much socking if you are blocked, even if you are allowed to create an account. I guess it can be used to mitigate the creation of long-term sleeper accounts somewhat.

it's harder to communicate with the blocked person, there is no way to request an exemption

I don't really find this true either, local wikis have the workflows necessary, including that the block reason appears to the user who has been blocked, which explain how to deal with such blocks.

The user doesn't actually see any block reason when they are blocked from autocreation, all they see is that they are inexplicably not logged in.

Yes, I think you should. You should add the significant sized functionary teams on specific wikis as well (at a minimum English Wikipedia given your statements here and there), as steward practices do not reflect local wiki practices. I didn't CC Dreamy Jazz but did consider earlier -- they are a good start to that discussion given their maintenance of Ext:CU.

Thanks for the suggestions. And yes we definitely need to talk to the maintainers of global extensions (CentralAuth, CheckUser, AbuseFilter etc) but we should probably come up with a high-level technical proposal first. I'm still fuzzy on the details of how we'd integrate the login flow into AuthManager on the loginwiki side.

You can't do much socking if you are blocked, even if you are allowed to create an account.

I have no idea what this is supposed to mean, since the plain text reading of this one is pretty wrong (else we wouldn't have the idea of socks :).

I guess it can be used to mitigate the creation of long-term sleeper accounts somewhat.

Auto blocks do drop off pretty quick, so yes, this is why we block both accounts and unregistered users with account creation blocked.

The user doesn't actually see any block reason when they are blocked from autocreation, all they see is that they are inexplicably not logged in.

Auto creation and account creation are two separate concepts...

You can't do much socking if you are blocked, even if you are allowed to create an account.

I have no idea what this is supposed to mean, since the plain text reading of this one is pretty wrong (else we wouldn't have the idea of socks :).

The status quo is that you do something bad, you get IP-blocked with a no-account-creation block (via autoblock, or checkuser, or because you did it as an anon), and that prevents you from creating a sock. In a hypothetical world where local blocks can't prevent account creation (as I said above, I don't think that's how it's going to work, but for the sake of discussion), you get IP blocked, and that doesn't prevent you from creating a sock, but then the sock is still blocked by the IP block and can't do anything. You can just wait it out until the block expires and abuse the account then, but you could do that in the original scenario just as well (only you'd have to delay account creation as well).
Am I missing something? (Other than the minor annoyances you can do with blocked accounts, like spamming offensive usernames, which is a legitimate problem, but there might be better ways of handling it.)

Auto creation and account creation are two separate concepts...

Yes but account creation blocks affect both.

The way I currently imagine it, SUL3 signup wouldn't be very different from current signup in this regard. Currently you go through the signup form on the local wiki, local and global restrictions get applied, the account gets created, and then the account gets autocreated on the central wiki via Special:CentralLogin. With SUL3, the local signup page would redirect you to the central signup page, the central wiki would fetch restrictions from the local wiki and apply them, the account would get created, you would be returned to the local wiki, and the account would get created there as well.

(That's two account creations, which is the messy part. But changing it to autocreation on the local wiki would have lots of disruptive UX effects, and on the central wiki we do need an account because we need to set the central session, because we can't reliably set session cookies without user interaction. Anyway, I think this is orthogonal to the concerns about blocks.)
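To make the imagined SUL3 signup flow a bit more concrete, here is a small Python sketch of "the central wiki would fetch restrictions from the local wiki and apply them". Everything in it is hypothetical: the wiki names, the example checks and the in-memory "accounts" store stand in for whatever AbuseFilter, title blacklist or block checks the real wikis would apply, and fetching restrictions is reduced to a dictionary lookup.

```python
from typing import Callable, Dict, List, Optional, Set

# A check takes a username and returns an error message, or None if acceptable.
UsernameCheck = Callable[[str], Optional[str]]

CENTRAL_WIKI = "loginwiki"


def too_long(name: str) -> Optional[str]:
    return "username is too long" if len(name) > 40 else None


def no_admin_names(name: str) -> Optional[str]:
    return "usernames containing 'admin' are not allowed here" if "admin" in name.lower() else None


# Hypothetical restrictions: one global rule, plus per-wiki local rules.
GLOBAL_CHECKS: List[UsernameCheck] = [too_long]
LOCAL_CHECKS: Dict[str, List[UsernameCheck]] = {
    "examplewiki": [no_admin_names],
    "otherwiki": [],
}


def central_signup(username: str, origin_wiki: str,
                   accounts: Dict[str, Set[str]]) -> List[str]:
    """Apply global restrictions plus those of the origin wiki. On success,
    record the account on the central wiki and on the origin wiki (the two
    account creations) and return an empty list; otherwise return the errors."""
    checks = GLOBAL_CHECKS + LOCAL_CHECKS.get(origin_wiki, [])
    errors = [e for e in (check(username) for check in checks) if e is not None]
    if errors:
        return errors
    accounts[username] = {CENTRAL_WIKI, origin_wiki}
    return []
```

With these example rules, signing up as "WikiAdmin" from examplewiki is rejected while the same name goes through from otherwiki, which is the "wiki A vs. wiki B" situation debated earlier in this thread.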

Auto blocks do drop off pretty quick

See T43479: [Spam/vandalism] Raise $wgAutoblockExpiry noticeably. Note that, if I read the task correctly, the current blocker for increasing $wgAutoblockExpiry is not technical but social: we need to measure how effective increasing it would be.

One thing to potentially look into is heuristic-based exceptions in browser third-party cookie policies (Chrome, Firefox, Safari). These won't stay in place forever, but might provide a simple solution for a while.

These won't stay in place forever, but might provide a simple solution for a while.

We should get in touch with developers of Chromium and Firefox so that they may adjust particulars based on our use cases.