Virtually everyone agrees that the status quo, where authentication (authn) code and infra are not isolated from the rest of MediaWiki (3M+ LOC), is suboptimal in terms of security. We should really reduce the attack surface. Thanks to SUL3, login now mostly happens on the auth.wikimedia.org domain. This makes isolation much easier and illuminates what to do next.
Here is my draft brain dump on changing the floor of the disco while people are dancing:
- Short term:
- Set up a dedicated k8s namespace, let's call it mw-restricted, and redirect auth.wikimedia.org traffic to it.
- Fork the helm charts and strip out anything that doesn't make sense. In particular, there shouldn't be a way to communicate between mw-restricted and shellbox.
- Set an env var like MW_RESTRICTED in those containers
- Stop loading almost all extensions when the env var is set. We don't need the extensions that produce hieroglyph images or musical notation in Special:Login.
- Allowed extensions: CentralAuth, OATHAuth, EmailAuth, ConfirmEdit (more? AbuseFilter?).
- Going forward, any new extension that needs to be loaded in restricted mode should be rated as a higher security risk in the security readiness review.
- Set up a new set of dedicated CI jobs that mirror production, and maybe even add browser tests for login, to make sure nothing breaks when we stop loading code in restricted mode.
- Maybe fork the mw images and remove unneeded dependencies too, for example php8.1-wikidiff2
- Fork vendor/ into a new repo like vendor-restricted/ and remove everything that isn't needed.
- Fork core's autoload.php into something like autoload-restricted.php and only load classes that are needed during authn. It could be as simple as annotating such classes with @allow-in-restricted-mode and then letting generateLocalAutoload.php take care of the rest. Finding them shouldn't be hard: we can use arclamp logs.
- Refactor as much as possible to shrink the set of classes needed in restricted mode.
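The env-var gate and the extension allowlist from the short-term items could be sketched roughly as follows. This is a minimal Python illustration of the idea only, not MediaWiki code: the function names, the set of values treated as "off", and the way the flag is read are all assumptions, and the real change would live in the PHP configuration. The allowlist itself is the one proposed above, with the open questions (AbuseFilter and others) left out.

```python
import os

# Allowlist as proposed in the plan; undecided candidates are deliberately omitted.
RESTRICTED_ALLOWED = frozenset({"CentralAuth", "OATHAuth", "EmailAuth", "ConfirmEdit"})


def is_restricted(env=None):
    """Return True when the MW_RESTRICTED flag is set to a truthy value.

    Treating "", "0" and "false" as off is an assumption of this sketch.
    """
    env = os.environ if env is None else env
    return env.get("MW_RESTRICTED", "") not in ("", "0", "false")


def extensions_to_load(configured, env=None):
    """Return the extensions to load, filtered by the allowlist in restricted mode."""
    if not is_restricted(env):
        return list(configured)
    # Keep the configured order, drop everything not explicitly allowed.
    return [name for name in configured if name in RESTRICTED_ALLOWED]


if __name__ == "__main__":
    configured = ["Score", "CentralAuth", "wikihiero", "OATHAuth"]
    print(extensions_to_load(configured, env={"MW_RESTRICTED": "1"}))
    print(extensions_to_load(configured, env={}))
```

Using an allowlist rather than a denylist means a newly deployed extension stays out of restricted mode until someone adds it on purpose, which fits the point about the security readiness review.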
- Medium term: Once we are sure that everything dealing with authentication happens on auth.wikimedia.org (most notably login via API, change password, and entering "secure mode"):
- Move gu_password to a dedicated table.
- Make sure user_password is never written or read. Clean it up.
- Split the gu_password table out into a dedicated cluster, give it a dedicated user and password, and only set those credentials in restricted mode (complexity: restricted mode still needs to read from the normal tables, so we probably need two db users, or we need to allow the restricted db user to access the centralauth tables).
- Unset $wmgPasswordSecretKey in non-restricted mode and, even better, trigger an exception if it's accessed (I don't know if that's possible).
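On the last item, one way to get "unset, and throw on access" is to stop exposing the key as a plain value and hand out a small wrapper object instead. The sketch below shows that pattern in Python; `GuardedSecret` and `RestrictedSecretError` are hypothetical names, and whether the same shape fits how $wmgPasswordSecretKey is consumed in PHP is an open question this does not answer.

```python
class RestrictedSecretError(RuntimeError):
    """Raised when a secret is requested outside restricted mode."""


class GuardedSecret:
    """Holds a secret only in restricted mode and fails loudly otherwise."""

    def __init__(self, value, restricted):
        self._restricted = restricted
        # Outside restricted mode the value is dropped entirely, so there is
        # nothing to leak even if the guard is bypassed.
        self._value = value if restricted else None

    def reveal(self):
        """Return the secret, or raise if we are not in restricted mode."""
        if not self._restricted:
            raise RestrictedSecretError("secret accessed outside restricted mode")
        return self._value

    def __repr__(self):
        # Never print the secret in logs or debug output.
        return "GuardedSecret(<hidden>)"

    __str__ = __repr__


if __name__ == "__main__":
    key = GuardedSecret("example-key", restricted=False)
    try:
        key.reveal()
    except RestrictedSecretError as err:
        print("blocked:", err)
```

The exception gives an immediate, loggable signal if some code path outside restricted mode still tries to use the key, which is useful while migrating.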