
CentralAuth extension: Code stewardship review
Open, High, Public

Description

What:

CentralAuth is the system in charge of creating MediaWiki accounts on the (public) Wikimedia Foundation wikis. Users can create an account, log in, and have that same account across the different language editions of Wikipedia, as well as on sister projects such as Wikidata, Commons, Meta-Wiki, Wiktionary, mediawiki.org etc.

Its internal auto-login mechanism is essential to cross-wiki tooling, such as:

  • viewing an article in a different language (thus technically being on a different wiki) while remaining logged in there, and thus being able to receive notifications and make contributions.
  • making contributions to Wikidata without leaving the Wikipedia article.
  • uploading files to Commons from within VisualEditor and MobileFrontend.
  • following links to feedback forms on meta.wikimedia.org or mediawiki.org and still being logged-in.
  • board votes, steward elections etc, via vote.wikimedia.org (SecurePoll).
  • etc.

Motivation:

This system has effectively been without a manager or tech lead for about ten years. Issues are piling up. This comes with numerous kinds of costs and risks:

  • Regularly at high risk of breakage. Its code is among the oldest and least maintained. It is at constant risk of breakage due to being very different from our current principles. E.g. if you change something in core, or far away from it in an extension, there's a good chance that even if no extension could reasonably be affected by your change, CentralAuth will be. This leads to train blockers, which tend to take days to be seen by anyone, given that no team (officially) looks after CentralAuth.
  • Death by a thousand cuts affecting diversity, accessibility and performance.
    • The world around us is in constant flux. This means that without regular upkeep, something that may've once been considered accessible and localised won't be for long. Devices change. Expectations change. And even without that external churn, changes do inevitably get made to CentralAuth to unbreak other things, and those changes are not done with the review or consultation of design/accessibility/language resources. For example, T237765 has been open and re-re-re-re-reported by users many times. This means we are excluding contributors before they even begin.
  • Blocks progress elsewhere. Best practices are not followed. Plans for holistic improvements and upkeep go nowhere. Small regressions add up and are not acted on. Given that CentralAuth is involved in virtually all page views, edits and other backend requests, its suboptimal performance affects all backend requests for other features as well. It also makes it difficult or impossible to improve our performance budgets, because CentralAuth violates them by default. Examples:
  • Negligence. For several months users have noticed, mentioned and reported warnings in the Chrome browser console about login.wikimedia.org cookies. Nobody picked up on this because it isn't anyone's job to. Now we have T252236 and little time left to figure out what to do.

Are there known security issues: Yes. I think the very definition of a Wikimedia-related security issue is almost synonymous with CentralAuth. It's what decides who's who and what they can do. It's what is in charge of founder rights (e.g. Jimbo), stewards, WMF staff, etc.

The full list of ongoing and recent security issues is too long to mention, so here are just a few.

  • Risk of breach: T112359, T197150, …
  • Risk of leaking private data or shocking data: T112320, T226212, T201568, …
  • Vulnerability to facilitate attacks against WMF elsewhere: T244682, …
  • Medium/Low?: T234371, T237274, …

Production outages or incidents: Yes. The most recent one I could find is T226840, where most articles could not be viewed by some logged-in users due to an HTTP header bug that would break the user's login session.

Does it have sufficient hardware resources for now and the near future (to take into account expected usage growth)?
TBD

Is it a frequent cause of monitoring alerts that need action, and are they addressed timely and appropriately?
Yes, it is regularly involved in prod errors that raise the alert levels. To my knowledge these are not currently reported, investigated or addressed (aside from CPT/RelEng during train deployments).

(Attached screenshot: Screenshot 2020-05-08 at 21.23.43.png)

When it was first deployed to Wikimedia production

  1. https://meta.wikimedia.org/wiki/Help:Unified_login

Usage statistics based on audience(s) served
TBD

Changes committed in last 1, 3, 6, and 12 months
TBD

Number of developers who committed code in the last 1, 3, 6, and 12 months
TBD

Number and age of open patches
TBD

Number and age of open bugs
TBD

Number of known dependencies?
TBD

Is there a replacement/alternative for the feature? Is there a plan for a replacement? No.

Submitter's recommendation (what do you propose be done?)

For an existing team to take ownership.

Event Timeline

Krinkle renamed this task from Code stewardship review: CentralAuth (login.wikimedia.org) to CentralAuth extension: Code stewardship review. May 8 2020, 8:56 PM
Jrbranaa triaged this task as High priority. Jun 3 2020, 7:09 PM
Jrbranaa moved this task from Backlog to Prioritized on the Code-Stewardship-Reviews board.

5 years and two days ago I outlined my incredibly simple plan to get rid of CentralAuth: https://www.mediawiki.org/wiki/User:Legoktm/evil-plans2.txt (loosely inspired, I believe, by brion's original evil-plans.txt).

The SULF and AuthManager projects got rid of an immense amount of tech debt, but I think eliminating CentralAuth is the ideal way to move forward. That's easier said than done, but I think the first step is to identify how exactly to break it up (SSO, wikisets, user rights, renaming, unification, ...) and where the pieces should land. Then, as things are moved, we can use that opportunity to refactor them, add tests, etc. Smaller chunks of code are also easier to maintain and let us spread the responsibility around.

Having spent some hours poking around this code while debugging T267273: WelcomeSurvey: Change hooks used for redirection (tl;dr: unknown numbers of users are being logged out immediately after creating an account on Wikipedia, not a great user experience), please, yes, this needs a steward, or needs to be removed/broken up per T252244#6339170. Assigning a team to maintain the extension in a meaningful way would be a good start, and they could then determine what to do with it.

@Jrbranaa is there an update on the code stewardship review for this extension?

Adding tech-decision-forum to take up discussion of this task and hopefully move us towards solutions.

For everyone's info, currently no Code-Stewardship-Reviews are taking place as there is no clear path forward and as this is not prioritized work.
(Entirely personal opinion: I also assume lack of decision authority due to WMF not having a CTO currently. However, discussing this is off-topic for this task.)