RFC: Drop support for database upgrade older than two LTS releases
Open, MediumPublic

Description

  • Affected components: MediaWiki core
  • Engineer(s) or team for initial implementation: @Ladsgroup (In volunteer capacity)
  • Code steward: Platform team

Motivation

Currently, the MysqlUpdater class on master supports (at least on paper) upgrading from 1.2 (released on 2004-05-24, which predates the birth of some of our volunteer devs) to 1.35 (not released yet). There seems to be no plan to stop this, so it will continue forever.

As a result:

  • The number of database checks for MySQL is approaching 500 and the class is approaching 1,500 lines of code.
  • The archive of SQL patches (the maintenance/archives/ directory) is really large and unorganized.
  • On every update.php run, the system has to run all of these checks. (updatelog records that some updates are done, but as far as I can see on my localhost those are only maintenance script runs, not SQL checks.)
  • For lots of reasons, update.php must never be run in production, but it manages to sneak in, get run and cause outages: T157651: sql.php must not run LoadExtensionSchemaUpdates. So it would be nice to reduce the probability of causing issues if this kind of problem happens again.
  • Keeping the ability to upgrade from any point in time makes schema change logic quite complex:
    • You can't reintroduce an index under the same name (with different columns, for example). It has to have a different name, otherwise update.php (on master) removes the index and re-adds it on every run. The same goes for changing a field's data type twice: if you change it from blob to varchar and a couple of years later to varbinary, the system on master sees that the field type is not varchar, assumes it's blob and changes it to varchar; then the next check sees that it's not varbinary and turns it into varbinary. This happens on every update.php run. You might argue that we can remove the first schema change, but what if one of the updates in between depends on that data type?
    • If you remove a table, you need to remove all updates related to it, otherwise the update logic will break (like T230317: Error: 1146 Table 'valid_tag' doesn't exist when upgrading from an ancient MediaWiki version).
    • This complex logic was one of the reasons behind one of our biggest outages, in which a really important table got dropped. Because the wb_terms table (which was meant to be dropped from the code soon) was missing, Wikibase assumed the system was upgrading from a version that's 8 years old (predating wb_terms), so it dropped several tables to rebuild them. I.e. Wikibase mistook the future for the far past: T249565: Wikidata's wb_items_per_site table has suddenly disappeared, creating DBQueryErrors on page views
  • This logic never worked properly from old versions anyway, and it's well known among third-party users that it's unreliable for big jumps and large databases. They usually upgrade from one LTS to the next, multiple times, instead; given that we have VCS, that makes sense.
  • Due to this complex logic, writing proper tests for it is hard. There are some snapshots (in SQLite) that build the system, run the upgrade on it and check whether the result matches the current system, but this is pretty limited and doesn't cover upgrades from all releases.
  • The complex logic is not documented. It is stored as institutional knowledge with a low bus factor, which lots of devs have to explain and repeat to every new person doing a schema change for the first time (here's an example).
  • Technically, upgrading from 1.2 is impossible because the MySQL version that MediaWiki 1.2 needs is so different from the one 1.35 needs that the MySQL upgrade (with lots of data) would be non-trivial.
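
To make the "changing a field's data type twice" problem concrete, here is a minimal toy model in Python. It is only a sketch: modify_field, UPDATES and example_field are hypothetical names, and this is not how the *Updater classes are actually written. It only assumes that each check knows its own target type and nothing else.

```python
# Toy model of type-change checks that each only know their own target type.
# All names here are hypothetical; this is not MediaWiki's updater code.

def modify_field(schema, field, target_type, log):
    """Change `field` to `target_type` unless it already has that type."""
    if schema[field] != target_type:
        log.append(f"{field}: {schema[field]} -> {target_type}")
        schema[field] = target_type

# Updates accumulated over the years, applied in order on every run.
UPDATES = [
    ("example_field", "varchar"),    # old change: blob -> varchar
    ("example_field", "varbinary"),  # later change: varchar -> varbinary
]

def run_updates(schema):
    """Apply every known update and return the list of changes performed."""
    log = []
    for field, target_type in UPDATES:
        modify_field(schema, field, target_type, log)
    return log

schema = {"example_field": "varbinary"}  # database is already up to date
first = run_updates(schema)
second = run_updates(schema)
print(first)
print(second)
```

Even though the field starts out with its final type, both runs perform the same two changes (varbinary to varchar, then varchar back to varbinary), so the updater is never a no-op.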

Requirements
  • No current functionality for upgrading from one LTS to another should break.
  • Developer productivity for schema changes in core and extensions should improve, and the onboarding cost should go down.
  • The *Updater classes should stop growing without bound.

Exploration

Proposal:
On master, only support upgrading from the oldest LTS release that's not EOL'd yet (basically meaning two LTS releases). Remove all older updates and their .sql files, and make this clear in RELEASE-NOTES (add a dedicated section stating which releases upgrades are supported from).

The only downsides of the proposal:

  • It would be harder for people to upgrade from really, really old versions, since they would need to do it in jumps, but update.php is unreliable in that regard anyway.
  • The current archive of .sql files is actually a good library for finding the most similar ALTER TABLE to copy-paste, especially for DBMS engines other than the ones the dev is familiar with (like Postgres or Oracle for me). But this will be fully addressed by abstract schema changes (T191231: RFC: Abstract schemas and schema changes), which will come in the next couple of months.

Event Timeline

Ladsgroup created this task.Aug 6 2020, 1:48 AM
Ladsgroup edited subscribers, added: tstarling; removed: CC-TimidRobot.

I suggest moving the .sql files that are no longer needed to a separate folder rather than deleting them, both for reference as a library (as the task description points out) and in case someone wants to try upgrading manually from an earlier, unsupported version.

Izno added a subscriber: Izno.EditedAug 6 2020, 7:57 PM

So just for concrete example's sake (please rebut each as appropriate):

Upgrading to 1.35 (LTS)

  1. There would not be support for 1.23 (-3 LTS) to 1.35 (LTS)
  2. There would not be support for 1.24, 1.25, or 1.26 to 1.35 (LTS)
  3. There would be support for 1.27 (-2 LTS) to 1.35 (LTS)
  4. There would be support for 1.28, 1.29, 1.30 to 1.35 (LTS)
  5. There would be support for 1.31 (-1 LTS) to 1.35 (LTS)
  6. There would be support for 1.32, 1.33, or 1.34 to 1.35 (LTS)

Upgrading to 1.36 (LTS +1)

  1. There would not be support for 1.23 (-3 LTS) to 1.36 (LTS +1)
  2. There would not be support for 1.24, 1.25, or 1.26 to 1.36 (LTS +1)
  3. There would not be support for 1.27 (-2 LTS) to 1.36 (LTS +1)
  4. There would not be support for 1.28, 1.29, 1.30 to 1.36 (LTS +1)
  5. There would be support for 1.31 (-1 LTS) to 1.36 (LTS +1)
  6. There would be support for 1.32, 1.33, or 1.34 to 1.36 (LTS +1)

?

And, if 1.36 doesn't look like that with the regular half-year release cadence, what do 1.34 (LTS-1) and 1.37 (LTS+2) look like?

Since you mentioned MySQL version changes particularly, should that affect it? (Should PHP versions?)

Akuckartz added a subscriber: Akuckartz.

I suggest moving the .sql files that are no longer needed to a separate folder rather than deleting them, both for reference as a library (as the task description points out) and in case someone wants to try upgrading manually from an earlier, unsupported version.

I have thought about it, but I'm not sure it would be a good idea:

  • They would still show up in searches and create noise when, for example, we want to drop a table and search for its usages.
  • With things like git filter-repo or git subtree split, you can easily extract all of them from git history (it needs a git magician, but once the command is out, everyone can use it).
  • With abstract schema changes, you wouldn't need them at all: you would just provide a before and after snapshot of a table and it produces the change for you.
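
As a rough illustration of the before/after snapshot idea in the last point, here is a toy sketch in Python. It is not the abstract schema tooling from T191231; diff_table, the table name and the columns are all hypothetical, and the emitted SQL is generic Postgres-style text rather than something tuned for every supported DBMS.

```python
def diff_table(name, before, after):
    """Return ALTER TABLE statements that turn `before` into `after`.

    Both snapshots map column name -> column type. The SQL is a generic,
    Postgres-style illustration only.
    """
    statements = []
    # Columns present before but missing afterwards get dropped.
    for col in before:
        if col not in after:
            statements.append(f"ALTER TABLE {name} DROP COLUMN {col};")
    # New columns get added; columns whose type changed get altered.
    for col, col_type in after.items():
        if col not in before:
            statements.append(f"ALTER TABLE {name} ADD COLUMN {col} {col_type};")
        elif before[col] != col_type:
            statements.append(
                f"ALTER TABLE {name} ALTER COLUMN {col} TYPE {col_type};"
            )
    return statements

# Hypothetical snapshots of a hypothetical table.
before = {"ex_id": "INT", "ex_text": "TEXT", "ex_old": "INT"}
after = {"ex_id": "INT", "ex_text": "BYTEA", "ex_new": "INT"}
changes = diff_table("example", before, after)
for statement in changes:
    print(statement)
```

A real implementation would also have to handle indexes, defaults and per-DBMS syntax, which this sketch deliberately leaves out.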

And, if 1.36 doesn't look like that with the regular half-year release cadence, what do 1.34 (LTS-1) and 1.37 (LTS+2) look like?

That's a tough question. My personal preference is to have a rather simple rule and stick to it: upgrading must be possible from any version since the oldest LTS that has not been EOL'd at the time of the major release.
I would be okay with any better alternative here.

Since you mentioned MySQL version changes particularly, should that affect it? (Should PHP versions?)

Their versioning is different, and MySQL is just one of several DBMS engines we support, so that would make things really complicated.

Krinkle updated the task description. (Show Details)Aug 11 2020, 8:20 PM
daniel moved this task from P1: Define to P2: Resource on the TechCom-RFC board.Aug 12 2020, 8:21 PM
Leaderboard added a subscriber: Leaderboard.EditedAug 28 2020, 8:24 PM

How about changing the minimum from -2 LTS to -3 LTS? Existing documentation encourages users with really old versions to upgrade, but it is indeed kind of ridiculous to expect a perfect upgrade from a version as old as 1.2. I think -3 LTS is a reasonable balance: it gives users running old versions a one-step path to upgrade while keeping the scripts reasonable.

I often encounter users upgrading from much older versions, at least from the ~1.16 era.

I'm def ok with dropping pre-1.6 support. Anything before the big old/cur refactor is unlikely to work.

I do think we should have better testing of this.

If we do this, can we at least require code that detects that the DB is older than supported and instructs users to first upgrade to MediaWiki 1.xx and then do the current upgrade?
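
A minimal sketch of what such a guard could look like, in Python for brevity. It assumes the version the database was last upgraded to is already known; how to detect that is left open here. check_upgrade_source and the version tuples are hypothetical.

```python
def format_version(version):
    """Render a (major, minor) tuple as 'major.minor'."""
    return ".".join(str(part) for part in version)

def check_upgrade_source(detected, oldest_supported):
    """Return None if the upgrade may proceed, else a message for the user.

    Both arguments are (major, minor) tuples. How `detected` is obtained
    from an existing database is an open question this sketch ignores.
    """
    if detected >= oldest_supported:
        return None
    return (
        f"This database appears to come from MediaWiki "
        f"{format_version(detected)}, which is older than the oldest "
        f"supported upgrade source. Upgrade to MediaWiki "
        f"{format_version(oldest_supported)} first, then run this upgrade again."
    )

# Hypothetical example: a 1.16-era database, with 1.27 as the oldest source.
message = check_upgrade_source((1, 16), (1, 27))
print(message)
```

Returning a message instead of raising keeps the decision about aborting with the caller, which seems like the more flexible choice for an installer or a maintenance script.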

So just for concrete example's sake (please rebut each as appropriate):

Upgrading to 1.35 (LTS)

  1. There would not be support for 1.23 (-3 LTS) to 1.35 (LTS)
  2. There would not be support for 1.24, 1.25, or 1.26 to 1.35 (LTS)
  3. There would be support for 1.27 (-2 LTS) to 1.35 (LTS)
  4. There would be support for 1.28, 1.29, 1.30 to 1.35 (LTS)
  5. There would be support for 1.31 (-1 LTS) to 1.35 (LTS)
  6. There would be support for 1.32, 1.33, or 1.34 to 1.35 (LTS)

Upgrading to 1.36 (LTS +1)

  1. There would not be support for 1.23 (-3 LTS) to 1.36 (LTS +1)
  2. There would not be support for 1.24, 1.25, or 1.26 to 1.36 (LTS +1)
  3. There would not be support for 1.27 (-2 LTS) to 1.36 (LTS +1)
  4. There would not be support for 1.28, 1.29, 1.30 to 1.36 (LTS +1)
  5. There would be support for 1.31 (-1 LTS) to 1.36 (LTS +1)
  6. There would be support for 1.32, 1.33, or 1.34 to 1.36 (LTS +1)

I think you got this wrong. The way the proposal is worded, support for upgrades would continue for old LTS releases until a new LTS release bumps the oldest supported LTS release off the list.
In other words, I believe the release of 1.36 (non-LTS) doesn't affect the migration support for 1.27+.
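
This reading of the rule can be written down as a small Python sketch. It is one interpretation only, not an agreed policy: the function name and the window of two LTS releases are assumptions, and the LTS list is just the set of versions labelled LTS earlier in this thread.

```python
# LTS versions as labelled in this thread, as (major, minor) tuples.
LTS_RELEASES = [(1, 23), (1, 27), (1, 31), (1, 35)]

def oldest_supported_source(target, lts_releases=LTS_RELEASES, window=2):
    """Oldest version that `target` could still be upgraded from.

    The window is anchored on the newest LTS at or before `target`, so a
    non-LTS release does not move it. `window` is how many LTS releases
    before that anchor stay supported (assumed to be 2 here).
    """
    known = sorted(v for v in lts_releases if v <= target)
    if not known:
        raise ValueError("no LTS release at or before target")
    older = known[:-1]  # LTS releases before the anchoring one
    if not older:
        return known[-1]
    return older[-window:][0]

for target in [(1, 34), (1, 35), (1, 36)]:
    print(target, "->", oldest_supported_source(target))
```

Under these assumptions 1.35 and 1.36 both accept upgrades from 1.27 onwards, and the floor only moves when the next LTS is released.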

Michael added a subscriber: Michael.
Krinkle moved this task from P2: Resource to P3: Explore on the TechCom-RFC board.Fri, Sep 18, 2:59 AM

@Ladsgroup Looks like this is ready for Phase 3, moving it on your behalf.

@Ladsgroup Looks like this is ready for Phase 3, moving it on your behalf.

Thanks!