
Remove wgLegacyEncoding feature of Revision/BlobStore
Open, Needs Triage (Public)

Description

Since version 1.5, MediaWiki stores all data in Unicode. Before that, the encoding was configurable; $wgLegacyEncoding allows current MediaWiki to work with old non-Unicode database values. This is one of our oldest pieces of technical debt. Retire it and automatically convert database rows on upgrade instead.

Event Timeline

How would attempts to load data from legacy text table entries be handled if the variable were removed? Simply erroring out would feel like scolding the user for having a wiki around for a long time and not being personally aware of internal implementation details.

(This is something that will likely hit very few wikis, but the ones it does hit would lose access to some of their data.)

Could we have a maintenance script that converts the data properly and is run automatically by update.php?
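To make the idea concrete, here is a rough Python sketch of what such a one-off conversion pass could look like. It is not MediaWiki code: it uses an in-memory-style SQLite connection as a stand-in for the real database, the function name `convert_legacy_rows` and the batch size are hypothetical, and it assumes every row without the `utf-8` flag holds plain, uncompressed text in the legacy encoding (rows that are gzipped, serialized, or stored externally would need extra handling that is left out here).

```python
import sqlite3


def convert_legacy_rows(conn, legacy_encoding, batch_size=1000):
    """Re-encode rows lacking the 'utf-8' flag; return how many were converted.

    Hypothetical helper for illustration only. Walks the table in primary-key
    order so each batch is a cheap range read, even though old_flags itself
    has no index.
    """
    converted = 0
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT old_id, old_text, old_flags FROM text "
            "WHERE old_id > ? ORDER BY old_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for old_id, old_text, old_flags in rows:
            last_id = old_id
            flags = [f for f in old_flags.split(",") if f]
            if "utf-8" in flags:
                continue  # already Unicode, nothing to do
            # Assumption: the blob is plain text in the legacy encoding.
            utf8 = bytes(old_text).decode(legacy_encoding).encode("utf-8")
            conn.execute(
                "UPDATE text SET old_text = ?, old_flags = ? WHERE old_id = ?",
                (utf8, ",".join(flags + ["utf-8"]), old_id),
            )
            converted += 1
        conn.commit()
    return converted
```

Because converted rows gain the `utf-8` flag, running the pass a second time is a no-op, which is the property an updater step run on every upgrade would need.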

Could do, yeah, though since text.old_flags isn't indexed, it may require a potentially slow scan through the table to look for affected rows... which is what I tried to avoid back in the day by doing the on-the-fly encoding conversion in the first place :)
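For readers unfamiliar with the on-the-fly approach, the following Python sketch shows the general shape of it. It is an illustration under assumptions, not the actual MediaWiki implementation: `decode_blob` is a hypothetical name, the flags are treated as a comma-separated list in which `utf-8` marks rows already stored as Unicode, and compression or external storage is ignored.

```python
from typing import Optional


def decode_blob(raw: bytes, old_flags: str, legacy_encoding: Optional[str]) -> str:
    """Decode one stored blob at read time (hypothetical helper).

    Rows flagged 'utf-8' are decoded as UTF-8. Unflagged rows are treated as
    legacy-encoded text, but only when a legacy encoding is configured.
    """
    flags = {f for f in old_flags.split(",") if f}
    if "utf-8" in flags or legacy_encoding is None:
        return raw.decode("utf-8")
    # Assumption: an unflagged row predates the switch to Unicode.
    return raw.decode(legacy_encoding)
```

The decision is made per row from data that is already being loaded, so no table scan is needed; the cost is that the legacy branch has to stay in the read path indefinitely.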

Note that $wgLegacyEncoding also affects interpretation of really old passwords, which we can't convert until the user attempts to log in because we only store the hash.

"Really old" apparently means they haven't changed their password since 2004 (the check for old encodings was added in rMW8f147fa900d1: committing Hendrik Brummermann's checkPassword() patch, plus some modifications…) and haven't logged in since 2013 (conversion on login was added in rMW95a8974c6bda: Added password hashing API).

This proposal has been selected for the Developer-Wishlist voting round and will be added to a MediaWiki page very soon. To the subscribers, or proposer of this task: please help modify the task description: add a brief summary (10-12 lines) of the problem that this proposal raises, topics discussed in the comments, and a proposed solution (if there is any yet). Remember to add a header titled "Description" to your content. Please do so before February 5th, 12:00 pm UTC.

Krinkle renamed this task from Kill wgLegacyEncoding to Kill wgLegacyEncoding feature of Revision/BlobStore. Jul 18 2019, 8:28 PM
Krinkle moved this task from Untriaged to BlobStore on the MediaWiki-Core-Revision-backend board.
Krinkle renamed this task from Kill wgLegacyEncoding feature of Revision/BlobStore to Remove wgLegacyEncoding feature of Revision/BlobStore. Oct 12 2019, 10:43 PM
Krinkle moved this task from Untriaged to Not yet on the Technical-Debt (Deprecation process) board.
In T128149#2958202, Anomie wrote:

Note that $wgLegacyEncoding also affects interpretation of really old passwords, which we can't convert until the user attempts to log in because we only store the hash.

"Really old" apparently means they haven't changed their password since 2004 (the check for old encodings was added in rMW8f147fa900d1: committing Hendrik Brummermann's checkPassword() patch, plus some modifications…) and haven't logged in since 2013 (conversion on login was added in rMW95a8974c6bda: Added password hashing API).

If there are even any users left who meet those criteria, I'm happy with them needing to reset their password by email, and if they don't have an email address, a sysadmin can take care of it.

Change 963103 had a related patch set uploaded (by Jforrester; author: Jforrester):

[mediawiki/core@master] [DNM] Drop wgLegacyEncoding entirely

https://gerrit.wikimedia.org/r/963103