Implementation of T222224: RFC: Normalize MediaWiki link tables. See that ticket for details and reasoning.
Description
Event Timeline
I've reviewed the plan with @tstarling and we both think it's ready to implement. I believe the latest iteration of the proposal is not cross-cutting or otherwise in need of wider consultation, as it does not affect public APIs, nor should it affect extensions or other features and engineering teams.
This will be a big project and take some time to complete. I've confirmed with @Ladsgroup that the Data Persistence team has approved resourcing for this in the current and coming quarters to see it through to completion in collaboration with Performance.
Our database internals are not part of the Stable interface, so any extensions querying or writing here directly are by definition unsupported. I'm not aware of existing technical debt in core or bundled/deployed extensions that does this.
As with any schema change, one notable case outside production where these databases are regularly queried is Toolforge. This will need coordination with Technical Engagement on communicating these changes, and possibly also on offering an intermediary database view for compatibility (to be determined). Depending on how fast our migration goes, and on the availability and resourcing of TechEng, I anticipate that the final state of the migration (where we stop writing to the old table column) may have to be postponed to fit their schedule. It is unlikely we will get there soon, though, so I think this is fine to start working on. I recommend that we estimate now in which quarter the last phase will be reached, then ask TechEng in which quarter they can help us with that part of the rollout, and adjust as needed.
Correct. I'm planning to send an announcement about this soon-ish, so I'll see whether we need to provide a view or can simply drop it. I doubt it'd be too popular, but I'll announce it soon regardless.
This is for all MediaWiki installations; the migration will happen as part of the normal update.php process for users who run that, and otherwise the maintenance script will be manually runnable (such as for Wikimedia ourselves).
Yes. From 1.39 onwards, new installations will use the new schema. It won't make it into 1.38, though.
I don't think so. There is not much to gain because there is not much duplication: the redirect table is small, and normalization comes with an overhead that makes it not worth it.
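The duplication argument above can be illustrated with a minimal sketch. It is not MediaWiki code: the function names, the tuple layout, and the sample rows are all hypothetical, and it only assumes that normalization means replacing a repeated (namespace, title) pair with a numeric ID from a lookup table. The ratio it computes is the average number of link rows per distinct target, so a value close to 1.0 means there is little for normalization to save.

```python
from typing import Dict, List, Tuple

# Hypothetical row shape: (source page ID, target namespace, target title).
Link = Tuple[int, int, str]


def normalize_links(
    links: List[Link],
) -> Tuple[Dict[Tuple[int, str], int], List[Tuple[int, int]]]:
    """Split link rows into a target lookup table and (source, target_id) rows."""
    targets: Dict[Tuple[int, str], int] = {}
    rows: List[Tuple[int, int]] = []
    for source_id, namespace, title in links:
        key = (namespace, title)
        # Assign the next numeric ID the first time a target is seen;
        # later occurrences reuse the existing ID.
        target_id = targets.setdefault(key, len(targets) + 1)
        rows.append((source_id, target_id))
    return targets, rows


def duplication_ratio(links: List[Link]) -> float:
    """Average number of link rows per distinct target (1.0 means no duplication)."""
    targets, rows = normalize_links(links)
    return len(rows) / len(targets) if targets else 0.0


# Made-up data: many pages pointing at the same few targets...
heavily_shared = [(1, 10, "Infobox"), (2, 10, "Infobox"), (3, 10, "Infobox"), (4, 10, "Citation")]
# ...versus one distinct target per source page.
mostly_unique = [(1, 0, "A"), (2, 0, "B"), (3, 0, "C")]

print(duplication_ratio(heavily_shared))
print(duplication_ratio(mostly_unique))
```

In the second data set every target is distinct, so the lookup table would hold as many rows as the link table itself, and the extra join would buy nothing.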
Hi, just a reminder to update the schema in https://commons.wikimedia.org/w/index.php?title=File:MediaWiki_database_schema_latest.svg&redirect=no when the work is finished (or perhaps also after each of the normalizations?)
@Dvorapa I no longer update these in SVG form. Instead, we now have https://www.mediawiki.org/wiki/Manual:Database_layout/diagram which can be quickly updated on-wiki by developers with the procedure largely automated now. We publish this twice a year after a major release. It is not published for alpha commits.
Developers who build atop the alpha software prior to release may consult the schema files directly as needed.
I see. This should be mentioned on the image page, and perhaps the redirect from Commons should be changed too.
@Ladsgroup: I saw that in Scribunto, protocol-relative links are output by default, at least by the mw.title generator (maybe by others as well), for example in mw.title.new( 'Example' ):fullUrl( 'action=edit' ). Is this a problem that needs to be fixed in Scribunto? I read the email from T335819 and it’s a bit confusing. It says there that the table no longer stores those links in HTTP, but also this:
If your wiki heavily uses proto-relative URLs in articles' wikitext, we recommend changing them to https instead which also improves storage as every proto-relative URLs takes up two rows.
I just thought I’d let you know since obviously Lua use is very widespread in templates.
Thanks for the pointer. To my knowledge local domains are not stored in externallinks at all (which has confused me a lot multiple times) so this shouldn't be an issue. Do you see it being recorded?
Judging by https://ru.wikipedia.org/w/index.php?title=Служебная:Поиск_ссылок&limit=500&offset=0&target=https%3A%2F%2Fru.wikipedia.org%2F you are correct. Then it doesn’t need fixing, I guess.