Update iwlinks table on all Wikimedia wikis
Closed, Resolved (Public)

Description

Maintenance of the interwiki map requires being able to find all existing pages which use a given interwiki prefix. The `iwlinks` table is very convenient for this. However, it only includes pages that have been edited or null-edited since the iwlinks table was added, i.e. when MediaWiki 1.17 was installed. (Purging, by contrast, updates the page_touched date but does not appear to update the iwlinks table.)

See the following three database queries for confirmation of the above:

https://quarry.wmflabs.org/query/35250
https://quarry.wmflabs.org/query/35265
https://quarry.wmflabs.org/query/35266

This means that a lot of old pages are potentially missed in searches of the iwlinks table. See prior discussion here:

https://meta.wikimedia.org/w/index.php?title=Talk:Interwiki_map/Archives/2014#Can_the_iwlinks_database_table_be_trusted%3F

At the time of the discussion, @MZMcBride suggested instead null-editing every page. However, that seems like a Herculean task compared to running a maintenance script. Nobody ever got around to implementing either of these solutions.
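For reference, the kind of lookup described above can be sketched roughly as follows. This is only an illustration run against an in-memory SQLite stand-in, not the Quarry queries linked above. The column names (iwl_from, iwl_prefix, iwl_title, page_id, page_namespace, page_title) follow the MediaWiki schema; the helper names and the sample rows are hypothetical.

```python
import sqlite3


def pages_using_prefix(conn, prefix):
    """Return (page_id, namespace, title, link target) for one interwiki prefix."""
    sql = """
        SELECT page_id, page_namespace, page_title, iwl_title
        FROM iwlinks
        JOIN page ON page_id = iwl_from
        WHERE iwl_prefix = ?
        ORDER BY page_id, iwl_title
    """
    return conn.execute(sql, (prefix,)).fetchall()


def demo_connection():
    """Build a tiny stand-in database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE iwlinks (
            iwl_from INTEGER,
            iwl_prefix TEXT,
            iwl_title TEXT
        );
        INSERT INTO page VALUES (1, 0, 'Alpha'), (2, 0, 'Beta'), (3, 0, 'Gamma');
        -- Page 3 stands for an old page whose links were never recorded.
        INSERT INTO iwlinks VALUES
            (1, 'wikt', 'alpha'),
            (2, 'wikt', 'beta'),
            (2, 'mw', 'Help:Links');
    """)
    return conn


if __name__ == "__main__":
    for row in pages_using_prefix(demo_connection(), "wikt"):
        print(row)
```

A page with no iwlinks rows (page 3 in the sample data) never shows up in the result, whatever its wikitext contains, which is the gap this task is about.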

Event Timeline

I'm not sure if there's even an existing maintenance script that will do what I want. It looks like refreshLinks.php doesn't update the iwlinks table, except to remove entries from deleted pages. Further, the documentation for purgeList.php doesn't indicate that links are updated. Would running pywikibot touch.py genuinely be the best option here?

Btw, looks like "page_links_updated" (which I didn't realize existed) is the correct field to look at, not "page_touched". However, that wasn't added until MW 1.23.
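A rough sketch of how page_links_updated could be used to list pages whose link tables may be stale, again against an in-memory SQLite stand-in. It assumes the field is either NULL (links never refreshed since the field was added) or a 14-digit YYYYMMDDHHMMSS timestamp, so plain string comparison orders it correctly. The function name, the cutoff value and the sample rows are hypothetical.

```python
import sqlite3


def pages_with_stale_links(conn, cutoff):
    """Return pages whose links were never refreshed, or last refreshed before cutoff.

    cutoff is a 14-digit YYYYMMDDHHMMSS string (assumed timestamp format).
    """
    sql = """
        SELECT page_id, page_title, page_links_updated
        FROM page
        WHERE page_links_updated IS NULL
           OR page_links_updated < ?
        ORDER BY page_id
    """
    return conn.execute(sql, (cutoff,)).fetchall()


def demo_connection():
    """Build a tiny stand-in page table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page ("
        "page_id INTEGER PRIMARY KEY, page_title TEXT, page_links_updated TEXT)"
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [
            (1, "Alpha", "20200101000000"),  # refreshed recently
            (2, "Beta", None),               # never refreshed
            (3, "Gamma", "20100615120000"),  # refreshed before the cutoff
        ],
    )
    return conn


if __name__ == "__main__":
    # Hypothetical cutoff; substitute the date the table of interest was added.
    for row in pages_with_stale_links(demo_connection(), "20110101000000"):
        print(row)
```

Pages returned by such a query would be the candidates for a null edit or a link refresh; everything else has had its links reparsed at least once since the cutoff.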

Reedy changed the task status from Open to Stalled. Apr 30 2020, 8:09 PM

@Reedy: Which task or who is this task stalled on?

Lack of an obvious route forward; seems it will require at least some code changes

Then these code changes should be outlined or discussed in this task (or dedicated subtasks should be created).
I don't see what makes this stalled by definition...

Ok so someone needs to make a task, sure. It is discussed in the comments at least

Marking it stalled at least means that on the Wikimedia Interwiki workboard it’s not shown as ready to go

Umherirrender subscribed.

Looking at page_links_updated on the replicas for the public wikis, this looks to be in a good state these days.

Some rows could still be missing, as that happens sometimes. refreshLinks.php is the right maintenance script for this; that is discussed at T157670.