
wrong timestamps of pages being included to categories
Open, Needs Triage | Public | BUG REPORT

Description

All the dates on which pages are said to have been added to a category are suddenly in February 2025.

Steps to replicate the issue (include links if applicable):

What happens?:

What should have happened instead?:

  • all inclusions from the beginning of Wikipedia until 2025-02-07 00:00:00: >3000 (almost all)
  • all inclusions from 2025-02-07 00:00:00 until 2025-02-09 00:00:00: <100
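The two counts above can be reproduced from a categorymembers result by counting entries whose timestamp falls inside a window. A minimal sketch, assuming the members have already been fetched as a list of dicts with a "timestamp" field in the API's ISO 8601 form; the function name and the sample rows are made up for illustration:

```python
from datetime import datetime

# ISO 8601 form used for timestamps in MediaWiki API responses.
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def count_in_window(members, start, end):
    """Count members with start <= timestamp < end (all ISO 8601 strings)."""
    start_dt = datetime.strptime(start, TS_FORMAT)
    end_dt = datetime.strptime(end, TS_FORMAT)
    return sum(
        1
        for m in members
        if start_dt <= datetime.strptime(m["timestamp"], TS_FORMAT) < end_dt
    )


# Invented sample rows, only to show the call shape.
sample_members = [
    {"pageid": 10, "title": "A", "timestamp": "2009-03-01T10:00:00Z"},
    {"pageid": 20, "title": "B", "timestamp": "2025-02-07T14:30:00Z"},
    {"pageid": 30, "title": "C", "timestamp": "2025-02-08T09:15:00Z"},
]

before = count_in_window(sample_members, "2001-01-01T00:00:00Z", "2025-02-07T00:00:00Z")
during = count_in_window(sample_members, "2025-02-07T00:00:00Z", "2025-02-09T00:00:00Z")
```

If the timestamps were intact, "before" would hold almost all entries and "during" only a few; the bug shows up as the opposite distribution.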

Software version

current dewiki:
1.44.0-wmf.15 (e73fe46)
2025-02-04T02:27:06

Other information (browser name/version, screenshots, etc.):

Event Timeline

It looks too correlated to be a coincidence, but reading migrateLinksTable, there is no way MediaWiki would touch the cl_timestamp column, since it's not in the code at all. Could it be because of write-both mode?

migrateLinksTable goes through the rows by page ID, and if you check https://de.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=Category:Gestorben_2000&cmsort=timestamp&cmnamespace=0&cmlimit=max&cmprop=title|timestamp|ids the results are clearly ordered by page ID too, so it's very likely caused by the script. The only problem is that the update function doesn't touch cl_timestamp at all. It could be either something in MediaWiki outside the maintenance script deciding that the row has changed and thus bumping the timestamp, or worse, a trigger in production rewriting it.
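The check described here (timestamp-sorted results also being in page ID order) can be sketched as follows. The query parameters are the ones from the URL above; "format=json" is an addition so the result is machine-readable, and the helper names are hypothetical. Fetching is left out, so the second function only needs the list of members from the response:

```python
from urllib.parse import urlencode

API_URL = "https://de.wikipedia.org/w/api.php"


def build_query_url(category):
    """Build the categorymembers query, sorted by cl_timestamp."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmsort": "timestamp",
        "cmnamespace": 0,
        "cmlimit": "max",
        "cmprop": "title|timestamp|ids",
        "format": "json",  # added here; not part of the URL in the comment
    }
    return API_URL + "?" + urlencode(params)


def is_sorted_by_pageid(members):
    """True if the members (already in timestamp order) ascend by page ID."""
    ids = [m["pageid"] for m in members]
    return all(a <= b for a, b in zip(ids, ids[1:]))


url = build_query_url("Category:Gestorben_2000")
```

A timestamp-sorted list that is also sorted by page ID suggests the rows were rewritten in one pass over page IDs, which is how the migration script iterates.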

Looking at the binlogs, I can confirm it's not directly from the maintenance script's write queries. All of the writes to cl_timestamp are from LinksTable::doWrites, and for whatever reason it has decided to bump the timestamp. That could be because the rows now have cl_target_id in them, but I'm not seeing that logic in the code (CategoryLinksTable class).
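To make the hypothesis concrete: if a links update compares the stored row with the desired row and treats any difference as a change, then filling in a new column would rewrite the row with a fresh timestamp. The sketch below is purely illustrative and is not the actual LinksTable or CategoryLinksTable logic (which, as noted, does not obviously do this); the function name and row layout are assumptions:

```python
def sync_rows(existing, desired, now):
    """Hypothetical sync: rewrite a row with timestamp `now` if any
    non-timestamp field differs from the desired row."""
    result = {}
    for key, want in desired.items():
        have = existing.get(key)
        payload = None
        if have is not None:
            # Compare everything except the timestamp itself.
            payload = {k: v for k, v in have.items() if k != "cl_timestamp"}
        if payload == want:
            result[key] = have  # unchanged row keeps its old timestamp
        else:
            result[key] = {**want, "cl_timestamp": now}  # rewritten row
    return result


# Invented rows: page 1 gains a cl_target_id, page 2 already has one.
existing = {
    1: {"cl_to": "Gestorben_2000", "cl_target_id": None, "cl_timestamp": "2010-05-01"},
    2: {"cl_to": "Gestorben_2000", "cl_target_id": 7, "cl_timestamp": "2012-07-15"},
}
desired = {
    1: {"cl_to": "Gestorben_2000", "cl_target_id": 7},
    2: {"cl_to": "Gestorben_2000", "cl_target_id": 7},
}
synced = sync_rows(existing, desired, now="2025-02-08")
```

Under this assumed behaviour, only the row whose cl_target_id was newly filled in gets the new date, which matches the pattern of nearly every row being bumped during the migration.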

Thanks for investigating this.
Is it possible to restore the old dates?

I don't think that'd be easy to do. It'll be quite a bit of work. It won't bump again though.

Oh, the timestamps have changed again, such that my second link (all inclusions from 2025-02-07 00:00:00 until 2025-02-09 00:00:00) now gives 0 results (and not >3300).

Currently, all inclusions from 2025-02-11 00:00:00 until 2025-02-10 00:00:00 give >3300 results instead.

"It won't bump again though."

Does this mean that the timestamps at least won't change again in the near future?

Sigh, I think this time it got bumped due to the addition of cl_collation_id values. It really shouldn't change anymore. If it does, then it's not a categorylinks refactor issue.

I also stumbled across this problem today. Do you still plan to restore the old values? And (more out of curiosity): Why do I still see one entry from 2024 in this list when everything was overwritten in February? (Hm. This article was moved on the day of the migration. A race condition maybe?)

It looks like it would be better to save old database tables in a trash bin before changing database structures.