Steps to replicate the issue (include links if applicable):
- Open https://stream.wikimedia.org/v2/stream/recentchange and look for "type": "categorize" events with non-recent timestamps
- This happens often enough to be reproduced in https://stream.wikimedia.org/v2/ui/#/?streams=recentchange by searching the page with Ctrl+F
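The manual steps above can also be scripted. The following is only a minimal sketch, not an official client: it assumes each event arrives as a single server-sent-events "data:" line containing JSON, and the function names, the one-hour threshold, and the User-Agent string are placeholders chosen for illustration.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"


def parse_sse_data(line):
    """Return the decoded JSON payload of an SSE 'data:' line, or None."""
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if not payload:
        return None
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None


def categorize_age(event, now):
    """Return how old a categorize event's timestamp is, or None for anything else."""
    if not isinstance(event, dict) or event.get("type") != "categorize":
        return None
    stamped = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
    return now - stamped


def watch(max_age=timedelta(hours=1)):
    """Print categorize events older than max_age (hypothetical threshold).

    Needs network access and runs until interrupted.
    """
    request = urllib.request.Request(
        STREAM_URL,
        headers={
            "Accept": "text/event-stream",
            # Placeholder User-Agent; replace with your own contact details.
            "User-Agent": "stale-categorize-check/0.1 (example)",
        },
    )
    with urllib.request.urlopen(request) as response:
        for raw in response:
            event = parse_sse_data(raw.decode("utf-8", errors="replace"))
            age = categorize_age(event, datetime.now(timezone.utc))
            if age is not None and age > max_age:
                print(event["wiki"], event["title"], event["meta"]["dt"], age)
```

Calling watch() connects to the live stream and prints one line per stale categorize event; the two helper functions can be used on saved events without any network access.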
What happens?:
For example:
{ "$schema": "/mediawiki/recentchange/1.0.0", "meta": { "uri": "https://en.wiktionary.org/wiki/Category:Hakka_lemmas", "request_id": "ff9a1347-2eb1-49ee-8b28-20a4123e17c8", "id": "bc822b39-6850-4657-b067-eacefd7af560", "dt": "2024-04-07T01:42:44Z", "domain": "en.wiktionary.org", "stream": "mediawiki.recentchange", "topic": "codfw.mediawiki.recentchange", "partition": 0, "offset": 1544361708 }, "id": 130076381, "type": "categorize", "namespace": 14, "title": "Category:Hakka lemmas", "title_url": "https://en.wiktionary.org/wiki/Category:Hakka_lemmas", "comment": "[[:多久]] added to category, [[Special:WhatLinksHere/多久|this page is included within other pages]]", "timestamp": 1712454164, "user": "Tomascus", "bot": true, "notify_url": "https://en.wiktionary.org/w/index.php?diff=78770780&oldid=0&rcid=130076381", "server_url": "https://en.wiktionary.org", "server_name": "en.wiktionary.org", "server_script_path": "/w", "wiki": "enwiktionary", "parsedcomment": "<a href=\"/wiki/%E5%A4%9A%E4%B9%85\" title=\"多久\">多久</a> added to category, <a href=\"/wiki/Special:WhatLinksHere/%E5%A4%9A%E4%B9%85\" title=\"Special:WhatLinksHere/多久\">this page is included within other pages</a>" }
This was recorded at 2025-03-08T02:56Z. Judging by the title and the username, the corresponding diff appears to be https://en.wiktionary.org/w/index.php?title=%E5%A4%9A%E4%B9%85&diff=prev&oldid=78770780, which was made 11 months earlier.
For another example, recorded at the same time:
{ "$schema": "/mediawiki/recentchange/1.0.0", "meta": { "uri": "https://en.wiktionary.org/wiki/Category:Categories_calling_Template:auto_cat", "request_id": "c4af9333-373f-46c8-8cb0-0dd2b6c027c7", "id": "84a5909a-146f-4569-bdb7-a5e868ccd9ea", "dt": "2023-11-15T03:38:58Z", "domain": "en.wiktionary.org", "stream": "mediawiki.recentchange", "topic": "codfw.mediawiki.recentchange", "partition": 0, "offset": 1544361726 }, "id": 130076388, "type": "categorize", "namespace": 14, "title": "Category:Categories calling Template:auto cat", "title_url": "https://en.wiktionary.org/wiki/Category:Categories_calling_Template:auto_cat", "comment": "[[:Category:vi:Tangshan]] added to category", "timestamp": 1700019538, "user": "WingerBot", "bot": true, "notify_url": "https://en.wiktionary.org/w/index.php?diff=76668693&oldid=0&rcid=130076388", "server_url": "https://en.wiktionary.org", "server_name": "en.wiktionary.org", "server_script_path": "/w", "wiki": "enwiktionary", "parsedcomment": "<a href=\"/wiki/Category:vi:Tangshan\" title=\"Category:vi:Tangshan\">Category:vi:Tangshan</a> added to category" }
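To put numbers on the two events above, here is a small sketch that compares each event's "meta.dt" with its "timestamp" field and measures the gap to the time the events were recorded. The values are copied from the examples; the helper names are made up for illustration, and the recording time is taken as exactly 02:56:00Z since the seconds were not noted.

```python
from datetime import datetime, timezone

# Time the events were observed on the stream (seconds assumed to be 00).
RECORDED_AT = datetime(2025, 3, 8, 2, 56, tzinfo=timezone.utc)

# (meta.dt, timestamp) pairs copied from the two events above.
EXAMPLES = [
    ("2024-04-07T01:42:44Z", 1712454164),
    ("2023-11-15T03:38:58Z", 1700019538),
]


def parse_dt(value):
    """Parse the ISO 8601 'Z' format used in meta.dt into an aware datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def describe(dt_string, epoch):
    """Return (do meta.dt and timestamp agree?, whole days between event and recording)."""
    from_dt = parse_dt(dt_string)
    from_epoch = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return from_dt == from_epoch, (RECORDED_AT - from_dt).days


for dt_string, epoch in EXAMPLES:
    consistent, lag_days = describe(dt_string, epoch)
    print(dt_string, "fields agree:", consistent, "lag in days:", lag_days)
```

In both events the two fields carry the same old time, so neither of them reflects when the event was actually emitted to the stream.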
What should have happened instead?:
Category changes that are not caused by a page edit should not be attributed to one. They should carry a different timestamp, such as the time the page was re-parsed and the categorization actually changed.
Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):
Other information (browser name/version, screenshots, etc.):
I'm still not completely sure I understand what's going on here, beyond seeing a lot of categorize events with old timestamps. I think it's related to pages being purged and their templates being re-parsed, which causes category changes that are not associated with a direct page edit.