Run the new recountCategories.php script on all wikis. It has to be run three times on all wikis, once with --mode pages, once with --mode files and once with --mode subcats. You might want to pick a suitable value for --throttle as well.
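The three required passes for a single wiki could be sketched as below. This is a dry run that only prints the commands. The `mwscript` wrapper, `--wiki`, `--mode` and `--throttle` come from this task; the function name `recount_commands` and the throttle value `10` are placeholders for illustration, not recommendations.

```shell
# Dry-run sketch: prints the three invocations for one wiki instead of running them.
recount_commands() {
    wiki="$1"
    throttle="$2"   # hypothetical value; choose one suited to the wiki's size
    for mode in pages subcats files; do
        echo "mwscript recountCategories.php --wiki=$wiki --mode=$mode --throttle=$throttle"
    done
}

recount_commands testwiki 10
```

Printing the commands first makes it easy to review them before pasting them into a maintenance host shell.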
Description
Status | Subtype | Assigned | Task
---|---|---|---
Open | Feature | None | T18660 Database table cleanup (tracking)
Duplicate | | None | T18036 Number of category members (PAGESINCATEGORY) is inaccurate for large categories
Resolved | | TTO | T18765 Write a maintenance script to refresh category member counts
Resolved | | Urbanecm | T170737 Run recountCategories.php on Wikimedia wikis
Resolved | | None | T169964 Counter of the numbers of the pages on a category shows negative result
Resolved | | Urbanecm | T228585 Counter for pages in category is below zero
Duplicate | BUG REPORT | None | T272821 Chinese Wikipedia: category has -1 page
Event Timeline
Fun, yeah, I can.
Will look at running it on the test wikis today, and see how the output looks etc :)
reedy@terbium:~$ mwscript recountCategories.php --wiki=testwiki --mode=pages | tee ~/testwiki.log
Finding up to 500 drifted rows starting at cat_id 500...
Updating cat_pages field on 154 rows...
Finding up to 500 drifted rows starting at cat_id 500...
Done! Updated the pages counts of 154 categories.
Now run the script using the other --mode options if you haven't already.
Also run 'php cleanupEmptyCategories.php --mode remove' to remove empty, nonexistent categories from the category table.
reedy@terbium:~$ mwscript recountCategories.php --wiki=testwiki --mode=subcats | tee ~/testwiki.log
Finding up to 500 drifted rows starting at cat_id 500...
Updating cat_subcats field on 4 rows...
Finding up to 500 drifted rows starting at cat_id 500...
Done! Updated the subcats counts of 4 categories.
Now run the script using the other --mode options if you haven't already.
reedy@terbium:~$ mwscript recountCategories.php --wiki=testwiki --mode=files | tee ~/testwiki.log
Finding up to 500 drifted rows starting at cat_id 500...
Updating cat_files field on 10 rows...
Finding up to 500 drifted rows starting at cat_id 500...
Done! Updated the files counts of 10 categories.
Now run the script using the other --mode options if you haven't already.
reedy@terbium:~$ mwscript cleanupEmptyCategories.php --wiki=testwiki | tee ~/testwiki.log
...Update 'cleanup empty categories' already logged as completed.
reedy@terbium:~$ mwscript cleanupEmptyCategories.php --wiki=testwiki --force | tee ~/testwiki.log
Adding empty categories with description pages...
Removing empty categories without description pages...
The category named :Sub-Sub-Category_Bleah_tst is not valid?!
--mode=remove --begin=中文(简体)
Category cleanup complete.
reedy@terbium:~$
reedy@terbium:~$ mwscript recountCategories.php --wiki=test2wiki --mode=pages | tee ~/test2wiki.log
Finding up to 500 drifted rows starting at cat_id 500...
Updating cat_pages field on 25 rows...
Finding up to 500 drifted rows starting at cat_id 500...
Done! Updated the pages counts of 25 categories.
Now run the script using the other --mode options if you haven't already.
Also run 'php cleanupEmptyCategories.php --mode remove' to remove empty, nonexistent categories from the category table.
reedy@terbium:~$ mwscript recountCategories.php --wiki=test2wiki --mode=subcats | tee ~/test2wiki.log
Finding up to 500 drifted rows starting at cat_id 500...
Done! Updated the subcats counts of 0 categories.
Now run the script using the other --mode options if you haven't already.
reedy@terbium:~$ mwscript recountCategories.php --wiki=test2wiki --mode=files | tee ~/test2wiki.log
Finding up to 500 drifted rows starting at cat_id 500...
Done! Updated the files counts of 0 categories.
Now run the script using the other --mode options if you haven't already.
reedy@terbium:~$ mwscript cleanupEmptyCategories.php --wiki=test2wiki --force | tee ~/test2wiki.log
Adding empty categories with description pages...
Removing empty categories without description pages...
--mode=remove --begin=Statut_UICN_EN
Category cleanup complete.
reedy@terbium:~$
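One detail in the runs above: plain `tee FILE` truncates the file on every invocation, so each per-wiki log ends up holding only the last command's output. A minimal sketch of keeping all passes in one log with `tee -a` follows. The `echo` line is a stand-in for the real `mwscript` call, and the temporary file is a stand-in for a path such as `~/testwiki.log`; both are assumptions made so the snippet runs anywhere.

```shell
# Sketch: append each run's output to one log so earlier runs are not overwritten.
log_file="$(mktemp)"   # stand-in for a path such as ~/testwiki.log
for mode in pages subcats files; do
    # echo stands in for: mwscript recountCategories.php --wiki=testwiki --mode="$mode"
    echo "output of --mode=$mode" | tee -a "$log_file"
done
```

After the loop, `$log_file` holds the output of all three passes in order.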
Unfortunately line 101 of the script is wrong. It should be printing $this->minimumId. I don't know if that really matters though, it looks like that was just intended to let you see how long the script is taking, and I guess you'll be running it headless.
Ping @MaxSem regardless.
Not sure why this task was assigned to me; I don't have, and have never had, shell access.
For future reference, this was solved in T247215.
Probably… but there's now also T224321: Run populateCategory.php and I'm not sure what the difference is.
Mentioned in SAL (#wikimedia-operations) [2021-06-22T22:38:07Z] <urbanecm> mwscript recountCategories.php --wiki=eowiktionary --mode={pages,subcats,files} (T170737)
Mentioned in SAL (#wikimedia-operations) [2021-06-22T22:41:28Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript recountCategories.php --wiki=zhwiki --mode=pages # T170737
Mentioned in SAL (#wikimedia-operations) [2021-06-22T22:42:28Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ mwscript recountCategories.php --wiki=zhwiki --mode=subcats # T170737
populateCategory.php was designed to initially populate the category table when upgrading to MW 1.13. It was deleted in rMW0dacf7d68d8d517cada731375f9612d8e060db58.
recountCategories.php is the script that should be used.
The script has been applied to -eo- wiktionary and the broken categories seem fixed. Now we can think about a more permanent solution for the problem, in the form of running this script regularly or upon request, or by other means. Another wiki badly needing this was Commons. See T85696.
T85696: Allow action=purge to recalculate the number of pages/subcats/files in a category is definitely not a good permanent solution. If the miscounts still happen (i.e. it's not an ancient bug we just ran into), we should find out why they happen, and fix the bug.
Mentioned in SAL (#wikimedia-operations) [2021-06-23T12:15:16Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ foreachwikiindblist s2 recountCategories.php --mode=pages && foreachwikiindblist s2 recountCategories.php --mode=subcats && foreachwikiindblist s2 recountCategories.php --mode=files # T170737
Mentioned in SAL (#wikimedia-operations) [2021-06-23T12:26:15Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ foreachwikiindblist $SHARD recountCategories.php --mode=pages && foreachwikiindblist $SHARD recountCategories.php --mode=subcats && foreachwikiindblist $SHARD recountCategories.php --mode=files # T170737, SHARD=s5
Mentioned in SAL (#wikimedia-operations) [2021-06-23T12:35:38Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ foreachwikiindblist $SHARD recountCategories.php --mode=pages && foreachwikiindblist $SHARD recountCategories.php --mode=subcats && foreachwikiindblist $SHARD recountCategories.php --mode=files # T170737, SHARD=s6
Mentioned in SAL (#wikimedia-operations) [2021-06-23T12:46:17Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ foreachwikiindblist $SHARD recountCategories.php --mode=pages && foreachwikiindblist $SHARD recountCategories.php --mode=subcats && foreachwikiindblist $SHARD recountCategories.php --mode=files # T170737, SHARD=s7
Mentioned in SAL (#wikimedia-operations) [2021-06-23T12:59:22Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ foreachwikiindblist $SHARD recountCategories.php --mode=pages && foreachwikiindblist $SHARD recountCategories.php --mode=subcats && foreachwikiindblist $SHARD recountCategories.php --mode=files # T170737, SHARD=s3
Mentioned in SAL (#wikimedia-operations) [2021-06-23T13:27:42Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ foreachwikiindblist $SHARD recountCategories.php --mode=pages && foreachwikiindblist $SHARD recountCategories.php --mode=subcats && foreachwikiindblist $SHARD recountCategories.php --mode=files # T170737, SHARD=s4
Mentioned in SAL (#wikimedia-operations) [2021-06-23T14:53:36Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ foreachwikiindblist $SHARD recountCategories.php --mode=pages && foreachwikiindblist $SHARD recountCategories.php --mode=subcats && foreachwikiindblist $SHARD recountCategories.php --mode=files # T170737, SHARD=s8
Mentioned in SAL (#wikimedia-operations) [2021-06-23T14:54:29Z] <urbanecm> [urbanecm@mwmaint1002 ~]$ foreachwikiindblist $SHARD recountCategories.php --mode=pages && foreachwikiindblist $SHARD recountCategories.php --mode=subcats && foreachwikiindblist $SHARD recountCategories.php --mode=files # T170737, SHARD=s1
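The per-shard invocations logged above all follow one pattern, which could be generated as in the sketch below. It is a dry run that only prints the chained commands. `foreachwikiindblist` and the `--mode` values are taken from the log entries; the helper name `shard_commands` is hypothetical.

```shell
# Dry-run sketch: prints one chained command per shard rather than executing it.
shard_commands() {
    for shard in "$@"; do
        echo "foreachwikiindblist $shard recountCategories.php --mode=pages && foreachwikiindblist $shard recountCategories.php --mode=subcats && foreachwikiindblist $shard recountCategories.php --mode=files"
    done
}

shard_commands s1 s2 s3
```

Chaining with `&&`, as in the logged commands, means a later mode only starts if the previous pass over the shard exited successfully.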