
Empty categories in Non-empty category on Commons
Closed, Resolved (Public)

Description

Since 8 October, empty redirected categories have behaved as non-empty ones.

For example:
[[:Category:Album]] shows 1 file, but it is empty
[[:Category:Animation]] shows 2 files
[[:Category:Demolition]] shows 4 files
and many others...

Each hour there are more such categories in Non-empty category redirects.

Event Timeline

Wieralee raised the priority of this task from to Needs Triage.
Wieralee updated the task description.
Wieralee added a project: Commons.
Wieralee added a subscriber: Wieralee.
zhuyifei1999 set Security to None.
zhuyifei1999 moved this task from Incoming to Backlog on the Commons board.
zhuyifei1999 added a subscriber: zhuyifei1999.
Aklapper renamed this task from Empty categories in Non-empty category to Empty categories in Non-empty category on Commons. Oct 12 2015, 9:16 AM

Thanks for taking the time to report this!

Are you logged in or not?

Wondering if this is a duplicate of / related to T114160.

Are you logged in or not?

It's happening right now, while I'm logged in. Neither action=purge nor debug=1 fixes the problem.

Wondering if this is a duplicate of / related to T114160.

I don't think so. The other bug only appears for logged-out users (probably related to the Varnish cache). This one happens for logged-in users, so it is probably related to the database, the Redis object cache or memcached.

I think it may be connected to a "Cat-a-lot" error. For a few weeks it has been moving files without the last one when there are more than ~50 files. It works quickly, then stops one file before the end, and after the move it reports that one file was not moved. Sometimes that file has been moved, sometimes not, but after the first cache refresh the last file is still shown as not moved. A second cache refresh is needed to show the true state.

MariaDB [commonswiki_p]> select cl_from, page_title from categorylinks left join page on page_id = cl_from where cl_to = 'Album';
+----------+------------+
| cl_from  | page_title |
+----------+------------+
| 44069027 | NULL       |
| 44207153 | NULL       |
+----------+------------+
2 rows in set (0.00 sec)

Looks to be the same issue as T115586: page deletions were not removing the page from the categories it was in. There should be no new instances of this bug, since it was fixed on Monday, but the existing instances from before Monday are still there.

This referential integrity issue causes the category counts to include pages that were deleted. When you actually visit the category page, those links aren't shown, because the page doesn't exist and the listing query uses an INNER JOIN.
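To make the mismatch concrete, here is a minimal sketch using Python's built-in sqlite3 and a toy schema. Only the table and column names (page, categorylinks, page_id, page_title, cl_from, cl_to) come from the query above. The sample rows, the 'Example' category and the helper functions are made up for illustration, and this is not how MediaWiki itself computes its counts.

```python
import sqlite3

# Toy in-memory database; only the table/column names mirror the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
    CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
    -- One live page that is really in the category.
    INSERT INTO page VALUES (1, 'Live_file.jpg');
    INSERT INTO categorylinks VALUES (1, 'Example');
    -- Two links left behind by deleted pages (no matching page row).
    INSERT INTO categorylinks VALUES (2, 'Example');
    INSERT INTO categorylinks VALUES (3, 'Example');
""")

def raw_count(conn, category):
    """Count every categorylinks row, including orphaned ones."""
    row = conn.execute(
        "SELECT COUNT(*) FROM categorylinks WHERE cl_to = ?", (category,)
    ).fetchone()
    return row[0]

def visible_count(conn, category):
    """Count only links whose page still exists (INNER JOIN with page)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM categorylinks "
        "INNER JOIN page ON page_id = cl_from WHERE cl_to = ?",
        (category,),
    ).fetchone()
    return row[0]

def orphaned_links(conn, category):
    """Return cl_from values with no page row (the NULLs in the LEFT JOIN)."""
    rows = conn.execute(
        "SELECT cl_from FROM categorylinks "
        "LEFT JOIN page ON page_id = cl_from "
        "WHERE cl_to = ? AND page_id IS NULL ORDER BY cl_from",
        (category,),
    ).fetchall()
    return [r[0] for r in rows]
```

In this toy data, the raw count and the joined count for 'Example' disagree by exactly the number of orphaned rows, which is the same pattern as the NULL page_title rows for 'Album' above.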

commonswiki is s4, so this should automatically be fixed on November 4 when refreshLinks.php --dfn-only is scheduled to run. If this is causing big problems, someone can get a shell user to run refreshLinks.php --dfn-only sooner.
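For anyone wondering what the cleanup amounts to, the effect can be sketched on a toy sqlite3 database: drop every categorylinks row whose cl_from no longer matches a page. This is only an illustration of the intended end result, not the actual refreshLinks.php code. The make_demo_db and delete_orphaned_links helpers and the sample rows are hypothetical.

```python
import sqlite3

def make_demo_db():
    """Build a toy database with one live link and two orphaned links."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
        CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
        INSERT INTO page VALUES (1, 'Live_file.jpg');
        INSERT INTO categorylinks VALUES (1, 'Example');
        INSERT INTO categorylinks VALUES (2, 'Example');
        INSERT INTO categorylinks VALUES (3, 'Example');
    """)
    return conn

def delete_orphaned_links(conn):
    """Delete links whose source page no longer exists; return rows removed."""
    cur = conn.execute(
        "DELETE FROM categorylinks "
        "WHERE NOT EXISTS (SELECT 1 FROM page WHERE page_id = cl_from)"
    )
    conn.commit()
    return cur.rowcount

def link_count(conn, category):
    """Count categorylinks rows pointing at the given category."""
    row = conn.execute(
        "SELECT COUNT(*) FROM categorylinks WHERE cl_to = ?", (category,)
    ).fetchone()
    return row[0]
```

After such a cleanup the raw link count matches what the category page actually lists, so the stored count has nothing stale left to include.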

As an aside, this issue is probably causing a spike in the number of queries from Category::refreshCounts, since counts get refreshed on category view if the category has fewer than 200 members, and here they're always wrong: one count is based on the inner join with page, and the other isn't.