Wikimedia databases contain categorylinks with type "page" for media files. Run updateCollation.php to fix
Closed, Resolved | Public

Description

Commons files with type "page" in categorylinks table

The Commons database contains categorylinks for files with type "page". But files should have type "file".

I think this error may exist in other databases too. Bug 29787 was about a category in English Wikipedia, but that was fixed by null edits. Now null edits do not seem to fix the errors at Commons.

Bug 29787 has an example of problems caused by this bug.
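To make the symptom concrete, here is a minimal sketch of a check that lists File: pages whose categorylinks rows carry the wrong type. It runs against an in-memory SQLite database with a heavily simplified schema; the column names follow MediaWiki's page and categorylinks tables, but the helper names (find_mistyped_file_links, make_demo_db) and the sample rows are made up for illustration and are not taken from this report.

```python
import sqlite3

NS_FILE = 6  # MediaWiki's File: namespace number


def find_mistyped_file_links(conn):
    """Return (page_title, cl_to, cl_type) for File: pages whose cl_type is not 'file'."""
    return conn.execute(
        """
        SELECT page_title, cl_to, cl_type
        FROM categorylinks
        JOIN page ON page_id = cl_from
        WHERE page_namespace = ? AND cl_type <> 'file'
        ORDER BY page_title, cl_to
        """,
        (NS_FILE,),
    ).fetchall()


def make_demo_db():
    """Build a tiny stand-in database (hypothetical data, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE categorylinks (
            cl_from INTEGER,
            cl_to TEXT,
            cl_type TEXT,
            cl_collation TEXT
        );
        INSERT INTO page VALUES
            (1, 6, 'Example.jpg'),
            (2, 6, 'Good.png'),
            (3, 0, 'Article');
        INSERT INTO categorylinks VALUES
            (1, 'Photos', 'page', ''),
            (2, 'Photos', 'file', 'uppercase'),
            (3, 'Topics', 'page', 'uppercase');
        """
    )
    return conn


if __name__ == "__main__":
    for row in find_mistyped_file_links(make_demo_db()):
        print(row)
```

On a real wiki the same join would presumably be run directly in MySQL against a replica; the Python wrapper is only there so the sketch can be executed as-is.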


Version: unspecified
Severity: normal
URL: http://toolserver.org/~endumen/fileswithpagetype.txt

attachment ignored as obsolete

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz35609.
Lejonel created this task. Via Legacy, Mar 30 2012, 12:40 PM
MarkAHershberger added a comment. Via Conduit, Apr 2 2012, 4:57 PM

Moving attachment contents to url field.

MarkAHershberger added a comment. Via Conduit, Apr 2 2012, 4:58 PM

The content of attachment 10352 has been deleted by

Mark A. Hershberger <mah@everybody.org>

who provided the following reason:

should be in url field

The token used to delete this attachment was generated at 2012-04-02 16:58:03 UTC.

Bawolff added a comment. Via Conduit, Jun 16 2012, 7:30 PM

All of these appear to have the cl_collation field set to ''. Thus running updateCollation.php should fix the issue.

My query on the toolserver DB found 48331 affected rows on Commons, and they all seem to date from July 7, 2011. I don't know what happened at that time; the pages I looked at weren't edited then. In the server admin log, the only things I saw happening to Commons at that time were CleanupTitles.php and NamespaceDupesWT.php running. I'm not sure how that could cause this.

Anyhow, running updateCollation.php should fix the issue, as it will see each of these as an old-style row and update it appropriately.
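As a rough model of the reasoning above (not the actual maintenance script), the sketch below treats any row whose cl_collation differs from the configured collation as stale, then rewrites its cl_type from the page's namespace and stamps the current collation. The function names and the demo data are hypothetical, the schema is simplified, and the sort key recomputation that a real collation update also performs is left out.

```python
import sqlite3

NS_FILE, NS_CATEGORY = 6, 14  # MediaWiki namespace numbers


def type_for_namespace(ns):
    """Map a namespace number to the cl_type a category link should have."""
    if ns == NS_CATEGORY:
        return "subcat"
    if ns == NS_FILE:
        return "file"
    return "page"


def fix_stale_rows(conn, collation="uppercase"):
    """Rewrite rows whose cl_collation differs from `collation`; return the count."""
    stale = conn.execute(
        "SELECT cl_from, cl_to, page_namespace FROM categorylinks "
        "JOIN page ON page_id = cl_from WHERE cl_collation <> ?",
        (collation,),
    ).fetchall()
    for cl_from, cl_to, ns in stale:
        conn.execute(
            "UPDATE categorylinks SET cl_type = ?, cl_collation = ? "
            "WHERE cl_from = ? AND cl_to = ?",
            (type_for_namespace(ns), collation, cl_from, cl_to),
        )
    return len(stale)


def make_demo_db():
    """Build a tiny stand-in database (hypothetical data, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT
        );
        CREATE TABLE categorylinks (
            cl_from INTEGER,
            cl_to TEXT,
            cl_type TEXT,
            cl_collation TEXT
        );
        INSERT INTO page VALUES (1, 6, 'Example.jpg'), (2, 0, 'Article');
        INSERT INTO categorylinks VALUES
            (1, 'Photos', 'page', ''),
            (2, 'Topics', 'page', 'uppercase');
        """
    )
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    print(fix_stale_rows(db))
    print(db.execute("SELECT cl_from, cl_type, cl_collation FROM categorylinks").fetchall())
```

This also shows why a row with an empty cl_collation gets picked up: it never equals the configured collation name, so it is always selected for rewriting.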

Bawolff added a comment. Via Conduit, Jun 16 2012, 7:34 PM

Note, there are also 15695 such rows on enwiki.

Reedy added a comment. Via Conduit, Jul 6 2012, 4:39 PM

mysql> explain select count(*) from categorylinks where cl_collation = '';
+----+-------------+---------------+------+---------------+--------------+---------+-------+--------+--------------------------+
| id | select_type | table         | type | possible_keys | key          | key_len | ref   | rows   | Extra                    |
+----+-------------+---------------+------+---------------+--------------+---------+-------+--------+--------------------------+
|  1 | SIMPLE      | categorylinks | ref  | cl_collation  | cl_collation | 34      | const | 115866 | Using where; Using index |
+----+-------------+---------------+------+---------------+--------------+---------+-------+--------+--------------------------+
1 row in set (0.00 sec)

Running updateCollation.php against commonswiki currently...

Reedy added a comment. Via Conduit, Jul 6 2012, 4:41 PM

mysql> explain select count(*) from categorylinks where cl_collation != 'uppercase';
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+--------------------------+
| id | select_type | table         | type  | possible_keys | key          | key_len | ref  | rows  | Extra                    |
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | categorylinks | range | cl_collation  | cl_collation | 34      | NULL | 60341 | Using where; Using index |
+----+-------------+---------------+-------+---------------+--------------+---------+------+-------+--------------------------+
1 row in set (0.02 sec)

Reedy added a comment. Via Conduit, Jul 6 2012, 4:54 PM

Enwiki is clean now.

Running it via foreachwiki. It seems most wikis are clean, but not all; I just noticed that cawiki and dewiki weren't, to give a couple of examples.

Reedy added a comment. Via Conduit, Jul 6 2012, 5:00 PM

Done
