
Missing information template links in templatelinks database
Closed, Resolved, Public

Description

For several files, the MediaWiki templatelinks table of Wikimedia Commons does not contain the link to the Information template, although the template is used in the wikitext. The problem remains even if a new revision of the page is saved.

Some examples out of several dozen (maybe more):

https://commons.wikimedia.org/wiki/File:Amsterdam_P1080037.JPG
https://commons.wikimedia.org/wiki/File:GLAMcamp_Amsterdam_P1080092.JPG
https://commons.wikimedia.org/wiki/File:2011_Dakar_Rally_Alessandro_Zanotti_Tucuman.JPG
https://commons.wikimedia.org/wiki/File:Wr%C3%B3blik_Szlachecki,_006.jpg

The only thing that seems to be common among these files is that they were uploaded in December 2011.

Event Timeline

Aschroet raised the priority of this task from to Medium.
Aschroet updated the task description.
Aschroet subscribed.

The number of affected files is in the hundreds, and they were all uploaded around December 2011.

One way to find them would be to look for files uploaded around that time that lack the {{Information}} and {{Infobox template tag}} templates and are also missing from [[Category:Files with no machine-readable author]] and [[Category:Files with no machine-readable source]]. A lot of files found that way do have {{Information}} templates. CatScan link: http://tools.wmflabs.org/catscan3/catscan2.php?language=commons&project=wikimedia&depth=1&negcats=Files+with+no+machine-readable+author%0D%0AFiles+with+no+machine-readable+source&ns%5B6%5D=1&templates_any=License+template+tag&templates_no=Information%0D%0AInfobox+template+tag&before=20120201&after=20111101&only_new=1&doit=1
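To make the filters in that CatScan link easier to read, here is a minimal Python sketch that decodes its query string. The helper name `decode_catscan_filters` is made up for illustration. The sketch only parses the URL shown above and does not contact the tool.

```python
from urllib.parse import urlsplit, parse_qs

# The CatScan link from the comment above, split across lines for readability.
CATSCAN_URL = (
    "http://tools.wmflabs.org/catscan3/catscan2.php?language=commons&project=wikimedia"
    "&depth=1&negcats=Files+with+no+machine-readable+author%0D%0AFiles+with+no+machine-readable+source"
    "&ns%5B6%5D=1&templates_any=License+template+tag"
    "&templates_no=Information%0D%0AInfobox+template+tag"
    "&before=20120201&after=20111101&only_new=1&doit=1"
)


def decode_catscan_filters(url):
    """Return the query parameters; multi-valued fields are split on line breaks."""
    params = parse_qs(urlsplit(url).query)
    return {key: values[0].splitlines() for key, values in params.items()}


filters = decode_catscan_filters(CATSCAN_URL)
for key in ("negcats", "templates_any", "templates_no", "after", "before"):
    print(key, "=", filters[key])
```

Decoded this way, the link asks for pages in namespace 6 (File) that carry {{License template tag}}, carry neither {{Information}} nor {{Infobox template tag}}, are in neither of the two "no machine-readable" categories, and fall between 2011-11-01 and 2012-02-01.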

Aklapper lowered the priority of this task from Medium to Low. Feb 13 2015, 3:21 PM
Aklapper set Security to None.
Umherirrender subscribed.

Where are you looking? At ToolLabs?

Because the API shows the templates:

https://commons.wikimedia.org/w/api.php?titles=File:GLAMcamp_Amsterdam_P1080092.JPG&prop=templates&action=query&tllimit=max

<tl title="Template:CC-Layout" ns="10"/>
<tl title="Template:Cc-by-sa-3.0,2.5,2.0,1.0" ns="10"/>
<tl title="Template:Cc-by-sa-layout" ns="10"/>
<tl title="Template:Description" ns="10"/>
<tl title="Template:Dir" ns="10"/>
<tl title="Template:En" ns="10"/>
<tl title="Template:GFDL" ns="10"/>
<tl title="Template:GNU-Layout" ns="10"/>
<tl title="Template:He" ns="10"/>
<tl title="Template:ISOdate" ns="10"/>
<tl title="Template:Information" ns="10"/>
<tl title="Template:Information/author processing" ns="10"/>
<tl title="Template:Lang" ns="10"/>
<tl title="Template:License migration" ns="10"/>
<tl title="Template:License migration is redundant" ns="10"/>
<tl title="Template:License migration is redundant multiple" ns="10"/>
<tl title="Template:License template tag" ns="10"/>
<tl title="Template:Own" ns="10"/>
<tl title="Template:Parse source" ns="10"/>
<tl title="Template:Self" ns="10"/>
<tl title="Module:Date" ns="828"/>
<tl title="Module:Fallback" ns="828"/>
<tl title="Module:Fallbacklist" ns="828"/>
<tl title="Module:I18n/date" ns="828"/>
<tl title="Module:ISOdate" ns="828"/>
<tl title="Module:TemplatePar" ns="828"/>
<tl title="Module:Yesno" ns="828"/>
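To compare this API output with database rows, the `<tl>` elements can be turned into (namespace, title) pairs in database form, with no namespace prefix and underscores instead of spaces. A rough Python sketch follows. The function name `parse_tl_elements` and the wrapping `<templates>` root element are assumptions for illustration, and the sample uses only three of the elements listed above.

```python
import xml.etree.ElementTree as ET

# Three of the <tl> elements from the API output above.
SAMPLE = """
<tl title="Template:Information" ns="10"/>
<tl title="Template:Information/author processing" ns="10"/>
<tl title="Module:Yesno" ns="828"/>
"""


def parse_tl_elements(fragment):
    """Parse <tl> elements into a set of (namespace, db_title) tuples."""
    # Wrap the fragment in a root element so it is well-formed XML.
    root = ET.fromstring("<templates>" + fragment + "</templates>")
    links = set()
    for tl in root.iter("tl"):
        ns = int(tl.get("ns"))
        title = tl.get("title")
        if ns != 0:
            # Drop the namespace prefix ("Template:", "Module:", ...).
            title = title.split(":", 1)[1]
        # The database stores titles with underscores instead of spaces.
        links.add((ns, title.replace(" ", "_")))
    return links


print(sorted(parse_tl_elements(SAMPLE)))
```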

Seems to be a replication problem for the ToolLabs database:

MariaDB [commonswiki_p]> select tl_namespace, tl_title from templatelinks where tl_from = 17817633;
+--------------+-----------------------------------------+
| tl_namespace | tl_title                                |
+--------------+-----------------------------------------+
|           10 | CC-Layout                               |
|           10 | Cc-by-sa-3.0,2.5,2.0,1.0                |
|           10 | Cc-by-sa-layout                         |
|           10 | Description                             |
|           10 | Dir                                     |
|           10 | En                                      |
|           10 | GFDL                                    |
|           10 | GNU-Layout                              |
|           10 | He                                      |
|           10 | ISOdate                                 |
|           10 | Lang                                    |
|           10 | License_migration                       |
|           10 | License_migration_is_redundant          |
|           10 | License_migration_is_redundant_multiple |
|           10 | License_template_tag                    |
|           10 | Own                                     |
|           10 | Parse_source                            |
|           10 | Self                                    |
|          828 | Date                                    |
|          828 | Fallback                                |
|          828 | Fallbacklist                            |
|          828 | I18n/date                               |
|          828 | ISOdate                                 |
|          828 | TemplatePar                             |
|          828 | Yesno                                   |
+--------------+-----------------------------------------+
25 rows in set (0.00 sec)

Missing on ToolLabs: Template:Information and Template:Information/author processing
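The comparison behind that conclusion is a set difference between the API list and the replica rows. A small Python sketch using the titles from the two listings above (the variable and function names are made up for illustration):

```python
# Template titles (namespace 10) as returned by the API, in display form.
API_TEMPLATES = [
    "CC-Layout", "Cc-by-sa-3.0,2.5,2.0,1.0", "Cc-by-sa-layout", "Description",
    "Dir", "En", "GFDL", "GNU-Layout", "He", "ISOdate", "Information",
    "Information/author processing", "Lang", "License migration",
    "License migration is redundant", "License migration is redundant multiple",
    "License template tag", "Own", "Parse source", "Self",
]

# tl_title values (namespace 10) returned by the replica query.
REPLICA_TEMPLATES = [
    "CC-Layout", "Cc-by-sa-3.0,2.5,2.0,1.0", "Cc-by-sa-layout", "Description",
    "Dir", "En", "GFDL", "GNU-Layout", "He", "ISOdate", "Lang",
    "License_migration", "License_migration_is_redundant",
    "License_migration_is_redundant_multiple", "License_template_tag",
    "Own", "Parse_source", "Self",
]

# Module titles (namespace 828) are identical in both listings.
MODULES = ["Date", "Fallback", "Fallbacklist", "I18n/date", "ISOdate",
           "TemplatePar", "Yesno"]

# Normalize API titles to database form (underscores instead of spaces).
api_links = {(10, t.replace(" ", "_")) for t in API_TEMPLATES}
api_links |= {(828, m) for m in MODULES}
replica_links = {(10, t) for t in REPLICA_TEMPLATES}
replica_links |= {(828, m) for m in MODULES}


def missing_from_replica(api, replica):
    """Return links the API reports that the replica lacks, sorted."""
    return sorted(api - replica)


print(missing_from_replica(api_links, replica_links))
# prints [(10, 'Information'), (10, 'Information/author_processing')]
```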

I am looking at inconsistent CatScan results, and Aschroet is looking at the templatelinks table in the database.

Okay, but Aschroet and CatScan are both looking at the replicated database, not the live production database, and there seems to be a problem with the rows for the Information template. This needs a ToolLabs database admin to look at it.

scfc added subscribers: scfc, Springle.

@Springle, could you take a look at this, please?

Percona Toolkit pt-table-sync is running for s3 (T89689) and s4 here, logging discrepancies. Step one is to fix the data, and step two is to figure out how it happened.

This week I found at least one example of a TokuDB table index apparently being out of sync with the data, i.e., the data existed but wasn't returned unless the index was ignored or rebuilt. There was a bug like this previously, but it was fixed. So, how did it reappear...

@Springle, if there is anything that a normal Labs user like me could do to solve or further analyze this issue, let me know. Otherwise I can only hope that someone finds a solution.

The ticket was moved to "In Progress" more than a month ago. Has there been any progress on the issue?

@Jarekt, I checked this issue again and did not see these inconsistencies anymore. Could you please double-check? If it has really disappeared, we could close the ticket. In case you need to re-run your bot against these files, please do so. Thanks.

Steinsplitter subscribed.

No replag now. Seems resolved.