
Magic words are considered to be (non-existing) templates
Closed, ResolvedPublic

Description

Author: romaine.wiki

Description:
At the bottom of a page in edit mode, normally only the templates actually used are shown. Today I noticed that (some) magic words are treated as (non-existing) templates in that list of used templates. The linked page shows these as red links:

Sjabloon:CONTENTLANGUAGE
Sjabloon:LOCALYEAR
Sjabloon:PAGENAME
Sjabloon:PAGENAMEE


Version: unspecified
Severity: critical
URL: http://nl.wikipedia.org/w/index.php?title=Casorate_Sempione&action=edit#footer

Details

Reference
bz31576

Event Timeline

bzimport raised the priority of this task to High. Nov 21 2014, 11:49 PM
bzimport set Reference to bz31576.
bzimport added a subscriber: Unknown Object (MLST).

romaine.wiki wrote:

(As far as I have seen, this affects nl-wiki only.)

romaine.wiki wrote:

Also, many parser functions are shown on the special page for non-existing templates that are used on pages.

See:
http://nl.wikipedia.org/w/index.php?title=Speciaal:GevraagdeSjablonen&limit=600&offset=0

Facts:

  • Re-saving a page from the edit page 'cleans' up the problem.
  • These did not appear before 1.18

Suspicion:

  • Something background/job-queue related doesn't know about magic words where it should, and is creating templatelinks entries (it's not just the cache of the special page: the edit page lists them at the bottom as well, so they're in templatelinks)

tagging bugs for Marcus to look at

  • Bug 31963 has been marked as a duplicate of this bug.

a.d.bergi wrote:

A bot on de.wikipedia has the same problem: it lists unused templates... So these problems must also affect the API.
See http://de.wikipedia.org/w/index.php?title=Wikipedia:WikiProjekt_Vorlagen/Arbeitsliste&oldid=95646169#Nicht existierende Vorlageneinbindung

  • Bug 32171 has been marked as a duplicate of this bug.

raising to highest since this is getting reported more.

I've investigated some of the occurrences, and discovered the following:

  • this bug is related to bug 31577: all the wrongly-categorized pages there also had PAGENAME, NAMESPACE, etc. in the template list
  • the current parse results (from the parser cache) actually DON'T list the bad templates, but the templatelinks table does. Purging the page doesn't remove them, but null editing the page does

I've put in a live hack for debugging in r104059 that does the following things:

  • add the server name to the parser cache comment, so we can see which server rendered the cached HTML. This is probably generally useful for debugging and I'll probably put it in MediaWiki as an optional feature later
  • before saving a parse result to the parser cache, check the templates list for PAGENAME, FULLPAGENAME and NAMESPACE. If any of those three are in the templates list, write the details (page name, pcache key, timestamp, server name) to a log file
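
The check described in the second bullet can be sketched roughly as follows. This is a Python illustration, not the PHP live hack from r104059: the names `find_bad_templates` and `log_if_bad`, the logger name, and the exact log-line layout are assumptions made for this example. It takes a templates map keyed by namespace ID and then by title, and reports any of the three suspect names found in the template namespace.

```python
import logging

NS_TEMPLATE = 10  # template namespace ID
# The three names the debugging hack is described as checking for.
SUSPECT_WORDS = {"PAGENAME", "FULLPAGENAME", "NAMESPACE"}

def find_bad_templates(templates):
    """Return the sorted suspect names present in the template namespace."""
    titles = templates.get(NS_TEMPLATE, {})
    return sorted(SUSPECT_WORDS & set(titles))

def log_if_bad(templates, page, cache_key, timestamp, server,
               logger=logging.getLogger("bug31576")):
    """Log one line per bad template; return True if anything was logged."""
    bad = find_bad_templates(templates)
    for name in bad:
        # Hypothetical log layout: page name, cache key, timestamp, server name.
        logger.warning(
            "Bad template %s in page [[%s]] key %s timestamp %s generated by %s",
            name, page, cache_key, timestamp, server,
        )
    return bool(bad)
```

A caller would run `log_if_bad(...)` just before storing a parse result and carry on regardless of the return value, since the point is only to record where the bad entries come from.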

Hopefully this'll allow me to narrow down this issue. What I'm particularly interested in, meanwhile, is NEW occurrences of this bug, because I want to determine whether these bad template/category entries are still being created. If they aren't, we can just clean up the old ones and be done with it, but if they are, we have a real bug to fix.

I have just finished cleaning up all the weird template entries on nlwiki in a semi-automated way. Basically, what I did was:

  1. Check Special:Wantedtemplates for worst offenders
  2. Generate list of page IDs with bogus templatelinks using a toolserver query
  3. Run RefreshLinks::fixArticleLinks() on each of those page IDs using eval.php
  4. Rebuild Special:Wantedtemplates using updateSpecialPages.php
  5. Go to step 1

When cleaning up this mess on other wikis, I should really generate a list of all magic words and use that in step 2, it should be less labor-intensive that way.
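
Step 2 with a full magic-word list could look something like the sketch below. It is written in Python over in-memory rows rather than as the toolserver query that was actually used; the row layout `(page_id, namespace, title)`, the function names, and whatever word list is passed in are assumptions for illustration only.

```python
NS_TEMPLATE = 10  # template namespace ID

def _db_title(name):
    """Approximate database title form: underscores for spaces, first letter upper-cased."""
    name = name.replace(" ", "_")
    return name[:1].upper() + name[1:]

def pages_with_bogus_templatelinks(rows, magic_words):
    """Return sorted page IDs that link to a 'template' named exactly like a magic word.

    rows: iterable of (page_id, namespace, title) tuples, one per templatelinks entry.
    magic_words: iterable of magic-word names (canonical and localized).
    """
    wanted = {_db_title(word) for word in magic_words}
    return sorted({
        page_id
        for page_id, namespace, title in rows
        if namespace == NS_TEMPLATE and title in wanted
    })
```

The resulting page IDs would then be fed to step 3. Exact matching like this only catches variables such as PAGENAME; names that carry an argument after a colon need separate handling.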

Anyway, all bogus templatelinks entries associated with this bug are now gone on nlwiki. I'll keep an eye on this over the holidays and get back to this next week. Next Wednesday, I will either:
A. find out that no new instances of this bug have popped up on nlwiki (which means that whatever was causing them is now fixed), clean up the other wikis using a script (which I'll have to write first), and call it a day
B. find out that new instances did appear, and investigate them, so I can find out what causes them and fix it

I added some more debugging code, ate dinner, and had 5 megabytes of debugging output waiting for me when I got back. It looks like I've proven Merlissimo's theory that this bug is introduced by the job queue:

Bad template NAMESPACE in page [[François Victor Massena]] timestamp 20111130184349 generated by srv267 callers require_once/require_once/RunJobs::execute/RefreshLinksJob2::run/LinksUpdate::doUpdate/LinksUpdate::doIncrementalUpdate/LinksUpdate::getTemplateInsertions

It looks like it might be just a few servers, I'll investigate.

So what I've found so far is that this is definitely coming from the RefreshLinksJob2 jobs. But I have no idea why. Apparently something is different about the context these jobs run in. Maybe it's RequestContext-related? Does anyone have suggestions?

I'm gonna put my logging hacks behind a $wg setting (see SAL for the name) and disable it to reduce noise, then I'm gonna go to bed. I'll pick this up later (in a week or so) if no one else does. In the meantime, it would be nice if someone more familiar with the parser or RequestContext or MagicWord or whatever it is that's messing things up here could offer up suggestions, try to reproduce locally (this is probably hard) or, if they have shell access, investigate it actively.

I've also written a script that can be used to clean up after this bug, and committed it in r104758.


JOB QUEUE bad templates: NAMESPACE when parsing [[Saliniteit]] on nlwiki by srv286 job for [[Sjabloon:Zijbalk thalassisch water]]

Bad template NAMESPACE in page [[Saliniteit]] on nlwiki timestamp 20111130220718 generated by srv286 callers require_once/require_once/RunJobs::execute/RefreshLinksJob2::run/LinksUpdate::doUpdate/LinksUpdate::doIncrementalUpdate/LinksUpdate::getTemplateInsertions

#0 LinksUpdate->getTemplateInsertions(Array ([10] => Array ([Aut] => 1,[Bron] => 1,[Bronnen/noten/referenties] => 1,[En] => 1,[Niet_afdrukken] => 1,[Taalaanduiding] => 1,[Zijbalk_thalassisch_water] => 1))) called at [/usr/local/apache/common-local/php-1.18/includes/LinksUpdate.php:153]
#1 LinksUpdate->doIncrementalUpdate() called at [/usr/local/apache/common-local/php-1.18/includes/LinksUpdate.php:111]
#2 LinksUpdate->doUpdate() called at [/usr/local/apache/common-local/php-1.18/includes/job/RefreshLinksJob.php:136]
#3 RefreshLinksJob2->run() called at [/usr/local/apache/common-local/php-1.18/maintenance/runJobs.php:78]
#4 RunJobs->execute() called at [/usr/local/apache/common-local/php-1.18/maintenance/doMaintenance.php:105]
#5 require_once(/usr/local/apache/common-local/php-1.18/maintenance/doMaintenance.php) called at [/usr/local/apache/common-local/php-1.18/maintenance/runJobs.php:108]
#6 require_once(/usr/local/apache/common-local/php-1.18/maintenance/runJobs.php) called at [/usr/local/apache/common-local/multiversion/MWScript.php:72]

(In reply to comment #14)

I'm gonna put my logging hacks behind a $wg setting (see SAL for the name) and
disable it to reduce noise, then I'm gonna go to bed.

Whoops, I forgot to do this before going to bed last night. I've now added $wgEnableBug31576Debugging and set it to false in CommonSettings.php

Merl added a comment. Dec 1 2011, 1:33 PM

The links are inserted as if {{NAMESPACE}} were a page in the template namespace (10).

Because of the (possibly) related bug 32170, I still think that the l10n files are not loaded.
Perhaps it is easier to find out why the localized image prefix is not removed even though the link is recognized by the parser as an image link and not as a normal page link.

This also affected the e-mail job queue ($wgEnotifUseJobQ = true).

Getting [[:Template:SITENAME]], {{canonicalurl:{{#special:Preferences}}}} and {{canonicalurl:{{#special:EditWatchlist}}}} inside 'Enotif body'

Maybe add SITENAME as a word to detect.

romaine.wiki wrote:

Flooded by parserfunctions/magic words which are considered to be non-existing templates: http://nl.wikipedia.org/wiki/Speciaal:GevraagdeSjablonen

It looks like a LocalisationCache issue to me, not a parser issue.

I've deployed r105964, which should at least stop any more bad links being added to the tracking tables, at the expense of a few refreshlinks jobs being skipped.

I'm not sure what the root cause is, but the English wikis don't seem to be affected, so maybe it has something to do with fallback merging.

For whatever reason, this has stopped. Not closing, but bumping down the priority. Please reraise the priority if it starts happening again. We'll periodically check in on this.

Merl added a comment. Dec 19 2011, 6:39 PM

Raising priority because this is happening again on dewiki.
http://de.wikipedia.org/w/index.php?title=Spezial:Linkliste/Vorlage:PAGENAME&namespace=0&limit=500 shows about 100 entries. Same with other variables.

The logs show it was srv159 running jobs with an old copy of MediaWiki. It still had a job runner on it despite it being marked decommissioned. Roan took it out of the mediawiki-installation group a couple of weeks ago.

(In reply to comment #23)

The logs show it was srv159 running jobs with an old copy of MediaWiki. It
still had a job runner on it despite it being marked decommissioned. Roan took
it out of the mediawiki-installation group a couple of weeks ago.

That's strange. When I was first investigating this issue, I saw other servers doing this too.

(In reply to comment #24)

(In reply to comment #23)

The logs show it was srv159 running jobs with an old copy of MediaWiki. It
still had a job runner on it despite it being marked decommissioned. Roan took
it out of the mediawiki-installation group a couple of weeks ago.

That's strange. When I was first investigating this issue, I saw other servers
doing this too.

See http://rt.wikimedia.org/Ticket/Display.html?id=2173

The urgent site issue is fixed, so I'm marking this bug resolved. I have filed bug 33409 for what is apparently the root cause.

neilk wrote:

There are still some live hacks for debugging this, is that intentional?

neilk wrote:

neilk@fenari:/home/w/c/p$ ls -lat includes/parser/ParserCache.php
-rw-rw-r-- 1 catrope wikidev 7694 2011-12-01 13:05 includes/parser/ParserCache.php

neilk@fenari:/home/w/c/p$ ls -lat includes/LinksUpdate.php
-rw-rw-r-- 1 neilk wikidev 25014 2011-12-01 13:09 includes/LinksUpdate.php

neilk@fenari:/home/w/c/p$ ls -lat includes/job/RefreshLinksJob.php
-rw-rw-r-- 1 reedy wikidev 4157 2011-12-01 13:06 includes/job/RefreshLinksJob.php

neilk wrote:

er, never mind... some of these files have been touched since then anyway by other pushes.

a.d.bergi wrote:

The problem occurs again at de.wikipedia. I'm not sure whether this is a 1.19 bug.
Example: http://de.wikipedia.org/wiki/Spezial:Linkliste?target=Vorlage:PAGENAME, but also others (lc:, formatnum:, int:, safesubst:) are listed.

Possible source: every ppframe is linked, whether it is transcluded or not. For example, [[de:template:Info ISO-3166-2:??]] is said to be included in [[de:Dublin]], but it would not be in the parse tree if everything were expanded correctly.

a.d.bergi wrote:

(In reply to comment #30)

The problem occurs again at de.wikipedia. I'm not sure whether this is a 1.19
bug.

As this bug's root cause was identified (?) as bug 33409, should I file a new bug?

(In reply to comment #31)

(In reply to comment #30)

The problem occurs again at de.wikipedia. I'm not sure whether this is a 1.19
bug.

As this bug's root cause was identified (?) as bug 33409, should I
file a new bug?

Your test case may be due to bug 33409 also.

Raising priority since not fixing this could mean people actually create pages like https://de.wikipedia.org/wiki/Vorlage:PAGENAME

a.d.bergi wrote:

why did that belong to the interwiki component?

a.d.bergi wrote:

We are also getting the problem with normal links. [[de:Template:Höhe]] has a maintenance link to [[de:Bitte kein m oder Meter für HÖHE-BEZUG]], which would be output for invalid parameter values of the templates.
Although none of the inclusions produces a link, the backlinks table is filled with it ([[de:Spezial:Linkliste/Bitte kein m oder Meter für HÖHE-BEZUG]])!

P.Copp added a comment. Mar 3 2012, 2:36 PM

The live workaround to stop this (r105964) has been overwritten by the 1.19 deployment. It should be reinstated until bug 33409 is resolved and verified to be the only cause of this problem.

a.d.bergi wrote:

(In reply to comment #36)

bug 33409 verified to be the only cause of this problem.

I can't really believe that. Why should a disrupted parser cache cause the parserOutput to include links, templates, categories (bug 31577), which shouldn't be there if /orderly/ parsed?

P.Copp added a comment. Mar 3 2012, 4:37 PM

(In reply to comment #37)

I can't really believe that. Why should a disrupted parser cache cause the
parserOutput to include links, templates, categories (bug 31577), which
shouldn't be there if /orderly/ parsed?

Bug 33409 is not about the parser cache but about the LocalisationCache, which, when broken, can cause the parser to ignore all magic words.

Which doesn't mean there can't be other bugs causing it, but it's very likely that the problem is related to the LocalisationCache in some way or another.
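
To make the suspected mechanism concrete, here is a toy model in Python. It is not MediaWiki's parser, and the function names are invented for this illustration. It only shows the assumed fallback: a `{{NAME}}` call that is not found in the loaded magic-word table is treated as a template transclusion and so records a link into template namespace 10.

```python
NS_TEMPLATE = 10  # template namespace ID

def classify_brace_call(name, magic_words):
    """Toy model: decide how {{name}} is handled given the loaded magic-word table."""
    if name in magic_words:
        return ("magic", name)
    # Unknown names fall through to template transclusion, which records a link
    # to (namespace 10, name) even if no such template exists.
    return ("template", (NS_TEMPLATE, name))

def template_links(names, magic_words):
    """Collect the template links a page would record for the given {{...}} calls."""
    classified = (classify_brace_call(name, magic_words) for name in names)
    return sorted(target for kind, target in classified if kind == "template")
```

With an intact table, `template_links(["PAGENAME", "Aut"], {"PAGENAME"})` yields only the real template; with an empty table (standing in for a broken localisation cache), PAGENAME shows up as a template link too.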

(In reply to comment #36)

The live workaround to stop this (r105964) has been overwritten by the 1.19
deployment. It should be reinstated until bug 33409 is resolved and verified to
be the only cause of this problem.

This has been reinstated now.

Beta16 added a comment. Mar 9 2012, 9:17 AM

On it.wiki, [[:w:it:Special:Wantedtemplates]] is full of magic words.

If a user performs a dummy edit on a page, the problem disappears for that page.

(In reply to comment #40)

In it.wiki [[:w:it:Special:Wantedtemplates]] is full of Magic Words.

If a user performs a dummy edit on a page, the problem disappears for that page.

Yeah, we haven't gotten around to rerunning the cleanup script yet. I've asked Sam to do it this week.

(In reply to comment #41)

(In reply to comment #40)

In it.wiki [[:w:it:Special:Wantedtemplates]] is full of Magic Words.

If a user performs a dummy edit on a page, the problem disappears for that page.

Yeah, we haven't gotten around to rerunning the cleanup script yet. I've asked
Sam to do it this week.

Great, thanks!

Reedy added a comment. Mar 17 2012, 5:10 PM

Every wiki other than enwiki is now done; enwiki is still underway.

Reedy added a comment. Mar 18 2012, 3:59 PM

That should be them all done now...

This is not resolved. There are still Magic words like "lc:", "uc:" or "fullurl:".

(In addition to these, most of the other remaining records are related to bug 16112.)

(In reply to comment #45)

This is not resolved. There are still Magic words like "lc:", "uc:" or
"fullurl:".

This is probably because Sam turned off the prefix matching in the fixup script. It would need to be rerun with that code in place and with the infinite loop issue fixed in some other way.

sumanah wrote:

Reassigning to Sam - Sam, Tim outlined how to fix this in bug 33409 . Thanks. If you have any trouble, chat with Tim?

(In reply to comment #47)

(In reply to comment #45)

This is not resolved. There are still Magic words like "lc:", "uc:" or
"fullurl:".

This is probably because Sam turned off the prefix matching in the fixup
script. It would need to be rerun with that code in place and with the infinite
loop issue fixed in some other way.

Originally, yes, but when I finally ran it, I hadn't. I fixed the infinite loop and then added the prefix matching back in again. See r113964.
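
For illustration, prefix matching of this kind might look like the following Python sketch. It is not the code from r113964; the function name and any prefix list passed to it are assumptions. A title counts as a parser-function call when the part before the first colon is a known function name, compared case-insensitively.

```python
def is_parser_function_title(title, function_names):
    """Return True if a template title looks like a parser-function call.

    title: template title without the namespace prefix, e.g. 'Ucfirst:sport'.
    function_names: iterable of parser-function names without the colon, e.g. 'lc'.
    """
    prefix, sep, _rest = title.partition(":")
    if not sep:  # no colon at all, so it cannot be a 'name:argument' call
        return False
    return prefix.lower() in {name.lower() for name in function_names}
```

Only the text before the first colon is compared, so a title such as `Fullurl:Template:X` is matched on `Fullurl`, while a title with an unrelated prefix is left alone.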

(In reply to comment #48)

Reassigning to Sam - Sam, Tim outlined how to fix this in bug 33409 . Thanks.
If you have any trouble, chat with Tim?

I was asked to run the cleanup script. Not to fix the underlying cause...

Sam is going to take a run at addressing the problems Beta16 listed in comment 45. Bug 33409 looks substantially more complicated; we may still need to foist that on Sam, but that's TBD.

Reedy added a comment. Apr 3 2012, 10:08 PM

(In reply to comment #45)

This is not resolved. There are still Magic words like "lc:", "uc:" or
"fullurl:".

Where?

Beta16 added a comment. Apr 4 2012, 7:27 AM

(In reply to comment #51)

(In reply to comment #45)

This is not resolved. There are still Magic words like "lc:", "uc:" or
"fullurl:".

Where?

As in comment #40 in [[:w:it:Special:Wantedtemplates]] you can see:

  • Template:Ucfirst:calcio
  • Template:Fullurl:Template:Regionale storico
  • Template:Ucfirst:sport
  • Template:Ucfirst:tennis
  • Template:Ucfirst:Italia
  • Template:Fullurl:Template:Promozione storico

and most others

Recent (= starting today about 1:00 UTC) examples from de.wikipedia:
[[de:Caspar Bauhin]] uses Vorlage:FULLPAGENAME, Vorlage:PAGENAME, Vorlage:SEITENNAME, Vorlage:SORTIERUNG:Bauhin, Caspar, Vorlage:Urlencode:, Vorlage:Urlencode:1992, Vorlage:Urlencode:2000

There are more examples where SORTIERUNG/DEFAULTSORT (both localized and canonical versions occur) is treated as a missing template which means that the article is not sorted correctly in categories.

All articles reported to be affected have one thing in common: They all use the widely used [[de:Template:Personendaten]], which was changed yesterday.

Same as comment 36, but this time the workaround was overwritten by the 1.20wmf1 deployment.
(In reply to comment #36)

The live workaround to stop this (r105964) has been overwritten by the 1.19
deployment. It should be reinstated until bug 33409 is resolved and verified to
be the only cause of this problem.

Please reinstate the workaround. Thanks.

sumanah wrote:

Per conversation with RobLa, assigning to Gabriel. Gabriel, can you take care of this this week?

wicke wrote:

(In reply to comment #55)

Per conversation with RobLa, assigning to Gabriel. Gabriel, can you take care
of this this week?

From the comments it appears as if the workaround in r105964 was effective at preventing new false link table entries. This makes it quite likely that bug 33409 is indeed the root cause.

https://bugs.php.net/bug.php?id=60621 has seen no progress since Tim opened it in December. As Tim details, fixing this will likely require non-trivial changes to the PHP CDB bindings.

Preventing /tmp from filling up should very likely avoid this problem (and possibly others), but is something more suited to ops.

wicke wrote:

Roan re-applied the live workaround in commit 721b54cdc689c0fb00bd3764c8681e7eca8d781a.

This was also carried forward to 1.20wmf3 to prevent this bug from happening.

In 73bbe9f3a2c53a10523b9e487b33ab9f82a07344 it (finally) landed in master.

https://gerrit.wikimedia.org/r/#/c/9124/

I'm not sure if there is anything else left to fix for this bug.

Reedy added a comment. Jul 31 2012, 8:27 AM
  • Bug 38880 has been marked as a duplicate of this bug.

marking fixed per comment 58.