
Misinterpreting multiple templates within ref tag
Closed, Resolved · Public

Description

If a ref tag contains a cite template followed by a dead link template, the dead link template is treated as part of the cite.

For example, in the English Wikipedia article "Kingston Fossil Plant coal fly ash slurry spill", reference number 9 (name "commondreams", title "Flood of sludge breaks TVA dike") is:

<ref name=commondreams>{{cite news|url=http://www.tennessean.com/article/20081223/GREEN02/812230370/1001/RSS6001|title=Flood of sludge breaks TVA dike|publisher=[[The Tennessean]]|date=December 23, 2008|accessdate=2009-01-09|authors=Anne Paine and Sledge, Colby}} {{Dead link|date=September 2010|bot=H3llBot}}</ref>

The authors (Last Name) field is interpreted as "Anne Paine and Sledge, Colby}} {{Dead link", and the dead link date is interpreted as the "Source date" of the cite template.
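One plausible way to end up with exactly these values is a greedy match from the first "{{" to the last "}}", followed by a split on "|". The Python sketch below reproduces the symptom on a shortened copy of the reference. It is a guess at the failure mode, not the gadget's actual code; the pattern and variable names are assumptions made for illustration.

```python
import re

# Shortened copy of the reference body from the report.
ref_body = (
    "{{cite news|title=Flood of sludge breaks TVA dike"
    "|date=December 23, 2008"
    "|authors=Anne Paine and Sledge, Colby}} "
    "{{Dead link|date=September 2010|bot=H3llBot}}"
)

# Greedy ".*" runs to the LAST "}}", so both templates are swallowed as one.
inner = re.search(r"\{\{(.*)\}\}", ref_body).group(1)

# Naive parameter parsing: split on every pipe, then on the first "=".
params = dict(p.split("=", 1) for p in inner.split("|")[1:])

print(params["authors"])  # value runs on into the Dead link template
print(params["date"])     # the cite date is overwritten by the Dead link date
```

Because the second "date=" overwrites the first, the cite's own date is lost as well, which matches the "Source date" behaviour described above.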

My guess is that you should take the contents of the ref tag as a list of templates, one of which will be the cite with the others being modifiers of the reference.
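A minimal sketch of that idea, in Python for brevity: scan the ref contents once, track "{{" / "}}" nesting depth, and collect each outermost template. The function name is hypothetical, and the sketch deliberately ignores nowiki sections, HTML comments and triple-brace parameters.

```python
def split_top_level_templates(text):
    """Return each outermost {{...}} span in text, in order of appearance."""
    templates = []
    depth = 0
    start = None
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "{{":
            if depth == 0:
                start = i          # an outermost template opens here
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
            if depth == 0:         # the outermost template just closed
                templates.append(text[start:i])
        else:
            i += 1
    return templates


ref_body = (
    "{{cite news|title=Flood of sludge breaks TVA dike"
    "|authors=Anne Paine and Sledge, Colby}} "
    "{{Dead link|date=September 2010|bot=H3llBot}}"
)
parts = split_top_level_templates(ref_body)
```

Here `parts` holds two strings, the cite template and the dead link template, so each can be handled separately.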

Event Timeline

Sophivorus triaged this task as Medium priority. (Oct 22 2016, 1:33 AM)
Iniquity raised the priority of this task from Medium to High. (Oct 24 2016, 10:22 AM)

Yes, I have the same problem with the DOI template; check this (sources 37, 39, 64, etc.).

Change 318126 had a related patch set uploaded (by Sophivorus):
Support for nested templates

https://gerrit.wikimedia.org/r/318126

Change 318126 merged by Sophivorus:
Support for nested templates

https://gerrit.wikimedia.org/r/318126

Oh man, I misinterpreted the task. I thought the problem was for cases like:

{{Cite book
|param1 = value2
|param2 = [[Some|link]]
|param3 = {{Subtemplate |foo |bar=baz}}
}}

Anyway, these cases are fixed now. I'll tackle the reported issue next time...
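For readers following along, the nested case can be handled by splitting on "|" only at nesting depth zero. The Python sketch below is an illustration under that assumption, not the merged patch; `split_params` is a hypothetical name, and a single depth counter is shared by "{{ }}" and "[[ ]]" as a simplification.

```python
def split_params(template):
    """Split the inside of one {{...}} template on its top-level pipes."""
    inner = template[2:-2]              # drop the outer "{{" and "}}"
    parts, current, depth, i = [], [], 0, 0
    while i < len(inner):
        pair = inner[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1                  # entering a subtemplate or link
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]") and depth > 0:
            depth -= 1                  # leaving a subtemplate or link
            current.append(pair)
            i += 2
        elif inner[i] == "|" and depth == 0:
            parts.append("".join(current).strip())
            current = []
            i += 1
        else:
            current.append(inner[i])
            i += 1
    parts.append("".join(current).strip())
    return parts


example = (
    "{{Cite book\n"
    "|param1 = value2\n"
    "|param2 = [[Some|link]]\n"
    "|param3 = {{Subtemplate |foo |bar=baz}}\n"
    "}}"
)
fields = split_params(example)
```

The pipes inside `[[Some|link]]` and `{{Subtemplate |foo |bar=baz}}` stay with their parameter values instead of starting new parameters.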

@Sophivorus Thanks! We are waiting for it! :)

I looked into this issue and it seems that the solution requires moving away from regexes, see http://stackoverflow.com/questions/546433/regular-expression-to-match-outer-brackets

@Sophivorus I think the old ProveIt doesn't have this problem, and it works with regex. Could we borrow something from it?

Well, according to this link for regex recursion referred to in the stackoverflow discussion, regex might still be the answer, if the Wiki software implements it.

I am confident that you will figure out a reliable algorithm to determine the true cite template contents if there is one.

If you have that, then you could treat everything before the cite template as "prefix" and everything following it as "suffix". Handle the cite tag, then when writing it back, write the prefix, the modified cite tag and the suffix.
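The prefix/suffix round trip could look roughly like this in Python. Everything here is a sketch under stated assumptions: the function name is hypothetical, and treating any top-level template whose name starts with "cite" as the cite template is a convention assumed for the example, not something the task specifies.

```python
def split_around_cite(ref_body):
    """Return (prefix, cite, suffix) around the first top-level cite template.

    Returns None when no top-level template name starts with "cite".
    """
    depth, start, i = 0, None, 0
    while i < len(ref_body):
        pair = ref_body[i:i + 2]
        if pair == "{{":
            if depth == 0:
                start = i
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
            if depth == 0:
                # Template name is the text before the first pipe.
                name = ref_body[start + 2:i - 2].split("|", 1)[0]
                if name.strip().lower().startswith("cite"):
                    return ref_body[:start], ref_body[start:i], ref_body[i:]
        else:
            i += 1
    return None


body = (
    "{{cite news|title=Flood of sludge breaks TVA dike}} "
    "{{Dead link|date=September 2010|bot=H3llBot}}"
)
prefix, cite, suffix = split_around_cite(body)

# Edit only the cite template, then write prefix and suffix back untouched.
edited = cite[:-2] + "|accessdate=2009-01-09}}"
rebuilt = prefix + edited + suffix
```

The dead link template ends up in `suffix` and is written back verbatim, so the tool does not need to know anything about it.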

It would be nice if you could specifically handle the dead link template, but I realize that it is probably specific to the English Wikipedia, and so not in keeping with your goal of a more universal tool.

Many thanks,
John

There's a problem in general, not with the gadget itself: it appears to also affect {{cite}} templates and/or reflist templates.

If I may add, this was reported by an enwiki user in the enwiki help IRC channel (no logs, because public logging is disallowed).

This comment was removed by Iniquity.

Change 318975 had a related patch set uploaded (by Sophivorus):
Support for multiple templates

https://gerrit.wikimedia.org/r/318975

@Arg342 The regex example you linked is for PHP. JavaScript regexes don't support recursion (the "(?R)" construct), I'm afraid.

@Iniquity I checked the old ProveIt code and did some testing, and I think the reason it works there is that it doesn't support nested templates.

@Zppix What problem?

Sophivorus claimed this task.

@Sophivorus Wow, awesome news! Thanks for the fix :) About the old ProveIt: do you mean that it doesn't work with nested templates? Does it drop them from sources?

Change 318975 merged by Sophivorus:
Support for multiple templates

https://gerrit.wikimedia.org/r/318975

@Iniquity Yes, I mean that old ProveIt interprets {{Cite book |name={{Joe}}}} as {{Cite book |name={{Joe}} (closes the template on the subtemplate).
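This early close is what a non-greedy pattern produces, as the short Python demonstration below shows. The pattern is an illustrative assumption and is not taken from the old ProveIt source.

```python
import re

wikitext = "{{Cite book |name={{Joe}}}}"

# Non-greedy ".*?" stops at the FIRST "}}", which belongs to the subtemplate.
match = re.search(r"\{\{.*?\}\}", wikitext)
truncated = match.group(0)

print(truncated)  # {{Cite book |name={{Joe}}
```

The match ends two characters short of the real end of the template, so the outer closing braces are left behind.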

@Sophivorus Oh, right, thanks for the answer :)