
Linter false positive: ref name containing pipe detected as bogus image option
Open, Medium | Public | BUG REPORT

Description

Steps to Reproduce:
Inside a File: caption, create a ref tag with a name containing pipes, e.g. <ref name="foo|bar"> or <ref name=foo|bar>

See test cases:
https://en.wikipedia.org/w/index.php?title=User:Jonesey95/sandbox&oldid=1016691750
https://en.wikipedia.org/w/index.php?title=User:Jonesey95/sandbox&oldid=1016692017

Actual Results:
Linter shows a bogus image options error.

Expected Results:
Linter should not show a bogus image options error. It should ignore pipes and other text inside of ref names.

Event Timeline

Arlolra triaged this task as Medium priority.
Arlolra added a project: Parsoid.
Arlolra added a subscriber: Arlolra.

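This is a problem with Parsoid's tokenizer. The caption is split at every pipe, including the pipes inside the ref's name attribute and the ref's content: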

0-[peg]        | ---->   [{"type":"SelfclosingTagTk","name":"wikilink","attribs":[{"k":"href","v":["File:Meso2mil-English.JPG"],"srcOffsets":[2,2,2,27],"vsrc":"File:Meso2mil-English.JPG"},{"k":"mw:maybeContent","v":["thumb"],"srcOffsets":[28,28,28,33],"vsrc":"thumb"},{"k":"mw:maybeContent","v":["Beginning of image caption text.<ref name=\"foo"],"srcOffsets":[34,34,34,80],"vsrc":"Beginning of image caption text.<ref name=\"foo"},{"k":"mw:maybeContent","v":["bar\">Foo"],"srcOffsets":[81,81,81,89],"vsrc":"bar\">Foo"},{"k":"mw:maybeContent","v":["bar ref text",{"type":"EndTagTk","name":"ref","attribs":[],"dataAttribs":{"tsr":[102,108],"stx":"html"}}," Rest of text."],"srcOffsets":[90,90,90,122],"vsrc":"bar ref text</ref> Rest of text."}],"dataAttribs":{"tsr":[0,124],"src":"[[File:Meso2mil-English.JPG|thumb|Beginning of image caption text.<ref name=\"foo|bar\">Foo|bar ref text</ref> Rest of text.]]"}}]

https://en.wikipedia.org/api/rest_v1/page/html/User:Jonesey95%2Fsandbox/1016692017

Possibly related: table markup inside an image caption causes similar false positives. See https://en.wikipedia.org/w/index.php?title=English_language&oldid=1016445842

I "fixed" this one by using {{!}} for the table markup pipes, but that should not be necessary.

Change 677976 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] Don't break on pipe in linkdescs if we're in an ext tag

https://gerrit.wikimedia.org/r/677976
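To illustrate the idea in the patch title, here is a minimal Python sketch of "don't break on pipe if we're in an ext tag". It is not Parsoid's code: Parsoid's tokenizer is a PEG grammar, and the names `EXT_TAG` and `split_link_parts` are hypothetical. The sketch also assumes `ref` is the only extension tag and that ref tags are not nested.

```python
import re

# Assumption: only <ref> counts as an extension tag in this sketch.
# Matches a self-closing <ref ... /> or a paired <ref ...>...</ref>.
EXT_TAG = re.compile(
    r"<ref\b[^>]*?/>|<ref\b[^>]*>.*?</ref>",
    re.DOTALL | re.IGNORECASE,
)


def split_link_parts(link: str) -> list[str]:
    """Split the inside of [[...]] on pipes that are outside ext tags."""
    inner = link[2:-2]  # drop the surrounding "[[" and "]]"
    protected = [m.span() for m in EXT_TAG.finditer(inner)]
    parts, start = [], 0
    for i, ch in enumerate(inner):
        # A pipe only separates options when it is not inside a ref span.
        if ch == "|" and not any(a <= i < b for a, b in protected):
            parts.append(inner[start:i])
            start = i + 1
    parts.append(inner[start:])
    return parts


wikitext = (
    '[[File:Meso2mil-English.JPG|thumb|Beginning of image caption text.'
    '<ref name="foo|bar">Foo|bar ref text</ref> Rest of text.]]'
)
parts = split_link_parts(wikitext)
```

With the test case from this task, `parts` has three entries (file name, `thumb`, and the whole caption), where a plain split on every pipe would give five.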

Possibly related: table markup inside an image caption causes similar false positives

Also a tokenizing issue, but requires a different fix.

0-[peg]        | ---->   [{"type":"SelfclosingTagTk","name":"wikilink","attribs":[{"k":"href","v":["File:Test.jpg"],"srcOffsets":[2,2,2,15],"vsrc":"File:Test.jpg"},{"k":"mw:maybeContent","v":["thumb"],"srcOffsets":[16,16,16,21],"vsrc":"thumb"},{"k":"mw:maybeContent","v":["\n{"],"srcOffsets":[22,22,22,24],"vsrc":"\n{"},{"k":"mw:maybeContent","v":["\n"],"srcOffsets":[25,25,25,26],"vsrc":"\n"},{"k":"mw:maybeContent","v":["-\t \n"],"srcOffsets":[27,27,27,31],"vsrc":"-\t \n"},{"k":"mw:maybeContent","v":["\n"],"srcOffsets":[32,32,32,33],"vsrc":"\n"},{"k":"mw:maybeContent","v":["}\n"],"srcOffsets":[34,34,34,36],"vsrc":"}\n"}],"dataAttribs":{"tsr":[0,38],"src":"[[File:Test.jpg|thumb|\n{|\n|-\t \n|\n|}\n]]"}}]

https://en.wikipedia.org/api/rest_v1/page/html/English_language/1016445842#Pluricentric_English
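The srcOffsets in the dump above are consistent with every pipe in the table markup being treated as an image option separator. The following Python sketch reproduces those offsets by splitting on each pipe. It only illustrates the symptom under that assumption; the helper name `pipe_segments` is hypothetical and this is not how Parsoid's grammar is written.

```python
def pipe_segments(src: str) -> list[tuple[int, int, str]]:
    """Split the inside of [[...]] on every pipe, keeping source offsets."""
    segments = []
    start = 2                   # skip the opening "[["
    end_of_link = len(src) - 2  # stop before the closing "]]"
    for i in range(start, end_of_link):
        if src[i] == "|":
            segments.append((start, i, src[start:i]))
            start = i + 1
    segments.append((start, end_of_link, src[start:end_of_link]))
    return segments


# The "src" value from the token dump above.
src = "[[File:Test.jpg|thumb|\n{|\n|-\t \n|\n|}\n]]"
segments = pipe_segments(src)
for begin, end, text in segments:
    print(begin, end, repr(text))
```

The seven segments line up with the `href` attribute plus the six `mw:maybeContent` attributes in the dump, so the table's `{|`, `|-`, `|` and `|}` are all being cut apart as if they were option delimiters.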

Change 678111 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Handle optional spaces after table_attributes

https://gerrit.wikimedia.org/r/678111