Unexpected Parsoid results for an unclosed table of templates
Closed, Resolved (Public)

Description

http://en.wikipedia.org/wiki/BRP_Alberto_Navarette_(PG-394)

wikitext:

{|{{Infobox ship begin}}
{{Infobox ship career
| Ship country             =United States
| Ship name                =USCGC ''Point Evans'' (WPB-82354)
...
}}
{|{{Infobox ship begin}}  // note the previous "{|" is unclosed
{{Infobox ship image
| Ship caption             =As BRP ''Alberto Navarette'' (PG-394)
...
}}
{{Infobox ship career
|Ship country=Philippines
|Ship name=BRP ''Alberto Navarette'' (PG-394)
...
}}
{{Infobox ship characteristics
| Ship type                =Patrol Boat (WPB)
| Ship displacement        =60 tons
...
}}
|}

The wikitext contains 6 top-level templates, but only the first three of them have data-mw counterparts in the rdf.
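To spot this kind of input before looking at the Parsoid output, a rough balance check on the table delimiters can help. The following is only a minimal sketch: `unclosed_table_count` is a hypothetical helper, not part of Parsoid, and it assumes `{|` and `|}` appear at the start of a line, as they do in the wikitext above. It does not handle every wikitext edge case.

```python
def unclosed_table_count(wikitext: str) -> int:
    """Return how many `{|` table openers have no matching `|}`.

    Simplification (assumption): table delimiters are only recognized
    at the start of a line, ignoring leading whitespace.
    """
    depth = 0
    for line in wikitext.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("{|"):
            depth += 1
        elif stripped.startswith("|}") and not stripped.startswith("|}}"):
            # `|}}` would end a template argument, not a table.
            if depth > 0:
                depth -= 1
    return depth


# Shortened version of the wikitext from the task description.
sample = "\n".join([
    "{|{{Infobox ship begin}}",
    "{{Infobox ship career",
    "| Ship country =United States",
    "}}",
    "{|{{Infobox ship begin}}",
    "{{Infobox ship characteristics",
    "| Ship type =Patrol Boat (WPB)",
    "}}",
    "|}",
])

print(unclosed_table_count(sample))
```

A non-zero result means at least one table opener is never closed, which is the condition described in this task.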

Event Timeline

Bianjiang created this task. Jan 4 2016, 6:04 PM
Arlolra triaged this task as Normal priority. Jan 6 2016, 6:30 PM
Arlolra added a subscriber: Arlolra.

Note that a single data-mw can contain multiple templates in its parts. Generally, this happens when multiple templates are used to form a single element. Here's the data-mw containing the missing templates:

data-mw='{"parts":["{|",{"template":{"target":{"wt":"Infobox ship begin","href":"./Template:Infobox_ship_begin"},"params":{},"i":0}},"\n",{"template":{"target":{"wt":"Infobox ship career\n","href":"./Template:Infobox_ship_career"},"params":{"Hide header":{"wt":""},"Ship country":{"wt":"United States"},"Ship flag":{"wt":"{{shipboxflag|United States|coast guard}}"},"Ship name":{"wt":"USCGC ''Point Evans'' (WPB-82354)"},"Ship owner":{"wt":"United States Coast Guard"},"Ship namesake":{"wt":"Point Evans at [[Gig Harbor, Washington]]"},"Ship ordered":{"wt":""},"Ship builder":{"wt":"J.M. Martinac Shipbuilding Corp., Tacoma, Washington"},"Ship laid down":{"wt":""},"Ship launched":{"wt":""},"Ship acquired":{"wt":""},"Ship commissioned":{"wt":"10 January 1967"},"Ship decommissioned":{"wt":"1 December 1999"},"Ship in service":{"wt":""},"Ship out of service":{"wt":""},"Ship struck":{"wt":""},"Ship reinstated":{"wt":""},"Ship honors":{"wt":""},"Ship fate":{"wt":"Transferred to Philippine Navy"},"Ship status":{"wt":""},"Ship notes":{"wt":""}},"i":1}},"\n","{|",{"template":{"target":{"wt":"Infobox ship begin","href":"./Template:Infobox_ship_begin"},"params":{},"i":2}},"\n{{Infobox ship image\n| Ship image               =<!-- Deleted image removed: [[File:BRP Alberto Navarette.jpg|300px|]] -->\n| Ship caption             =As BRP ''Alberto Navarette'' (PG-394)\n}}\n{{Infobox ship career\n|Ship country=Philippines\n|Ship flag={{shipboxflag|Philippines|naval}}\n|Ship name=BRP ''Alberto Navarette'' (PG-394)\n|Ship namesake=Alberto Navarette\n|Ship operator=[[Philippine Navy]]\n|Ship registry=\n|Ship route=\n|Ship awarded=\n|Ship original cost=\n|Ship yard number=\n|Ship way number=\n|Ship laid down=\n|Ship launched=\n|Ship sponsor=\n|Ship christened=\n|Ship completed=\n|Ship acquired=2001\n|Ship commissioned=\n|Ship decommissioned=\n|Ship recommissioned=\n|Ship maiden voyage= \n|Ship in service=\n|Ship out of service=\n|Ship renamed=\n|Ship reclassified=\n|Ship refit=\n|Ship struck=\n|Ship reinstated=\n|Ship homeport=\n|Ship identification=\n|Ship motto=\n|Ship nickname=\n|Ship honours=\n|Ship honors=\n|Ship captured=\n|Ship fate=\n|Ship status={{Ship in active service}}\n|Ship notes=\n|Ship badge=\n}}\n{{Infobox ship characteristics\n| Hide header              =\n| Header caption           =\n| Ship type                =Patrol Boat (WPB)\n| Ship displacement        =60 tons\n| Ship length              ={{convert|82|ft|10|in|m|abbr=on}}\n| Ship beam                ={{convert|17|ft|7|in|m|abbr=on}} max\n| Ship draught             =\n| Ship draft               ={{convert|5|ft|11|in|m|abbr=on}}\n| Ship propulsion          =*2 × {{convert|800|hp|0|abbr=on}} [[Cummins]] [[diesel engine]]s\n*refit in 1990's with 800 hp [[Caterpillar Inc.|Caterpillar]] diesel. \n| Ship speed               ={{convert|18|kn|lk=in}}\n| Ship range               =*{{convert|542|nmi|km|abbr=on}} at {{convert|14.5|kn|abbr=on}}\n*{{convert|1271|nmi|km|abbr=on}} at {{convert|10.7|kn|abbr=on}}\n| Ship complement          =Domestic service : 8 men\n| Ship sensors             =\n| Ship EW                  =\n| Ship armament            =*1967\n*1 × [[Oerlikon 20 mm cannon]]\n\n| Ship armor               =\n| Ship notes               =\n}}\n|}"]}'

Unfortunately, Infobox ship image, the second Infobox ship career, and Infobox ship characteristics are being appended as source strings instead of as objects with template keys. Rendering does seem to be correct, and those templates are at least tokenized correctly.
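For anyone consuming data-mw, the difference between the two kinds of parts can be checked mechanically. The sketch below is an illustration only: `classify_parts` and `top_level_template_names` are hypothetical helper names, the sample is a shortened stand-in for the data-mw above, and the brace scanner is a simplification that ignores `{{{...}}}` parameters and `<nowiki>`.

```python
import json
import re

# Template name: everything up to the first pipe or brace.
NAME = re.compile(r"[^|{}]+")


def top_level_template_names(text: str) -> list:
    """Names of templates opened at brace depth 0 in a raw wikitext string."""
    names, depth, i = [], 0, 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            if depth == 0:
                match = NAME.match(text, i + 2)
                if match:
                    names.append(match.group(0).strip())
            depth += 1
            i += 2
        elif pair == "}}":
            depth = max(depth - 1, 0)
            i += 2
        else:
            i += 1
    return names


def classify_parts(data_mw_json: str):
    """Split data-mw parts into structured templates and raw-string templates."""
    structured, raw = [], []
    for part in json.loads(data_mw_json)["parts"]:
        if isinstance(part, dict) and "template" in part:
            structured.append(part["template"]["target"]["wt"].strip())
        elif isinstance(part, str):
            raw.extend(top_level_template_names(part))
    return structured, raw


# Shortened, hand-written stand-in for the data-mw shown above.
sample = json.dumps({
    "parts": [
        "{|",
        {"template": {"target": {"wt": "Infobox ship begin"}, "params": {}, "i": 0}},
        "\n{{Infobox ship image\n| Ship caption =As BRP\n}}\n"
        "{{Infobox ship career\n|Ship flag={{shipboxflag|Philippines|naval}}\n}}\n|}",
    ]
})

structured, raw = classify_parts(sample)
print("structured:", structured)
print("raw strings:", raw)
```

Any name that ends up in the raw list is a template that was left as a source string rather than exposed as an object with a template key.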

Thanks for the explanation. The problem is that we can't get the tokenization results for the last three templates from data-mw, because they're plain strings. Taking "Infobox ship image" as an example, all we can get from data-mw is

`"{{Infobox ship image\n| Ship image               =<!-- Deleted image removed: [[File:BRP Alberto Navarette.jpg|300px|]] -->\n| Ship caption             =As BRP ''Alberto Navarette'' (PG-394)\n}}"`

Can we get structured results like

{"template":{"target":{"wt":"Infobox ship image"},"params":{"Ship caption":{"wt":"As BRP ''Alberto Navarette'' (PG-394)"}}}}

without parsing the wikitext template ourselves?

@Renxiaoyi That's why I left this open. It is indeed a bug. It shouldn't be a string.

Change 263885 had a related patch set uploaded (by Arlolra):
T122816: Consider overlapping tpl ranges as nested

https://gerrit.wikimedia.org/r/263885

Change 263885 merged by jenkins-bot:
T122816: Record when a range is subsumed from overlapping

https://gerrit.wikimedia.org/r/263885

Arlolra set Security to None.
Arlolra closed this task as Resolved. Jan 14 2016, 1:55 AM
Arlolra claimed this task.

The patch fixes the bug. I also edited the article to close the unclosed table, which fixes the page until the patch is deployed.

https://en.wikipedia.org/w/index.php?title=BRP_Alberto_Navarette_%28PG-394%29&type=revision&diff=699724675&oldid=688707977