Link doesn’t cover full term
Closed, Resolved · Public

Description

For example, the article Naval Air Station Joint Reserve Base Fort Worth is missing linking on the “s” in “nautical miles”.

The output wikitext is apparently:

[[nautical mile]]s

On the website the “s” is part of the link, but in the app it is not.

Event Timeline

Restricted Application added a project: Wikipedia-iOS-App-Backlog. · Jun 16 2020, 4:43 PM
Restricted Application added a subscriber: Aklapper.

The link is also broken in the Parsoid output that the apps use to generate the page, so if this is a bug, it is in Parsoid. Given that wikitext, I would expect the link text to be "nautical mile", so I am not sure what the default parser is doing.

That is link trail behavior (wikitext shortcut). Check [[Train]]s in a sandbox. That is a shortcut for [[Train|Trains]].
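
As a rough illustration of the shortcut (not MediaWiki's or Parsoid's actual code), the sketch below rewrites an unpiped link followed by a trail into its piped equivalent. The function name `expand_link_trails` is hypothetical, and the pattern assumes an English-style trail of lowercase ASCII letters; other languages configure a different trail. Piped links are deliberately left alone to keep the sketch short.

```python
import re

# Assumption: an English-style link trail, i.e. lowercase ASCII letters
# that immediately follow the closing brackets of an unpiped link.
LINK_WITH_TRAIL = re.compile(r"\[\[([^\[\]|]+)\]\]([a-z]+)")


def expand_link_trails(wikitext: str) -> str:
    """Rewrite [[Target]]trail as the equivalent [[Target|Targettrail]]."""
    return LINK_WITH_TRAIL.sub(
        lambda m: f"[[{m.group(1)}|{m.group(1)}{m.group(2)}]]",
        wikitext,
    )


print(expand_link_trails("[[Train]]s"))
print(expand_link_trails("5 [[nautical mile]]s (9 km; 6 mi)"))
```

A space or punctuation after the closing brackets ends the trail, so `[[Foo]] bar` is left unchanged.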

In any case, running this wikitext locally shows that Parsoid handles the link trail properly.

[subbu@earth:~/work/wmf/parsoid] echo '{{convert|5|NM|0|lk=in}}' | php bin/parse.php --dump tplsrc --normalize=parsoid
[dump/tplsrc] ================================================================================
[dump/tplsrc] TEMPLATE: Template:Convert ; TRANSCLUSION: "{{convert|5|NM|0|lk=in}}"
[dump/tplsrc] --------------------------------------------------------------------------------
[dump/tplsrc] 5 [[nautical mile]]s (9 km; 6 mi)
[dump/tplsrc] --------------------------------------------------------------------------------

<p typeof="mw:Transclusion" data-mw='{"parts":[{"template":{"target":{"wt":"convert","href":"./Template:Convert"},"params":{"1":{"wt":"5"},"2":{"wt":"NM"},"3":{"wt":"0"},"lk":{"wt":"in"}},"i":0}}]}'>5 <a rel="mw:WikiLink" href="./Nautical_mile" title="Nautical mile">nautical miles</a> (9<span typeof="mw:Entity"> </span>km; 6<span typeof="mw:Entity"> </span>mi)</p>

So, I suspect some other edge case is triggered in the context of the full page.

Something is probably broken in Parsoid's linktrail / linkprefix pass. It is not a porting bug; it predates the port.

[subbu@earth:~/work/wmf/parsoid] echo '{{convert|5|NM|0|lk=in}}.' | node bin/parse.js --normalize

<p>5 <a href="Nautical_mile" title="Nautical mile">nautical mile</a><span>s (9</span> <span>km; 6</span> <span>mi)</span>.</p>

[subbu@earth:~/work/wmf/parsoid] echo '{{convert|5|NM|0|lk=in}}.' | php bin/parse.php --normalize

<p>5 <a href="Nautical_mile" title="Nautical mile">nautical mile</a><span>s (9</span> <span>km; 6</span> <span>mi)</span>.</p>

It looks like the <span> wrapping of text nodes in templated content interferes with the link-handling DOM pass, which does not account for span-wrapping. I am quite surprised this bug hasn't been encountered or reported before.
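
The interaction can be sketched with a toy DOM. This is only an illustration under stated assumptions, not Parsoid's code or the eventual patch: the `Text` and `Element` classes, the `tpl_wrapper` flag, and the `look_in_wrappers` switch are all hypothetical, and the trail pattern is the English-style `[a-z]+`. With `look_in_wrappers=False` the pass only absorbs a trail from a bare text sibling, which reproduces the broken output; with `True` it also looks inside a span that merely wraps templated text.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Text:
    value: str


@dataclass
class Element:
    tag: str
    children: List["Node"] = field(default_factory=list)
    # Hypothetical marker: this span exists only to wrap template text.
    tpl_wrapper: bool = False


Node = Union[Text, Element]
TRAIL = re.compile(r"^[a-z]+")  # assumed English-style link trail


def trail_source(node: Node, look_in_wrappers: bool) -> Optional[Text]:
    """Return the text node a link may absorb its trail from, if any."""
    if isinstance(node, Text):
        return node
    if (look_in_wrappers and node.tag == "span" and node.tpl_wrapper
            and len(node.children) == 1
            and isinstance(node.children[0], Text)):
        return node.children[0]
    return None


def handle_link_trails(siblings: List[Node], look_in_wrappers: bool) -> None:
    """Move a leading trail from the next sibling's text into each link."""
    for link, following in zip(siblings, siblings[1:]):
        if not (isinstance(link, Element) and link.tag == "a"):
            continue
        text = trail_source(following, look_in_wrappers)
        match = TRAIL.match(text.value) if text is not None else None
        if match:
            link.children.append(Text(match.group(0)))
            text.value = text.value[match.end():]


def render(nodes: List[Node]) -> str:
    """Serialize the toy DOM without attributes."""
    return "".join(
        n.value if isinstance(n, Text)
        else f"<{n.tag}>{render(n.children)}</{n.tag}>"
        for n in nodes
    )


for flag in (False, True):
    dom = [Element("a", [Text("nautical mile")]),
           Element("span", [Text("s (9")], tpl_wrapper=True)]
    handle_link_trails(dom, look_in_wrappers=flag)
    print(flag, render(dom))
```

A real pass would also need to decide what to do with a wrapper span left empty once its whole text has been absorbed; the sketch leaves it in place.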

[subbu@earth:~/work/wmf/parsoid] echo -e '{{1x|[[Foo]]bar}}\n\n{{1x|[[Foo]]}}bar' | php bin/parse.php --normalize --dump dom:post-tplwrap
----- DOM: post-tplwrap -----
<body data-parsoid='{"tmp":{},"dsr":[0,37,0,0]}'><p data-parsoid='{"tmp":{"tagId":1},"dsr":[0,17,0,0]}'><a rel="mw:WikiLink" href="./Foo" title="Foo" about="#mwt1" typeof="mw:Transclusion" data-parsoid='{"stx":"simple","a":{"href":"./Foo"},"sa":{"href":"Foo"},"tmp":{"inTransclusion":true,"tagId":3},"dsr":[0,17,null,null],"src":"{{1x|[[Foo]]bar}}","pi":[[{"k":"1"}]]}' data-mw='{"parts":[{"template":{"target":{"wt":"1x","href":"./Template:1x"},"params":{"1":{"wt":"[[Foo]]bar"}},"i":0}}]}'>Foo</a><span about="#mwt1" data-parsoid='{"tmp":{}}'>bar</span></p>

<p data-parsoid='{"tmp":{"tagId":5},"dsr":[19,36,0,0]}'><a rel="mw:WikiLink" href="./Foo" title="Foo" about="#mwt2" typeof="mw:Transclusion" data-parsoid='{"stx":"simple","a":{"href":"./Foo"},"sa":{"href":"Foo"},"tmp":{"inTransclusion":true,"tagId":7},"dsr":[19,33,null,null],"src":"{{1x|[[Foo]]}}","pi":[[{"k":"1"}]]}' data-mw='{"parts":[{"template":{"target":{"wt":"1x","href":"./Template:1x"},"params":{"1":{"wt":"[[Foo]]"}},"i":0}}]}'>Foo</a>bar</p>
</body>

-----------------------------

<p><a href="Foo" title="Foo">Foo</a><span>bar</span></p>
<p><a href="Foo" title="Foo">Foobar</a></p>

Change 607098 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] WIP: Deal with template-span-wrapped text-nodes in handleLinkNeighbors

https://gerrit.wikimedia.org/r/607098

ssastry claimed this task. · Jun 23 2020, 4:44 PM
ssastry triaged this task as Medium priority.
ssastry added a project: Parsing-Active-Work.
ssastry moved this task from Needs Triage to Bugs & Crashers on the Parsoid board.

Change 607098 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Fix incorrect handling of tpl-span-wrapped text in HandleLinkNeighbors

https://gerrit.wikimedia.org/r/607098

ssastry closed this task as Resolved. · Jul 22 2020, 3:58 PM

Will probably be deployed next week.

Change 616335 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/vendor@master] Bump parsoid to 0.13.0-a2

https://gerrit.wikimedia.org/r/616335

Change 616335 merged by jenkins-bot:
[mediawiki/vendor@master] Bump parsoid to 0.13.0-a2

https://gerrit.wikimedia.org/r/616335