
mw:WikiLink with fully resolved url doesn't serialize to the right piped wikilink syntax
Closed, Resolved (Public)

Description

Previously(?) newly-created links (without data-parsoid) of either type –

<p><a rel="mw:ExtLink" href="http://en.wikipedia.beta.wmflabs.org/wiki/European_Robin">European Robin</a></p>

… and …

<p><a rel="mw:WikiLink" href="http://en.wikipedia.beta.wmflabs.org/wiki/European_Robin">European Robin</a></p>

… – would both serialise to

[[European Robin]]

but now the latter is serialising to

[http://en.wikipedia.beta.wmflabs.org/wiki/European_Robin European Robin]
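The expected behaviour described above can be sketched roughly as follows. This is only an illustration, not Parsoid's actual serializer: the host list, the `/wiki/` article path, and the function names `href_to_title` and `serialize_link` are all assumptions made for the example.

```python
from urllib.parse import urlparse, unquote

# Assumed for this sketch only: hosts and article path that count as "this wiki".
WIKI_HOSTS = {"en.wikipedia.org", "en.wikipedia.beta.wmflabs.org"}
ARTICLE_PATH = "/wiki/"


def href_to_title(href):
    """Return the page title if href points at the local wiki, else None."""
    if href.startswith("./"):
        # Relative form, e.g. "./Foo"
        raw = href[2:]
    else:
        parts = urlparse(href)
        if parts.scheme not in ("http", "https"):
            return None
        if parts.netloc not in WIKI_HOSTS:
            return None
        if not parts.path.startswith(ARTICLE_PATH):
            return None
        if parts.query or parts.fragment:
            # Keep the sketch simple: anything with a query or fragment stays external.
            return None
        raw = parts.path[len(ARTICLE_PATH):]
    if not raw:
        return None
    return unquote(raw).replace("_", " ")


def serialize_link(href, text):
    """Serialize a link to wikitext, preferring wikilink syntax for local pages."""
    title = href_to_title(href)
    if title is None:
        return "[%s %s]" % (href, text)
    if title == text:
        return "[[%s]]" % title
    return "[[%s|%s]]" % (title, text)
```

Under these assumptions, the `rel` attribute is ignored entirely, so both the mw:ExtLink and the mw:WikiLink form of the European Robin link would come out as the same wikilink.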

Event Timeline

Jdforrester-WMF raised the priority of this task from to Needs Triage.
Jdforrester-WMF updated the task description.
Jdforrester-WMF subscribed.
Jdforrester-WMF set Security to None.
Jdforrester-WMF updated the task description.
ssastry renamed this task from "[Regression?] Parsoid no longer converts mw:ExtLink pointing to an internal link into internal links" to "mw:WikiLink without data-parsoid don't serialize to the right piped wikilink syntax". Apr 1 2015, 7:04 PM
ssastry renamed this task from "mw:WikiLink without data-parsoid don't serialize to the right piped wikilink syntax" to "mw:WikiLink with fully resolved url doesn't serialize to the right piped wikilink syntax". Apr 1 2015, 7:08 PM
ssastry subscribed.
[subbu@earth lib] cat /tmp/html
<a rel="mw:WikiLink" href="./Foo" title="Foo" data-parsoid='{"stx":"piped","a":{"href":"./Foo"},"sa":{"href":"Foo"},"dsr":[0,11,6,2]}'>Bar</a>
<a rel="mw:WikiLink" href="./Foo" title="Foo">Bar</a>
<a rel="mw:WikiLink" href="http://en.wikipedia.org/wiki/Foo" title="Foo">Bar</a>
<a rel="mw:ExtLink" href="http://en.wikipedia.org/wiki/Foo" title="Foo">Bar</a>
[subbu@earth lib] node parse --html2wt < /tmp/html
[[Foo|Bar]]
[[Foo|Bar]]
[[http://en.wikipedia.org/wiki/Foo|Bar]]
[[Foo|Bar]]
ssastry triaged this task as Medium priority. Apr 1 2015, 10:50 PM
ssastry moved this task from Needs Triage to In Progress on the Parsoid board.
Jdforrester-WMF raised the priority of this task from Medium to High. Apr 2 2015, 5:45 PM

@subbu: should mw:ExtLink be treated differently from mw:WikiLink, or should they be treated as synonyms?

In your example, it seems like the last two cases are exactly backwards. Maybe there's just a ! missing somewhere...

I am not sure why an editor would add a mw:ExtLink annotation on a wikilink, but at this time, I think we treat mw:ExtLink / mw:WikiLink as hints for the most part.

I am not sure that a "!" is missing, because all of them are being serialized to wikilink syntax. With a misplaced !, I would have expected extlink syntax.

Change 218405 had a related patch set uploaded (by Cscott):
T94723: Fix serialization of mw:WikiLink which use absolute URLs

https://gerrit.wikimedia.org/r/218405

See T102556. Our current implementation (with the gerrit patch above) is fragile, since it does not let users actually author an ExtLink to a WMF property if they want to. This shows up as selser failures:

[http://en.wikipedia.org/wiki/Foo Bar] round-trips correctly only when unmodified (i.e., via selser).

As soon as it is edited (say Bar becomes Baz), it turns into [[Foo|Baz]] (not [http://en.wikipedia.org/wiki/Foo Baz]) which is a selser failure.

The only way around this is to flag the URL as "really an ExtLink, don't make it a Wikilink" somehow. T102556 provides a means to do this, but I'm not sure it's worth it.
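A toy model of the failure, to make it concrete. None of this is Parsoid code: `WIKI_PREFIX`, `reserialize`, `selser`, and the `keep_extlink` marker are all hypothetical names invented for this sketch, and `keep_extlink` is not meant to represent what T102556 actually proposes.

```python
# Assumed base URL for this sketch only.
WIKI_PREFIX = "http://en.wikipedia.org/wiki/"


def reserialize(href, text, keep_extlink=False):
    """Serialize an edited link from scratch.

    keep_extlink is a hypothetical marker meaning
    "really an ExtLink, don't make it a wikilink".
    """
    if href.startswith(WIKI_PREFIX) and not keep_extlink:
        title = href[len(WIKI_PREFIX):].replace("_", " ")
        if title == text:
            return "[[%s]]" % title
        return "[[%s|%s]]" % (title, text)
    return "[%s %s]" % (href, text)


def selser(original_src, original_text, href, new_text, keep_extlink=False):
    """Reuse the original wikitext when the link is unmodified; otherwise reserialize."""
    if new_text == original_text:
        return original_src
    return reserialize(href, new_text, keep_extlink)
```

With `src = "[http://en.wikipedia.org/wiki/Foo Bar]"`, calling `selser(src, "Bar", "http://en.wikipedia.org/wiki/Foo", "Baz")` yields a wikilink, while passing `keep_extlink=True` preserves the extlink syntax. The unmodified case returns `src` untouched in both variants.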

Change 218405 merged by jenkins-bot:
T94723: Fix serialization of mw:WikiLink which use absolute URLs

https://gerrit.wikimedia.org/r/218405

Update, using production-deployed parsoid-lb interface:

<p><a rel="mw:ExtLink" href="https://en.wikipedia.org/wiki/European_Robin">European Robin</a></p>

still maps to

[[European Robin]]

… but …

<p><a rel="mw:WikiLink" href="https://en.wikipedia.org/wiki/European_Robin">European Robin</a></p>

now maps to [[https://en.wikipedia.org/wiki/European Robin|European Robin]] which is rather broken. :-(

@Jdforrester-WMF That patch hasn't been deployed yet.

But this looks ok on master,

λ (master) echo '<p><a rel="mw:WikiLink" href="https://en.wikipedia.org/wiki/European_Robin">European Robin</a></p>' | node parse --html2wt
[[European Robin]]

Yes, the broken [[https://en.wikipedia.org/wiki/European Robin|European Robin]] serialization is one of the things this patch specifically fixes. (I believe there's even a test case for it in the patch.)

ssastry removed a project: Patch-For-Review.
ssastry removed a subscriber: gerritbot.

Aha, awesome, thanks both.