Flow interwiki links garbled in old posts
Closed, Resolved · Public · BUG REPORT

Description

(Another error in the same vein as T383645 - sorry Parsoid team)

Steps to replicate the issue (include links if applicable):

What happens?:

This can (presumably?) be achieved by using [:d:Property:P373 Commons category (P373)] and [:d:Property:P935 Commons gallery (P935)]

What should have happened instead?:

Something that renders as an actual link:

This can (presumably?) be achieved by using [[:d:Property:P373|Commons category (P373)]] and [[:d:Property:P935|Commons gallery (P935)]]

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

Event Timeline

Restricted Application added a subscriber: Aklapper.

I don't particularly care whether Parsoid serializes this as an interwiki link or an external link, but it should do one or the other, not some odd mashup.

Could I please get a response from the Parsoid team? Is this something that they can fix, or do I have to resort to lossy post-hoc transforms like I did for the other bug?
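To make "lossy post-hoc transform" concrete, here is a minimal sketch of what such a cleanup could look like. It is not the transform actually used for the other bug: the pattern, the name `repair_links`, and the rule that a link target contains no spaces are all assumptions made for illustration.

```python
import re

# Hypothetical heuristic: a single-bracket link whose target looks like
# "prefix:Title" (optionally with a leading colon) and contains no spaces.
# The (?!//) guard skips real external links such as [https://example.org x],
# and the lookarounds skip links that are already double-bracketed.
GARBLED_LINK = re.compile(
    r"(?<!\[)\[(:?[a-z]+:(?!//)[^\s\[\]]+) ([^\[\]]+)\](?!\])"
)


def repair_links(wikitext: str) -> str:
    """Rewrite "[prefix:Title label]" as "[[prefix:Title|label]]"."""
    return GARBLED_LINK.sub(r"[[\1|\2]]", wikitext)


garbled = (
    "This can (presumably?) be achieved by using "
    "[:d:Property:P373 Commons category (P373)] and "
    "[:d:Property:P935 Commons gallery (P935)]"
)
print(repair_links(garbled))
```

This recovers the two links from the description, but it is lossy: whenever the target itself contains spaces, the heuristic cuts it off at the first space and produces a link to the wrong page.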

I believe the current plan is to archive Flow boards in read-only mode, at which point this would become moot.

The plan is to eventually convert them to wikitext, which involves running the stored HTML back through Parsoid. I'm running https://gitlab.wikimedia.org/pppery/flow-export-with-history to do that.

Annoyingly it can generate text like "[w:operant conditioning chamber like rewards]", which is impossible to interpret unambiguously.

In this case the intent was "[[w:operant conditioning chamber|like rewards]]", but I actually got it wrong on my first attempt.
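The ambiguity can be shown by enumerating every way the garbled text could be split into target and label. The helper below is a hypothetical illustration (the name `candidate_links` is made up here), not part of the export tool or of Parsoid.

```python
def candidate_links(bracketed: str) -> list[str]:
    """List every [[target|label]] reading of a "[target label]" string."""
    inner = bracketed.strip()[1:-1]  # drop the surrounding single brackets
    words = inner.split(" ")
    candidates = []
    # Any space could be the boundary between the link target and the label.
    for i in range(1, len(words)):
        target = " ".join(words[:i])
        label = " ".join(words[i:])
        candidates.append(f"[[{target}|{label}]]")
    return candidates


garbled = "[w:operant conditioning chamber like rewards]"
for candidate in candidate_links(garbled):
    print(candidate)
```

All four candidates are syntactically valid wikitext, so the text alone cannot tell you which one was intended; only the `href` in the stored HTML disambiguates.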

The relevant Parsoid HTML is:

<a rel="mw:ExtLink" href="//en.wikipedia.org/wiki/operant conditioning chamber" title="w:operant conditioning chamber" data-parsoid="{&quot;stx&quot;:&quot;piped&quot;,&quot;a&quot;:{&quot;href&quot;:&quot;//en.wikipedia.org/wiki/operant conditioning chamber&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;w:operant conditioning chamber&quot;},&quot;isIW&quot;:true,&quot;dsr&quot;:[155,202,33,2]}" class="external">like rewards</a>

I'm not familiar enough with Parsoid to work out why it is deserializing in this odd way.
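For anyone wanting to inspect the stored markup, the following sketch decodes the `data-parsoid` attribute of the anchor above using only the Python standard library. Reading `sa.href` as the original source text and `a.href` as its expanded form is an assumption here, and the class name `AnchorCollector` is invented for the example.

```python
import json
from html.parser import HTMLParser

# The anchor quoted above, split across lines only for readability.
SNIPPET = (
    '<a rel="mw:ExtLink" '
    'href="//en.wikipedia.org/wiki/operant conditioning chamber" '
    'title="w:operant conditioning chamber" '
    'data-parsoid="{&quot;stx&quot;:&quot;piped&quot;,'
    '&quot;a&quot;:{&quot;href&quot;:&quot;//en.wikipedia.org/wiki/operant conditioning chamber&quot;},'
    '&quot;sa&quot;:{&quot;href&quot;:&quot;w:operant conditioning chamber&quot;},'
    '&quot;isIW&quot;:true,&quot;dsr&quot;:[155,202,33,2]}" '
    'class="external">like rewards</a>'
)


class AnchorCollector(HTMLParser):
    """Collect the attributes of every <a> start tag as a dict."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # HTMLParser has already unescaped &quot; in attribute values.
            self.anchors.append(dict(attrs))


collector = AnchorCollector()
collector.feed(SNIPPET)
attrs = collector.anchors[0]
dp = json.loads(attrs["data-parsoid"])

print("rel:    ", attrs["rel"])
print("stx:    ", dp["stx"])
print("isIW:   ", dp["isIW"])
print("a.href: ", dp["a"]["href"])
print("sa.href:", dp["sa"]["href"])
```

The decoded fields show the mismatch: the element is marked `mw:ExtLink`, yet `isIW` is true and `sa.href` holds an interwiki-style target containing spaces, which is what ends up inside single brackets.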

Change #1121672 had a related patch set uploaded (by Pppery; author: Pppery):

[mediawiki/services/parsoid@master] Ignore data-parsoid href for invalid interwiki prefix

https://gerrit.wikimedia.org/r/1121672

Change #1121668 had a related patch set uploaded (by Pppery; author: Pppery):

[mediawiki/services/parsoid@master] Serialize unrecognizable interwiki links as interwiki links

https://gerrit.wikimedia.org/r/1121668

I've submitted two patches for this, one for each serialization direction. Both are three lines, not counting comments.

Which are apparently the first patches to Parsoid by someone without +2 since September. Wow, I didn't know how little volunteer involvement Parsoid had.

It is a relatively tricky codebase to get into, so I think that's to be expected. Congrats for figuring it out ;)

The number of patches by non-+2-ers is probably comparable to contributions to the old parser. Also, some parts of Parsoid have been moved to extensions now, and I know that there was at least one patch by a new volunteer to Parsoid-Cite integration in January.

Change #1121668 abandoned by Pppery:

[mediawiki/services/parsoid@master] Serialize unrecognizable interwiki links as interwiki links

Reason:

Looks like they prefer the other patch

https://gerrit.wikimedia.org/r/1121668

Change #1121672 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Ignore data-parsoid href for invalid interwiki prefix

https://gerrit.wikimedia.org/r/1121672

Change #1126146 had a related patch set uploaded (by Jgiannelos; author: Jgiannelos):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.21.0-a20

https://gerrit.wikimedia.org/r/1126146

Change #1126146 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.21.0-a20

https://gerrit.wikimedia.org/r/1126146