Update Parsoid to be compatible with magic links being disabled
Closed, Resolved, Public

Description

Parsoid needs to be updated to be compatible with magic links being disabled (T47942: "Magic links" RFC, PMID and ISBN should be configurable and disableable).

Event Timeline

Change 310453 had a related patch set uploaded (by Legoktm):
API: Expose $wgEnableMagicLinks in meta=siteinfo

https://gerrit.wikimedia.org/r/310453

Change 310453 merged by jenkins-bot:
API: Expose $wgEnableMagicLinks in meta=siteinfo

https://gerrit.wikimedia.org/r/310453

ssastry triaged this task as Medium priority. Sep 20 2016, 4:59 PM
ssastry removed a project: Patch-For-Review.

Are magic links disabled on the Wikimedia cluster?

The magic link configuration is exposed in siteinfo (thanks, @Legoktm!). It was probably left out of the SiteConfig we created during the Parsoid/PHP port because nothing in Parsoid used it. So we probably need to (1) expose that setting in SiteConfig, (2) add feature flag tests to the tokenizer rules that match magic links, (3) add similar feature flag tests to the html2wt serializer rules that protect magic-link-like text during serialization (Wikimedia\Parsoid\Html2Wt\ConstrainedText::fromSelSerImpl), and (4) work out and test a migration strategy for how Parsoid should handle selser when the previously parsed, unedited article content contains magic links.
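To make step (2) concrete, here is a rough Python sketch of a tokenizer rule gated on a per-type flag. It is only an illustration, not Parsoid's actual tokenizer: the `tokenize` function, the token tuples, and the `enabled` dict (a stand-in for the per-type ISBN/PMID/RFC flags in $wgEnableMagicLinks) are all hypothetical, and ISBN matching is left out for brevity.

```python
import re

# Simplified pattern for two of the three magic link types.
# Real matching rules (and ISBN handling) are more involved.
MAGIC_LINK_RE = re.compile(r"\b(PMID|RFC)\s+(\d+)\b")


def tokenize(text, enabled):
    """Split text into ("text", str) and ("magiclink", kind, number) tokens.

    `enabled` is a hypothetical dict such as {"PMID": True, "RFC": False},
    standing in for the site's magic link configuration.
    """
    tokens = []
    pos = 0
    for match in MAGIC_LINK_RE.finditer(text):
        kind, number = match.group(1), match.group(2)
        if not enabled.get(kind, False):
            # Feature flag is off: leave the match inside the plain text run.
            continue
        if match.start() > pos:
            tokens.append(("text", text[pos:match.start()]))
        tokens.append(("magiclink", kind, number))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens
```

With the flag on, `tokenize("See PMID 123.", {"PMID": True})` yields a separate magic link token; with it off, the whole string stays a single text token.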

I think the behavior we want is:

  • If magic links are enabled:
    • PMID 123 turns into an external link, along the lines of <a href="//www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract" rel="mw:ExtLink nofollow" class="external mw-magiclink" id="mwAw">PMID 123</a>
    • <a href="//www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract" rel="mw:ExtLink nofollow" class="external mw-magiclink" id="mwAw">PMID 123</a> turns into PMID 123
    • [[pmid:123|PMID 123]] turns into an interwiki link, along the lines of <a rel="mw:WikiLink/Interwiki" href="https://www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract" title="pmid:123" class="extiw" id="mwAw">PMID 123</a>
    • <a rel="mw:WikiLink/Interwiki" href="https://www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract" title="pmid:123" class="extiw" id="mwAw">PMID 123</a> turns into [[pmid:123|PMID 123]] (per T179769)
      • Currently turns into a magic link, PMID 123.
  • If magic links are disabled:
    • PMID 123 is treated as plain text
      • Currently expanded as a magic link
    • <a href="//www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract" rel="mw:ExtLink nofollow" class="external mw-magiclink" id="mwAw">PMID 123</a> turns into either [//www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract PMID 123] or [[pmid:123|PMID 123]]
      • Currently converted to PMID 123, as if it were a magic link
    • [[pmid:123|PMID 123]] turns into an interwiki link, along the lines of <a rel="mw:WikiLink/Interwiki" href="https://www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract" title="pmid:123" class="extiw" id="mwAw">PMID 123</a>
    • <a rel="mw:WikiLink/Interwiki" href="https://www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract" title="pmid:123" class="extiw" id="mwAw">PMID 123</a> turns into [[pmid:123|PMID 123]]
      • Currently turns into a magic link, PMID 123.
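The html2wt half of the list above could be sketched roughly as follows. This is a hypothetical Python model of the proposed behavior, not Parsoid code: `serialize_link` and its parameters are made up for illustration, and for the "magic links disabled" external link case it arbitrarily picks the bracketed external link form, since the proposal leaves open whether that or the interwiki form is preferred.

```python
def serialize_link(rel, href, text, classes=(), title=None,
                   magic_links_enabled=True):
    """Serialize one <a> element to wikitext per the proposed rules.

    All names here are illustrative; `rel`, `href`, `classes` and `title`
    mirror the attributes shown in the HTML examples above.
    """
    rel_values = rel.split()

    if "mw:WikiLink/Interwiki" in rel_values:
        # Interwiki links always round-trip to interwiki syntax,
        # whether or not magic links are enabled.
        return f"[[{title}|{text}]]"

    if "mw:ExtLink" in rel_values:
        if "mw-magiclink" in classes and magic_links_enabled:
            # Magic links enabled: the bare text re-parses as a magic link.
            return text
        # Magic links disabled (or an ordinary external link): emit an
        # explicit link. Assumption: bracketed form, not [[pmid:...]].
        return f"[{href} {text}]"

    raise ValueError(f"unsupported rel: {rel!r}")
```

For example, the `mw-magiclink` anchor for PMID 123 serializes to `PMID 123` when magic links are enabled and to `[//www.ncbi.nlm.nih.gov/pubmed/123?dopt=Abstract PMID 123]` when they are disabled, while the `mw:WikiLink/Interwiki` anchor serializes to `[[pmid:123|PMID 123]]` in both cases.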

@Legoktm enwiki has interwiki mappings for RFC ([[rfc:123]]) and PMID ([[pmid:1234]]) but not ISBN. Is the principle you're proposing above that html2wt should *always* prefer an interwiki mapping, if one exists, over a magic link serialization (whether magic links are enabled or not)? I can see why you'd want this, although strictly speaking it breaks round-tripping.

It's hard to test this because the parser test environment doesn't have the interwiki mappings for pmid and rfc.

Change 982922 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/services/parsoid@master] Update Parsoid to be compatible with magic links being disabled

https://gerrit.wikimedia.org/r/982922

Change 983281 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] ParserTestRunner: add [[pmid:]] interwiki prefix

https://gerrit.wikimedia.org/r/983281

Change 1004200 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] [Parsoid\Config\SiteConfig] enable Parsoid support for disabling magic links

https://gerrit.wikimedia.org/r/1004200

Change 983281 merged by jenkins-bot:

[mediawiki/core@master] ParserTestRunner: add [[pmid:]] interwiki prefix

https://gerrit.wikimedia.org/r/983281

Change 1004200 merged by jenkins-bot:

[mediawiki/core@master] [Parsoid\Config\SiteConfig] enable Parsoid support for disabling magic links

https://gerrit.wikimedia.org/r/1004200

Change #982922 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Update Parsoid to be compatible with magic links being disabled

https://gerrit.wikimedia.org/r/982922

Change #1076829 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.20.0-a23

https://gerrit.wikimedia.org/r/1076829

Change #1076829 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.20.0-a23

https://gerrit.wikimedia.org/r/1076829

cscott claimed this task.