Page MenuHomePhabricator

Parsoid doesn't give external links class="external free|text"
Closed, ResolvedPublic

Description

The https://commons.wikimedia.org/wiki/File:Icon_External_Link.svg icon should appear to the right of external links, but isn't currently.


Version: unspecified
Severity: normal
See Also:
T59451: Flow: External links without a name, are invisible on preview

Details

Reference
bz56756

Event Timeline

bzimport raised the priority of this task from to Normal.Nov 22 2014, 2:18 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz56756.
Quiddity created this task.Nov 8 2013, 1:46 AM

The WMF core features team tracks this bug on Mingle card https://mingle.corp.wikimedia.org/projects/flow/cards/427, but people from the community are welcome to contribute here and in Gerrit.

Parsoid knows it's an external link but doesn't add the class "external text", or rel="nofollow" that a link in article HTML has.

spage@ee-flow:~$ curl -d 'wt=[http://example.com external]' -d 'body=1' localhost:8000/localhost/Main_Page

...
<a rel="mw:ExtLink" href="http://example.com" data-parsoid='{"targetOff":20,"contentOffsets":[20,28],"dsr":[0,29,20,1]}'>external</a>

compared with wikitext HTML:

<a href="http://example.com" class="external text" rel="nofollow">external</a>

The class does not seem to be needed in Parsoid output as the typeof attribute can be used instead. Creating updated styles is bug 51245.

rel=nofollow is bug 52617.

Change 118447 had a related patch set uploaded by Matthias Mullie:
Add selectors for Parsoid-output external links

https://gerrit.wikimedia.org/r/118447

Change 118448 had a related patch set uploaded by Matthias Mullie:
Add selector for Parsoid-output external links

https://gerrit.wikimedia.org/r/118448

Change 118449 had a related patch set uploaded by Matthias Mullie:
Add selectors for Parsoid-output external links

https://gerrit.wikimedia.org/r/118449

Change 118450 had a related patch set uploaded by Matthias Mullie:
Add selector for Parsoid-output external links

https://gerrit.wikimedia.org/r/118450

Change 118451 had a related patch set uploaded by Matthias Mullie:
Add selector for Parsoid-output external links

https://gerrit.wikimedia.org/r/118451

Change 118452 had a related patch set uploaded by Matthias Mullie:
Add selectors for Parsoid-output external links

https://gerrit.wikimedia.org/r/118452

There's a couple of ways we could go about this:

  • Parsoid could add class="external" to external links
  • Skins' CSS could target [rel="mw:ExtLink"] too, like it currently only targets .external
  • Flow could hack around this, finding [rel="mw:ExtLink"] links and adding class="external"

I would prefer not to do #3. I've uploaded a couple of patches that should do #2 in all skins & extensions that currently only target .external. Not sure if that's going to be an approach everyone will agree with. If it is, I'll do the same for rel="nofollow"

Shouldn't Parsoid simply use the external link class so we can turn that into a defacto standard? Is there any reason it can't?

I noticed "trying to generalize CSS / JS integration for Flow, VE etc" under "Parsoid" in the notes of the last scrum of scrums; I assume it's related to this bug.

https://www.mediawiki.org/wiki/Scrum_of_scrums/2014-03-19

Change 118447 abandoned by Matthias Mullie:
Add selectors for Parsoid-output external links

Reason:
Looks like Parsoid is working on this

https://gerrit.wikimedia.org/r/118447

Change 118448 abandoned by Matthias Mullie:
Add selector for Parsoid-output external links

Reason:
Looks like Parsoid is working on this

https://gerrit.wikimedia.org/r/118448

Change 118449 abandoned by Matthias Mullie:
Add selectors for Parsoid-output external links

Reason:
Looks like Parsoid is working on this

https://gerrit.wikimedia.org/r/118449

Change 118450 abandoned by Matthias Mullie:
Add selector for Parsoid-output external links

Reason:
Looks like Parsoid is working on this

https://gerrit.wikimedia.org/r/118450

Change 118451 abandoned by Matthias Mullie:
Add selector for Parsoid-output external links

Reason:
Looks like Parsoid is working on this

https://gerrit.wikimedia.org/r/118451

Change 118452 abandoned by Matthias Mullie:
Add selectors for Parsoid-output external links

Reason:
Looks like Parsoid is working on this

https://gerrit.wikimedia.org/r/118452

I split 'adding rel="nofollow" to links' into bug 66289.

Parsoid's CSS module in core (resources/src/mediawiki.skinning/content.parsoid.less) should output similar CSS to the changes in these abandoned patches, i.e. style

a[rel="mw:ExtLink"]

appropriately.

Note the Vector skin no longer styles different URL formats (mailto, ftp) different) and file types (.mp3, .pdf) differently, but other skins like ?useskin=monobook do.

There's another wrinkle to external links.
Parser also outputs CSS class="free" if it's a bare external link in text (e.g. ... http://example.com ...) and outputs CSS class="text" if it's an external link in wikitext markup (e.g. ... [http://example.com foo] ...). Parsoid doesn't output these classes, though the information to distinguish them be in data-parsoid: {stx:"url"}.

skins/common/commonPrint.css styles .text , and skins/common/shared.css styles .free.

Arlolra removed GWicke as the assignee of this task.Nov 25 2014, 8:03 PM
Arlolra set Security to None.
Esanders renamed this task from Flow: Parsoid doesn't give external links class="external free|text" to Parsoid doesn't give external links class="external free|text".Mar 3 2017, 10:39 PM
Esanders added a subscriber: Esanders.

VE also uses Parsoid HTML in a few places for previewing, so this problem is not limited to Flow.

Liuxinyu970226 added a subscriber: Aklapper.
Aschroet added a subscriber: Aschroet.EditedMar 20 2017, 7:09 AM

@Esanders, before opening another bug, i'd like to ask if this bug here could also be the reason why external links without names are not displayed in Wikipedia Android App.

Okay, so, here is an update after visual diff testing where I use the following CSS for adding the external links:

.mw-body a[rel="mw:ExtLink"],
.mw-body a[href^="http"] {
    background: url("https://en.wikipedia.org/w/skins/Vector/images/external-link-ltr-icon.png") no-repeat right;
    padding-right: 13px;
    color: #36b;
}

However, this CSS doesn't distinguish between interwiki links and links to other wikis since Parsoid HTML adds a mw:ExtLink annotation there as well. For example here is a snippet form a jawiki page with a link to an enwiki page.

<a rel="mw:ExtLink" href="//en.wikipedia.org/wiki/Stoney%20Creek%20Bridge" title="en:Stoney Creek Bridge">英語版</a>)

So, on the Parsoid end, we have to figure out if we need a different rel annotation for interwiki links OR if we should simply bite the bullet and add the relevant CSS classes to Parsoid output.

@Esanders, before opening another bug, i'd like to ask if this bug here could also be the reason why external links without names are not displayed in Wikipedia Android App.

That sounds like a distinct issue (not loading the CSS for the Parsoid HTML representation of external links). Could you open a new task?

Okay, so, here is an update after visual diff testing where I use the following CSS for adding the external links:

.mw-body a[rel="mw:ExtLink"],
.mw-body a[href^="http"] {
    background: url("https://en.wikipedia.org/w/skins/Vector/images/external-link-ltr-icon.png") no-repeat right;
    padding-right: 13px;
    color: #36b;
}

However, this CSS doesn't distinguish between interwiki links and links to other wikis since Parsoid HTML adds a mw:ExtLink annotation there as well. For example here is a snippet form a jawiki page with a link to an enwiki page.

<a rel="mw:ExtLink" href="//en.wikipedia.org/wiki/Stoney%20Creek%20Bridge" title="en:Stoney Creek Bridge">英語版</a>)

So, on the Parsoid end, we have to figure out if we need a different rel annotation for interwiki links OR if we should simply bite the bullet and add the relevant CSS classes to Parsoid output.

The relevant selector for WMF would be pretty long – meta/commons/incubator/species for wikimedia.org plus Wikipedia/Wiktionary/Wikiquote/Wikibooks/Wikisource/Wikinews/Wikiversity/Wikivoyage/Wikidata with a not: of the current domain is 15 selectors, and that doesn't support random domains like Wikimania…

Okay, so, here is an update after visual diff testing where I use the following CSS for adding the external links:
..
So, on the Parsoid end, we have to figure out if we need a different rel annotation for interwiki links OR if we should simply bite the bullet and add the relevant CSS classes to Parsoid output.

The relevant selector for WMF would be pretty long – meta/commons/incubator/species for wikimedia.org plus Wikipedia/Wiktionary/Wikiquote/Wikibooks/Wikisource/Wikinews/Wikiversity/Wikivoyage/Wikidata with a not: of the current domain is 15 selectors, and that doesn't support random domains like Wikimania…

Parsoid already knows when something is an interwiki link and could simply add a rel="mw:InterwikiLink" for example and the mw:ExtLink CSS we are using now will truly be only for ext links.

Separately, there is also a question of ISBN, PMID, etc. magic links .. and how they are styled.

I suppose the real question is: if the mw:Interwiki, mw:ISBN, etc. annotations are truly useful in Parsoid HTML. But, I know opinions are divided on this. For example, Gabriel has been arguing for removing link annotations altogether in which case we should definitely had these external, etc. CSS classes. But, I think there is semantic value in these rel=".." annotations and CSS could be build on top of that.

Parsoid already knows when something is an interwiki link and could simply add a rel="mw:InterwikiLink" for example and the mw:ExtLink CSS we are using now will truly be only for ext links.

Hmm. I guess we don't add new interwiki targets that often so it'd be OK for the content to mis-label things as mw:InterwikiLink when they should be mw:ExtLink and so on, but it would be nice to keep this logic separate as render-time rather than parse-time data. Doing something like:

.mw-body a[rel="mw:ExtLink"][href*=".wikipedia.org/" i],
.mw-body a[rel="mw:ExtLink"][href*=".wiktionary.org/" i],
.mw-body a[rel="mw:ExtLink"][href*=".wikivoyage.org/" i],
.mw-body a[rel="mw:ExtLink"][href*="commons.wikimedia.org/" i],
.mw-body a[rel="mw:ExtLink"][href*="meta.wikimedia.org/" i],
{
   
}

Separately, there is also a question of ISBN, PMID, etc. magic links .. and how they are styled.

Magic links are just styled like internal links currently, which I think is fine.

Aschroet removed a subscriber: Aschroet.Apr 6 2017, 10:55 AM

Also note that mwgrep indicates 724 uses of the a.external selector on wikis, including in JS files, so changing this selector in the new parser would create a large piece of technical debt.

Also note that mwgrep indicates 724 uses of the a.external selector on wikis, including in JS files, so changing this selector in the new parser would create a large piece of technical debt.

Ya, let us just fix this in Parsoid. No reason to add more egregious differences.

ssastry assigned this task to Sbailey.Jan 18 2018, 5:48 PM

Change 407053 had a related patch set uploaded (by Sbailey; owner: Sbailey):
[mediawiki/services/parsoid@master] T58756 external links class= now setting free, text and autonumber

https://gerrit.wikimedia.org/r/407053

Sbailey closed this task as Resolved.Feb 1 2018, 4:09 PM

So for class = "external free", "external text" and "external autonumber", parsoid now generates the proper class attributes and matches the PHP parser output.
See gerrit:
https://gerrit.wikimedia.org/r/#/c/407053/

Sbailey reopened this task as Open.Feb 1 2018, 4:59 PM

Re-opened until merged and deployed, my bad.

cscott added a subscriber: cscott.Feb 1 2018, 5:20 PM

Repeating a comment from the patch:

I'd *strongly* prefer to see this added as a post-processing pass, instead of directly to the wt2html pass. We should maintain a good discipline about separating out redundant information which is only used for read views or styling, so that (in the future) this can be generated on-demand and we don't waste storage or transmission bandwidth on unnecessary bits.

Change 407053 merged by jenkins-bot:
[mediawiki/services/parsoid@master] T58756: external links class= now setting free, text and autonumber

https://gerrit.wikimedia.org/r/407053

Mentioned in SAL (#wikimedia-operations) [2018-02-28T21:46:38Z] <arlolra> Updated Parsoid to 1415a2a (T58756, T169006)

ssastry closed this task as Resolved.Mar 15 2018, 11:03 PM