
Media Viewer embed code breaks links to Wikipedias
Closed, Resolved (Public)

Description

Please have a look at the embed code at https://commons.wikimedia.org/wiki/File:Hh-strafjustizgebaeude-justitia.jpg#/media/File:Hh-strafjustizgebaeude-justitia.jpg. The links to German Wikipedia do not work, because the scheme ("http:") is missing. This makes reusers unable to embed this code easily, especially since this is a fairly subtle bug.

Event Timeline

Srittau created this task. Jun 18 2016, 3:29 PM
Restricted Application added subscribers: Zppix, Aklapper. Jun 18 2016, 3:29 PM
Restricted Application added a subscriber: Matanya. Jun 18 2016, 4:52 PM

Thanks for reporting this. Please provide steps to reproduce (how exactly to "have a look at the embed code"). I don't find any "embed" in the source of that page that you linked to.

Gunnex added a subscriber: Gunnex. Jun 18 2016, 7:09 PM

Related: Commons:Forum#Ich geb nicht auf und frag nochmal: Kann mir jemand erklären, was der Mist soll? (in German)

The html-embedding-code for this file from the MediaViewer is:

<p><a href="https://commons.wikimedia.org/wiki/File:Hh-strafjustizgebaeude-justitia.jpg#/media/File:Hh-strafjustizgebaeude-justitia.jpg"><img alt="Hh-strafjustizgebaeude-justitia.jpg" src="https://upload.wikimedia.org/wikipedia/commons/c/ce/Hh-strafjustizgebaeude-justitia.jpg" height="514" width="787"></a><br>Von <a href="de.wikipedia.org/wiki/Benutzer:Staro1" class="extiw" title="de:Benutzer:Staro1">Staro1</a> - Von <a href="de.wikipedia.org/wiki/Benutzer:Staro1" class="extiw" title="de:Benutzer:Staro1">Staro1</a> in deutschsprachige Wikipedia geladen., <a title="Creative Commons Attribution-Share Alike 3.0<p></p>" href="http://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA 3.0</a>, https://commons.wikimedia.org/w/index.php?curid=2424938</p>

If you check http://www.webcitation.org/6iMbcm2bn (an archive of https://archivalia.hypotheses.org/57278), where the code was copied and pasted, it appears (following @Raymond at the Commons Forum) that the <p></p> right after "(...) Share Alike 3.0" breaks the rendering by creating a redundant, empty paragraph.

But if you copy and paste the code into an HTML tester like http://www.w3schools.com/html/tryit.asp?filename=tryhtml_intro or http://www.play-hookey.com/htmltest/, it looks fine.

The missing "http" appears to be unrelated, as the code and links work on the test pages above.

I created HTML embed codes via MediaViewer for a bunch of Commons images. It looks like the superfluous <p></p> appears only for images that use Template:Cc-by-sa-3.0-migrated. The reason is still unclear.

Raymond added a subscriber: Tgr. Jun 19 2016, 11:17 AM

Yeah, great! Thanks to @Tgr for the weekend work :-)

Tgr added a comment. Jun 19 2016, 11:18 AM

Having protocol-relative links (starting with //) means that when the reader clicks them, whatever protocol they are using gets prepended. That's not ideal but should work as Commons is reachable through both protocols.
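To illustrate how a protocol-relative link is resolved, here is a minimal sketch using the standard `URL` constructor. The embedding pages under `blog.example` and the helper name `resolveLink` are assumptions made up for this example; only the `de.wikipedia.org` link comes from the embed code above.

```javascript
// Resolve a link against the page it is embedded in, roughly as a browser would.
// resolveLink is a hypothetical helper, not part of MediaViewer.
function resolveLink(href, pageUrl) {
  return new URL(href, pageUrl).href;
}

const protocolRelative = '//de.wikipedia.org/wiki/Benutzer:Staro1';

// Hypothetical pages where a reuser pasted the embed code.
const httpsPage = 'https://blog.example/post/1';
const httpPage = 'http://blog.example/post/1';

// The scheme of the embedding page gets prepended.
console.log(resolveLink(protocolRelative, httpsPage));
// https://de.wikipedia.org/wiki/Benutzer:Staro1
console.log(resolveLink(protocolRelative, httpPage));
// http://de.wikipedia.org/wiki/Benutzer:Staro1

// Without the leading "//" the same text is treated as a relative path.
console.log(resolveLink('de.wikipedia.org/wiki/Benutzer:Staro1', httpsPage));
// https://blog.example/post/de.wikipedia.org/wiki/Benutzer:Staro1
```

The last case shows why a link that loses its leading slashes when copied ends up pointing into the reuser's own site.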

The <p></p> was provided by the template (fixed, although I am not sure why a single newline did that in the first place) but MV should sanitize it and use entities instead of <>.
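As a rough sketch of what "use entities instead of <>" could mean for an attribute value, the function below escapes the characters that matter inside a double-quoted attribute. The name `escapeAttribute` is an assumption for this example, and this is not the code of the merged patch.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
// Hypothetical helper for illustration only; not MediaViewer's actual code.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, '&amp;') // must come first so later entities are not double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// The title value seen in the embed code above.
const title = 'Creative Commons Attribution-Share Alike 3.0<p></p>';

console.log(
  `<a title="${escapeAttribute(title)}" href="http://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA 3.0</a>`
);
```

With this escaping the stray `<p></p>` stays inside the `title` text as `&lt;p&gt;&lt;/p&gt;` rather than appearing as raw tags in the copied markup.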

FWIW, curid is the page id, not the revision id (which would be oldid), so it always links to the latest version of the page and is only used to make the URL shorter. Not sure if it makes sense to use it for HTML descriptions (it was mainly meant for plaintext which used to be horrendously long) but at least it should be linked.
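For readers unfamiliar with the two parameters, the sketch below builds both kinds of `index.php` URL. The page id 2424938 is taken from the embed code above; the revision id and the function names are hypothetical and only serve the illustration.

```javascript
const INDEX_PHP = 'https://commons.wikimedia.org/w/index.php';

// curid: page id, always resolves to the latest version of the page.
function pageIdUrl(pageId) {
  const url = new URL(INDEX_PHP);
  url.searchParams.set('curid', String(pageId));
  return url.href;
}

// oldid: revision id, pins one specific version of the page.
function revisionUrl(revisionId) {
  const url = new URL(INDEX_PHP);
  url.searchParams.set('oldid', String(revisionId));
  return url.href;
}

console.log(pageIdUrl(2424938));
// https://commons.wikimedia.org/w/index.php?curid=2424938

// 123456789 is a made-up revision id, not a real revision of this file page.
console.log(revisionUrl(123456789));
// https://commons.wikimedia.org/w/index.php?oldid=123456789
```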

Change 295120 had a related patch set uploaded (by Gergő Tisza):
Filter HTML from some attributes

https://gerrit.wikimedia.org/r/295120

Change 295126 had a related patch set uploaded (by Gergő Tisza):
Make embed text short URL into a link in HTML mode

https://gerrit.wikimedia.org/r/295126

Tgr added a comment. Jun 19 2016, 8:08 PM

The protocol-relative URLs are from markup like [[:de:Benutzer:Staro1|Staro1]] which the parser turns into <a href="//de.wikipedia.org/wiki/Benutzer:Staro1" class="extiw" title="de:Benutzer:Staro1">Staro1</a>. There are several ways to fix this:
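One conceivable approach, shown here purely as an assumption and not necessarily what was proposed or done in MediaViewer, is to expand protocol-relative `href` and `src` values to absolute URLs before handing the HTML to the reuser. The function name and the default protocol are hypothetical, and a regex is used only to keep the sketch short; real code would more likely work on parsed DOM nodes.

```javascript
// Rewrite href="//host/..." and src="//host/..." to use an explicit protocol.
// Hypothetical helper; a sketch, not MediaViewer's implementation.
function expandProtocolRelative(html, protocol = 'https:') {
  return html.replace(/(\s(?:href|src)=")\/\//g, `$1${protocol}//`);
}

// Parser output for [[:de:Benutzer:Staro1|Staro1]], as quoted above.
const parserOutput =
  '<a href="//de.wikipedia.org/wiki/Benutzer:Staro1" class="extiw" title="de:Benutzer:Staro1">Staro1</a>';

console.log(expandProtocolRelative(parserOutput));
// <a href="https://de.wikipedia.org/wiki/Benutzer:Staro1" class="extiw" title="de:Benutzer:Staro1">Staro1</a>
```

Links that already carry a scheme are left untouched, because the pattern only matches values that begin with `//` directly after the opening quote.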

Tgr added a comment. Edited Jun 19 2016, 8:41 PM

The last remaining issue with sanitization seems to be a jQuery bug:

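// Build a small fragment containing a link.
var $x = $('<span><a href="http://example.com">x</a></span>');
// Set a title whose value contains markup.
$x.find('a').prop('title', 'a<p>b</p>c');
// The serialized title attribute still contains the raw <p> tags:
$x.html() // <a href="http://example.com" title="a<p>b</p>c">x</a>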

Upstreamed as https://github.com/jquery/jquery/issues/3186

Change 295120 merged by jenkins-bot:
Filter HTML from some attributes

https://gerrit.wikimedia.org/r/295120

Please remove this tag when you have addressed the minor comment.

Change 295126 merged by jenkins-bot:
Make embed text short URL into a link in HTML mode

https://gerrit.wikimedia.org/r/295126

I think this is fixed. @Tgr can you confirm?

Tgr closed this task as Resolved. Aug 15 2016, 9:38 PM
Tgr claimed this task.

Yup, fixed.