
Don't expand URLs for printing except for references
Closed, Resolved (Public)

Description

When playing around with the new Electron service and trying to find some nice examples, I came across some pages that are not working well with the service.

  1. Extension manuals on mediawiki.org: there seems to be a problem with the {{Template:Extension}}. See https://www.mediawiki.org/api/rest_v1/page/pdf/Extension:ElectronPdfService or the screenshot:

electron_extman.jpg (1×751 px, 55 KB)

  2. Articles that use the OpenStreetMap plugin, e.g. https://de.wikipedia.org/wiki/Berlin. In the PDF, the URL of the GeoHack tool link is printed. See https://de.wikipedia.org/api/rest_v1/page/pdf/Berlin or the screenshot:

electron_geohack.jpg (697×808 px, 153 KB)

Update from 18/01/2017: The solution we agreed on is to change the main print CSS to not expand URLs except for the references.

Event Timeline

Those CSS-expanded URLs are often problematic. I tried to at least make them wrappable in https://gerrit.wikimedia.org/r/#/c/286772/, but it seems that at least the extension infobox does not match those rules.

It might be worth considering disabling those URL target expansions entirely, as I doubt that they add enough actual value for users to warrant the issues they cause. The vast majority of links in PDFs are not expanded as text anyway, so we already rely on users following the links in the PDF for most of them.

A good way to debug print CSS is Chrome's "emulate media" functionality. In the developer console, open the drawer (top right corner), then "more tools" -> "rendering". Select "emulate CSS media" and pick "print". You can then use the normal live edit & inspection tools on the print view.

For the extension infobox, the culprit seems to be that the infobox explicitly requests textual link expansion by setting the "text" class on the link: <a rel="nofollow" class="external text" href="https://github.com/wikimedia/mediawiki-extensions-ElectronPdfService/archive/master.tar.gz">Git master</a>

Removing the text class avoids adding the expansion, which then avoids breaking the layout.

The geohack issue looks similar: <a class="external text" href="//tools.wmflabs.org/geohack/geohack.php?pagename=Berlin&amp;language=de&amp;params=52.518611111111_N_13.408333333333_E_region:DE-BE_type:city(3520031)"><span title="Breitengrad">52°&nbsp;31′&nbsp;<abbr title="Nord">N</abbr></span>, <span title="Längengrad">13°&nbsp;24′&nbsp;<abbr title="Ost">O</abbr></span></a>
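For context, the expansion in both cases comes from a print rule that appends each external link's target after its text as generated content. A minimal sketch of such a rule, assuming it looks roughly like the one in core's commonPrint.css (the exact selectors and declarations there may differ):

```css
/* Sketch only: approximates the URL-expansion rule in commonPrint.css;
   the exact selectors and declarations in core are an assumption. */
@media print {
	/* Append " (URL)" after external links that carry the "text" or
	   "autonumber" class. */
	.mw-body a.external.text:after,
	.mw-body a.external.autonumber:after {
		content: ' (' attr( href ) ')';
		/* Let long URLs break instead of widening the page. */
		word-wrap: break-word;
	}
}
```

Because the selector matches on the text class, a link without that class gets no generated content, which would explain why removing the class avoids the expansion.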

@GWicke thx for the quick patch!
We've just discussed the issue in the team a bit more.
So, if we disable URL target expansion completely, we might remove more URLs than we actually want, e.g. in references. Would that be a problem? It sounds like the easiest and quickest fix, and we don't have much time left if we still want to deploy the extension this year (next week is the last deployment window for us, AFAIK), so we would go for that option if there are no big issues connected with it. What's your opinion on that?

Also, somebody from our team mentioned that it might not be enough to change the print styles in commonPrint in core: each wiki also has community-specific print styles that would probably override those rules again. That's something we would need to consider when polishing print styles, I guess.

As for the geohack issue: it's strange that the English version looks totally fine (the link text is printed) while the German one does not (the URL is printed). As both use class="external text", the cause is probably something else.

https://en.wikipedia.org/wiki/MediaWiki:Print.css might have some influence on enwiki vs. dewiki behavior. This should also show up in the "applied styles" when inspecting the elements in question.

@GWicke what do you think about disabling URL target expansions completely? Would it solve our issues and is it feasible to get this deployed next week?

@Tobi_WMDE_SW, I think for screen-optimized PDFs that would make a lot of sense.

Getting it deployed globally might require some discussion, so might be tricky in the short term. If you want to make progress quickly, you could propose or implement a change to MediaWiki:Print.css in the specific projects first.

@Nirzar @Volker_E @atgo might also have input on this.

Hello! I'm not sure what all would be impacted by disabling these URLs... but I imagine you'd want to move carefully on any sort of global change like that, especially if things are already different across wikis. Do you mean just for the PDFs?

Not sure what other input I can offer.

So, dewiki has had a little bit of CSS in its print CSS since 2007 that tries to deal with the issue of long URLs when printing.

https://de.wikipedia.org/wiki/MediaWiki:Print.css

/*
 * Allows overriding the rule from mediawiki.legacy/commonPrint.css:
 * [[Wikipedia:Verbesserungsvorschläge/Feature-Requests/Archiv/2007#a:after in der commonPrint.css (erledigt)]]
 * [[en:MediaWiki:Print.css]] defines a similar class named
 * "nourlexpansion"
 */
.mw-body .plainlinks-print a.external.text:after,
.mw-body .plainlinks-print a.external.autonumber:after {
	content: none;
}

This means that templates can add the CSS class "plainlinks-print", and long external URLs will then not be printed.

enwiki takes a different approach and tries to word-wrap the URLs:

https://en.wikipedia.org/wiki/MediaWiki:Print.css

/* We don't want very long URLs (that are added to the content in print) to widen the canvas */
#content a.external.text:after,
#content a.external.autonumber:after {
	word-wrap: break-word;
}

It looks like the enwiki CSS here is actually pointless, as the same declaration is already specified in the common print CSS included in MediaWiki: https://github.com/wikimedia/mediawiki/blob/master/resources/src/mediawiki.legacy/commonPrint.css#L176

The dewiki method looks good, and rather than applying the same sort of manual CSS hack to all wikis that want it, it may instead make sense to include something in core.

This could either use the existing noprint class (as below) or a different class as in the dewiki example.

.mw-body .noprint a.external.text:after,
.mw-body .noprint a.external.autonumber:after {
	content: none;
}

We had a short discussion with @GWicke, @Addshore, @Raymond, @Bmueller:

We've decided to disable long URLs in PDFs for all content excluding references. Users who are interested in the links need to be online anyway to visit the linked pages, and a URL to the article is already included in the PDF. If users still want the expanded links in some cases, they can get them by changing their own CSS or a wiki's CSS.
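A rough sketch of what this decision could look like in print CSS. It assumes that reference lists are wrapped in an element with the class "references"; that selector is hypothetical here, and this is not the change that was eventually merged:

```css
/* Sketch only: the .references selector is an assumption about how
   reference lists are marked up, not the merged change. */
@media print {
	/* Default: do not print the link target after external links. */
	.mw-body a.external.text:after,
	.mw-body a.external.autonumber:after {
		content: none;
	}

	/* Exception: keep the expanded URL inside reference lists. */
	.mw-body .references a.external.text:after,
	.mw-body .references a.external.autonumber:after {
		content: ' (' attr( href ) ')';
		/* Let long URLs break instead of widening the page. */
		word-wrap: break-word;
	}
}
```

The second rule has one more class in its selector than the first, so it takes precedence for links inside references regardless of source order. A wiki or user that wants expansion everywhere could override the first rule in its own print CSS.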

Change 332914 had a related patch set uploaded (by Addshore):
Only expand URLs for printing in references

https://gerrit.wikimedia.org/r/332914

Addshore moved this task from Unsorted 💣 to Back Burner 🏛️ on the User-Addshore board.
Addshore moved this task from Proposed to Sprint ready on the WMDE-TechWish board.
Addshore moved this task from Sprint ready to Currently in sprint on the WMDE-TechWish board.
Tobi_WMDE_SW renamed this task from Some examples that do not work well with Electron PDF renderer service to Don't expand URLs for printing except for references. Jan 19 2017, 9:35 AM
Tobi_WMDE_SW triaged this task as High priority.
Tobi_WMDE_SW updated the task description.

Change 332914 abandoned by Addshore:
Only expand URLs for printing in references

Reason:
See https://gerrit.wikimedia.org/r/#/c/333125/

https://gerrit.wikimedia.org/r/332914

Change 333125 had a related patch set uploaded (by Addshore):
Allow custom css styles to be loaded

https://gerrit.wikimedia.org/r/333125

Change 334507 had a related patch set uploaded (by Addshore):
Load custom css for requests by electron-render-service

https://gerrit.wikimedia.org/r/334507

Change 334507 merged by jenkins-bot:
Load custom css for requests by electron-render-service

https://gerrit.wikimedia.org/r/334507

Change 333125 abandoned by Addshore:
Allow custom css styles to be loaded

https://gerrit.wikimedia.org/r/333125