
"Download as PDF" contains broken licence link
Open, Low, Public

Description

Author: mediawiki-bugs

Description:
When exporting a wiki page as PDF, the PDF file contains a section called "License". At least for pages under a Creative Commons license, this includes a URL to the text of the license. This URL is plain text (not hyperlinked) and contains no protocol specifier (//creativecommons.org/licenses/by-sa/3.0/ instead of http://creativecommons.org/licenses/by-sa/3.0/).

While this may be an intentional use of protocol-relative links, I think it would be more helpful if these were complete links (starting with either http:// or https://). Browsers generally seem to interpret links starting with a slash as file:// links (at least on Linux).
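
The file:// behaviour can be illustrated with standard URL resolution. This is a minimal Python sketch, not taken from the report: the local path is a made-up example, and it only assumes that a viewer resolves the reference against wherever the PDF was opened from.

```python
from urllib.parse import urljoin

# The license reference exactly as it appears in the PDF.
LICENSE_REF = "//creativecommons.org/licenses/by-sa/3.0/"

# Hypothetical location of the downloaded PDF on a local disk.
local_base = "file:///home/user/Downloads/page.pdf"
# A web page as the base, for comparison.
web_base = "https://en.wikipedia.org/wiki/Main_Page"

# A reference starting with "//" inherits the scheme of its base.
from_local = urljoin(local_base, LICENSE_REF)
from_web = urljoin(web_base, LICENSE_REF)

print(from_local)
print(from_web)
```

Resolved against a file:// base, the reference inherits the file scheme, which is consistent with the behaviour described above. Against an https:// base it resolves to the intended license page.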

How to reproduce:

  1. Go to http://www.mediawiki.org/ or http://en.wikipedia.org/.
  2. In the sidebar, click "Print/export" and then "Download as PDF"
  3. Wait for the rendering to finish and download the PDF
  4. Open the PDF file, scroll to the last page and look at the "License" section

Version: unspecified
Severity: normal

Details

Reference
bz41273

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:07 AM
bzimport added a project: Collection.
bzimport set Reference to bz41273.
bzimport added a subscriber: Unknown Object (MLST).

Protip for whoever implements this:

It should basically be a matter of finding any links prefixed with "//" in the page and replacing them with proper links. It shouldn't be too difficult.
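
A rough sketch of that idea in Python, for illustration only. The function name, the regular expression, and the default scheme are assumptions, not code from the Collection extension or its renderer, and a pattern like this can still misfire on text that merely looks like a host name.

```python
import re

# Match "//" only when it is not already part of "scheme://" or a path,
# and only when something that looks like a host name follows it.
_PROTOCOL_RELATIVE = re.compile(
    r"(?<![:/\w])//(?=[A-Za-z0-9][A-Za-z0-9.-]*\.[A-Za-z]{2,})"
)


def expand_protocol_relative(text, scheme="https"):
    """Prefix protocol-relative URLs in text with the given scheme."""
    return _PROTOCOL_RELATIVE.sub(scheme + "://", text)


sample = "License text: //creativecommons.org/licenses/by-sa/3.0/"
print(expand_protocol_relative(sample))
```

URLs that already carry a scheme are left alone because the "//" in them is preceded by a colon.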

If one of the Collection devs agrees, I'd like to add the "easy" keyword here.

volker.haas wrote:

The license text is configurable - therefore I propose to change the text to include a protocol for the URL.

Currently the link is not even detected as a link (because of the missing protocol), so I can't simply add the protocol to the link. And I am reluctant to run a regex over all of the license text and replace anything that looks like a link without a protocol.

This is because by default mw-rights-url is used. If the system can't handle protocol-relative URLs, just put it through wfExpandUrl( url, PROTO_CURRENT ) before outputting.

Or actually, due to possible lack of protocol awareness in caching layer of PDF rendered documents, it should probably be wfExpandUrl( url, PROTO_HTTP );
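
To make the suggestion concrete, here is a Python analogue of expanding a single configured URL with a fixed scheme. It is not MediaWiki's wfExpandUrl and does not reproduce its full behaviour; the function name and default are hypothetical. Unlike a regex over the whole license text, it operates only on the one URL value.

```python
from urllib.parse import urlsplit, urlunsplit


def expand_url(url, scheme="http"):
    """Add a scheme to a protocol-relative URL; leave other URLs unchanged."""
    parts = urlsplit(url)
    # Already absolute, or a local path without a host: nothing to expand.
    if parts.scheme or not parts.netloc:
        return url
    return urlunsplit(
        (scheme, parts.netloc, parts.path, parts.query, parts.fragment)
    )


print(expand_url("//creativecommons.org/licenses/by-sa/3.0/"))
```

Passing a fixed scheme corresponds to the second suggestion above: the output does not depend on how the request that triggered rendering was made, so a cached PDF always contains the same link.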

This makes me wonder however about the protocol relative capabilities of the rest of the renderer. We have these things all over the place.

Patch abandoned only because followup questions were not answered by reviewers, e.g. "WHICH configuration variable is used to import that fragment".

*** Bug 46891 has been marked as a duplicate of this bug. ***
Krenair set Security to None.
Krenair added a subscriber: Krenair.

I've checked that the given link is correct.

I would like to fix this bug by writing a patch. Could someone please help me?

@PuriDilip: Thanks for your interest! Do you have a specific question we can help with? Because I guess that "Run the code locally, find the right place in the code, change code to implement line breaks, test your code changes" is not exactly the answer you expect? :)
For general information, see https://www.mediawiki.org/wiki/How_to_become_a_MediaWiki_hacker

Aklapper added a subscriber: PuriDilip.

@PuriDilip: I am resetting the assignee of this task because there has not been progress lately (please correct me if I am wrong!).
Resetting the assignee avoids the impression that somebody is already working on this task. It also allows others to potentially work towards fixing this task.
Please claim this task again when you plan to work on it (via Add Action...Assign / Claim in the dropdown menu) - it would be welcome! Thanks for your understanding!