
ForeignAPIRepo wrongly returns non-protocol-relative URLs for original "thumbs"
Open, Lowest, Public

Description

It seems that thumbnails of remote files have absolute srcset URLs if the URLs in question link to the original file (the thumbnail is smaller than the original but not 2x smaller). The protocol reflects the protocol used when the page containing the thumbnail was generated.


  1. Visit https://translatewiki.net/wiki/Project:About (HTTPS)
  2. Purge it
  3. Check http://translatewiki.net/wiki/Project:About source

Observed: some hotlinks of Commons files are over HTTPS. (It used to be the opposite, but it seems I can't reproduce by purging on HTTP...)

Expected: $wgUseInstantCommons includes protocol-relative resources and doesn't add insecure content to secure pages.

On closer inspection this is even weirder, because some resources are protocol-relative and some are not:

<img alt="Torchlight kopete.png" src="https://upload.wikimedia.org/wikipedia/commons/5/57/Torchlight_kopete.png" width="135" height="135" />

but

<img alt="P economy blue.png" src="/w/images/thumb/1/1a/P_economy_blue.png/135px-P_economy_blue.png" width="135" height="124" srcset="https://upload.wikimedia.org/wikipedia/commons/1/1a/P_economy_blue.png 1.5x, https://upload.wikimedia.org/wikipedia/commons/1/1a/P_economy_blue.png 2x" />

(I've no idea what "srcset" is).
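
To check a page's source for this, a small script along the following lines can list every image URL and say whether it is local, protocol-relative, or absolute. This is only a sketch using the Python standard library: `ImgUrlCollector`, `classify`, and `audit` are names made up for illustration, and splitting srcset on commas is a simplification that breaks on URLs containing commas.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ImgUrlCollector(HTMLParser):
    """Collect every URL referenced by <img src> and <img srcset>."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            if not value:
                continue
            if name == "src":
                self.urls.append(value.strip())
            elif name == "srcset":
                # Each candidate is "<url> <descriptor>"; candidates are
                # separated by commas (naive split, see the note above).
                for candidate in value.split(","):
                    parts = candidate.split()
                    if parts:
                        self.urls.append(parts[0])


def classify(url):
    """Return 'protocol-relative', 'absolute', or 'local' for a URL."""
    if url.startswith("//"):
        return "protocol-relative"
    if urlsplit(url).scheme:
        return "absolute"
    return "local"


def audit(html):
    """Return (url, classification) pairs for all images in the HTML."""
    collector = ImgUrlCollector()
    collector.feed(html)
    return [(url, classify(url)) for url in collector.urls]
```

Feeding it the second `<img>` tag above should report the src as local and both srcset candidates as absolute.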


Version: 1.22.0
Severity: major
URL: https://translatewiki.net/wiki/Thread:Support/Files_from_Commons_loading_over_HTTP
See Also:
T34219: Make $wgUseInstantCommons protocol relative

Details

Reference
bz48133

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 1:41 AM
bzimport set Reference to bz48133.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)

(I've no idea what "srcset" is).

It's this: Gerrit change 24115

(In reply to comment #1)

(In reply to comment #0)

(I've no idea what "srcset" is).

It's this: Gerrit change #24115

So is it a coincidence that, when that fails and no srcset is added, MediaWiki also fails to use the correct (protocol-relative) src?

*** Bug 47653 has been marked as a duplicate of this bug. ***

At first glance, maybe the difference is whether the URL points to a thumbnailed file or to the original file.

I cannot reproduce the observations after step 3 any more.

(In reply to comment #5)

I cannot reproduce the observations after step 3 any more.

I can still reproduce that some are protocol-relative and some are not.

(In reply to comment #0)

<img alt="P economy blue.png" src="/w/images/thumb/1/1a/P_economy_blue.png/135px-P_economy_blue.png" width="135" height="124" srcset="https://upload.wikimedia.org/wikipedia/commons/1/1a/P_economy_blue.png 1.5x, https://upload.wikimedia.org/wikipedia/commons/1/1a/P_economy_blue.png 2x" />

I don't know why we're discussing what protocols are being used - why is the src of this a local resource but the srcset pointing to commons?

After purging https://translatewiki.net/w/i.php?title=Project:About&oldid=5630868, its "View source" shows twelve links to http://, and zero https:// or // links.
Using my browser's "View Source" I get thirteen links (one being in the footer).

Not sure if I understand the bug report correctly though.

Tgr claimed this task.

Can't reproduce either; assuming this got fixed at some point.

I don't know why we're discussing what protocols are being used - why is the src of this a local resource but the srcset pointing to commons?

The thumbnails are cached locally but the original file is not. Compare

<img alt="Logo sociology.svg" src="/images/thumb/a/a6/Logo_sociology.svg/135px-Logo_sociology.svg.png" width="135" height="135" srcset="/images/thumb/a/a6/Logo_sociology.svg/203px-Logo_sociology.svg.png 1.5x, /images/thumb/a/a6/Logo_sociology.svg/270px-Logo_sociology.svg.png 2x" />

(all local) with

<img alt="P economy blue.png" src="/images/thumb/1/1a/P_economy_blue.png/135px-P_economy_blue.png" width="135" height="124" srcset="https://upload.wikimedia.org/wikipedia/commons/1/1a/P_economy_blue.png 1.5x, https://upload.wikimedia.org/wikipedia/commons/1/1a/P_economy_blue.png 2x" />

(srcset images are too large to be thumbnailed, so they link to the original). This is standard behavior - $wgForeignFileRepos has an option to cache thumbnails (which is set by $wgUseInstantCommons) but never caches originals.
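
The rule described above can be sketched roughly as follows. This is an illustration of the described behavior, not MediaWiki's actual code: `srcset_candidates` and its parameters are hypothetical names, the widths in the usage note are made up, and the thumbnail path layout is simply copied from the examples above.

```python
def srcset_candidates(name, thumb_width, original_width,
                      local_thumb_base, remote_original_url):
    """Build a srcset string for the 1.5x and 2x densities.

    A density whose width would reach or exceed the original cannot be
    thumbnailed, so it points at the (uncached, remote) original instead
    of a locally cached thumbnail.
    """
    candidates = []
    for density in (1.5, 2):
        # Round half up, so 135 * 1.5 = 202.5 becomes 203 as in the
        # markup above (Python's round() would give 202).
        width = int(thumb_width * density + 0.5)
        if width >= original_width:
            url = remote_original_url
        else:
            url = f"{local_thumb_base}/{name}/{width}px-{name}"
        candidates.append(f"{url} {density}x")
    return ", ".join(candidates)
```

With a 135px thumbnail of a hypothetical 200px-wide original, both candidates fall back to the remote original URL, which is the pattern seen for P_economy_blue.png. With a much larger original, both stay local.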

Nemo_bis reopened this task as Open. Jan 9 2015, 9:41 PM (edited)

This can't be tested on translatewiki.net any longer, because we're now https-only with HSTS, so you won't be able to purge a page over http. Please try reproducing with another wiki before closing.

Aklapper removed Tgr as the assignee of this task. Jan 10 2015, 4:25 PM
Aklapper lowered the priority of this task from Medium to Lowest.
Aklapper added a project: TestMe.
Nemo_bis changed the task status from Open to Stalled. Jan 11 2015, 10:54 AM
Nemo_bis raised the priority of this task from Lowest to Medium.

I can't reproduce this on http(s)://www.omegawiki.org/DefinedMeaning:pink_%286772%29

However, it might only occur if the image doesn't need scaling down, and I haven't found a scenario for that.

I guess this bug can be closed, since Commons only has HTTPS now, and all links I found on the specified page on Translatewiki are loaded over HTTPS.

I guess this bug can be closed, since Commons only has HTTPS now

It's still bad to link/embed http though, isn't it?

Tgr lowered the priority of this task from Medium to Lowest. Aug 19 2015, 9:45 PM
Tgr added a subscriber: Tgr.

There can be other foreign repos than Commons. This still seems like a valid bug to me, but super-fringe - it only affects you if you use a high-density display device to visit a page containing a thumbnail embedded from a foreign repo, where the thumbnail is smaller than the original image but larger than half the original, and the page was last generated on HTTP but you are visiting it on HTTPS.

Tgr renamed this task from [InstantCommons] Some hotlinks of Commons files are not protocol-relative to ForeignAPIRepo (InstantCommons) thumbnails have non-protocol-relative srcset URLs in certain cases. Aug 19 2015, 9:51 PM
Tgr updated the task description.

Probably can't be reproduced with Commons images, either, since InstantCommons is now defined with an absolute HTTPS URL.

To clarify, this bug is that when embedding an image from a foreign repo that is smaller than the requested thumbnail size (e.g. the thumb is really big, or the image is relatively small), the API will provide the URL to the original image (i.e. it won't make a thumbnail for the same size).

Since ForeignRepo only caches thumbnails, the original is thus served from the original wiki (e.g. Commons), not from the local wiki. It seems the API does not provide protocol-relative urls but absolute urls (based on the protocol the incoming request from the local wiki used). As such, when browsing a wiki that supports both HTTP and HTTPS, it will cache the url to the original image based on the request. So if the last editor/purger of an article used HTTP and you use HTTPS, you're downloading insecure content.
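
The scenario hypothesized in the comment above can be modelled with a toy cache. This is a sketch of the hypothesis only, not of MediaWiki's parser cache: `ParserCacheSketch`, `is_mixed_content`, and the host `repo.example` are all invented for illustration.

```python
class ParserCacheSketch:
    """Toy model: rendered HTML is cached once per title and then reused
    for every later request, whatever scheme that request uses."""

    def __init__(self, protocol_relative=False):
        self.protocol_relative = protocol_relative
        self.cache = {}

    def render(self, title, request_scheme):
        if title not in self.cache:
            # The scheme of whoever triggers the first render (for
            # example a purge) is baked into the cached markup unless
            # the URL is protocol-relative.
            if self.protocol_relative:
                prefix = "//"
            else:
                prefix = f"{request_scheme}://"
            self.cache[title] = f'<img src="{prefix}repo.example/original.png" />'
        return self.cache[title]


def is_mixed_content(html, page_scheme):
    """True if an HTTPS page would embed an image over plain HTTP."""
    return page_scheme == "https" and 'src="http://' in html
```

In this model, rendering first over HTTP and then viewing over HTTPS yields mixed content, while the protocol-relative variant does not.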

@Krinkle, the current task summary is "ForeignAPIRepo (InstantCommons) thumbnails have non-protocol-relative srcset URLs in certain cases", but from your comment it seems that it should be something more like "ForeignAPIRepo (InstantCommons) original-size files are served directly from source instead of local thumbnail". Can you confirm this is correct?

@brion No, that's T101015. The bug here seems to be that the API is wrongly using an absolute protocol where a relative protocol is expected (e.g. it may be protocol-relative for thumbs, but not for originals). Though if T101015 is fixed, that likely leads to the removal of the code causing this bug.

Krinkle renamed this task from ForeignAPIRepo (InstantCommons) thumbnails have non-protocol-relative srcset URLs in certain cases to ForeignAPIRepo wrongly returns non-protocol-relative URLs for original "thumbs". Sep 4 2015, 5:34 PM
Krinkle removed a subscriber: wikibugs-l-list.

Since ForeignRepo only caches thumbnails, the original is thus served from the original wiki (e.g. Commons), not from the local wiki. It seems the API does not provide protocol-relative urls but absolute urls (based on the protocol the incoming request from the local wiki used). As such, when browsing a wiki that supports both HTTP and HTTPS, it will cache the url to the original image based on the request. So if the last editor/purger of an article used HTTP and you use HTTPS, you're downloading insecure content.

The URLs in the API response depending on the protocol of the API call is arguably a bug, but how does the protocol of the API call depend on the protocol of the pageview which triggered the API call? ForeignAPIRepo::httpGet() calls wfExpandUrl( $url, PROTO_HTTP ) on the URL, which comes straight from the apibase parameter of the repo configuration. So, if an absolute API URL is configured for the repo, that is honored, otherwise it uses HTTP. Again, not very security-conscious and arguably a bug, but I don't see how it would result in the behavior described here.

So either this got fixed some time in the past or the problem is at some other place (parser cache maybe?).

T49653#526397 seems to be about the same issue (and contains an unmerged patch, too).

The relevant part is ApiQueryImageInfo.php line 496 and 517-518. I'm not sure it could be improved: PROTO_CANONICAL would return HTTP URLs for an HTTPS CORS AJAX request in some cases even when the repo is HTTPS-capable, PROTO_RELATIVE would return non-absolute URLs which non-browser-based clients might be unprepared for, and PROTO_HTTPS could cause problems for clients with poorly configured HTTPS libraries (there are lots of those).
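
To make the trade-off concrete, here is a simplified model of the expansion modes being discussed. It is not a port of wfExpandUrl(): `expand_url`, the mode strings, and the `request_scheme` / `canonical_scheme` parameters are assumptions for illustration, and the sketch only handles protocol-relative input.

```python
def expand_url(url, mode, request_scheme="http", canonical_scheme="http"):
    """Expand a protocol-relative URL according to a PROTO_*-like mode.

    mode is one of "http", "https", "relative", "current", "canonical".
    URLs that do not start with "//" are returned unchanged here.
    """
    if not url.startswith("//"):
        return url
    if mode == "relative":
        # Stays scheme-less: fine for browsers, but non-browser clients
        # may not know how to fetch it.
        return url
    schemes = {
        "http": "http",
        "https": "https",
        # Follows the scheme of the request that triggered the call.
        "current": request_scheme,
        # Follows the wiki's configured canonical scheme, regardless of
        # how the request came in.
        "canonical": canonical_scheme,
    }
    return f"{schemes[mode]}:{url}"
```

For a request made over HTTPS to a wiki whose canonical scheme is HTTP, the "canonical" mode still produces an http:// URL, which is the objection raised above, while "current" follows the request.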

Krinkle changed the task status from Stalled to Open. Jun 27 2017, 3:00 AM
In T50133#1614931, @Tgr wrote:

The relevant part is ApiQueryImageInfo.php line 496 and 517-518. I'm not sure it could be improved: PROTO_CANONICAL would return HTTP URLs for an HTTPS CORS AJAX request in some cases even when the repo is HTTPS-capable, PROTO_RELATIVE would return non-absolute URLs which non-browser-based clients might be unprepared for, and PROTO_HTTPS could cause problems for clients with poorly configured HTTPS libraries (there are lots of those).

It somewhat reminds me of https://gerrit.wikimedia.org/r/#/c/191533/, where a parameter was being added to specify how to expand URLs. That patch got bogged down in the fact that we had no way to globally specify URL expansion for everything that generates a link during the parse, but this case wouldn't have that problem.