HTTP 429 error on original image requests on Commons (iOS app by default hiding the Referrer header)
Open, Needs Triage, Public, BUG REPORT

Description

Steps to replicate the issue (include links if applicable):
This may be a bit of an elaborate repro, but here it goes:

  • clone repo @ https://github.com/nylki/CommonsFinder
  • build and run with Xcode on a Mac (in an iOS Simulator)
  • go to search tab
  • search anything, then open an image
  • click the image in the image detail view to load the fullsize image viewer
  • at this point the original image should load, but no longer does (indicated by the "resized" badge)
  • check the network log: go to the home tab, tap the settings/profile icon at the top right, and open "Console" ("Konsole" in the German UI)

What happens?:
Loading the original image file fails reliably with a 429. In a recent reproduction run I did not have many prior network requests that would warrant a 429 (11 API requests and 1 thumbnail request within 2 seconds).
Thumbnail image requests and API requests load fine, and until a few days ago loading the original images worked reliably as well.

Opening the original images in a regular desktop browser works just fine. So it could be some session-based configuration, perhaps some special user-agent handling that changed?

What should have happened instead?:
The original, full-sized image should have been loaded, indicated by the "original" badge in the zoomable viewer.

Other information (browser name/version, screenshots, etc.):
I am the developer of the mentioned iOS app.

Here is a sample of such a failed request (request and response headers) for
https://upload.wikimedia.org/wikipedia/commons/1/18/Berlin_Mitte_June_2023_01.jpg:

Current Request Headers
Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: de-DE,de;q=0.9
User-Agent: CommonsFinder/1 (https://github.com/nylki/CommonsFinder) iOS 26.1.0

Response Headers
Access-Control-Allow-Origin: *
Content-Length: 2111
Content-Type: text/html; charset=utf-8
Date: Mon, 29 Dec 2025 15:34:05 GMT
Server: Varnish
Strict-Transport-Security: max-age=106384710; includeSubDomains; preload
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
nel: { "report_to": "wm_nel", "max_age": 604800, "failure_fraction": 0.05, "success_fraction": 0.0}
report-to: { "group": "wm_nel", "max_age": 604800, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
retry-after: 1000
server-timing: cache;desc="int-front", host;desc="cp3075"
timing-allow-origin: *
x-cache: cp3075 int
x-cache-status: int-front
x-client-ip: [redacted].242.156
x-request-id: 6f1c37dc-e79f-49f6-a940-c63fb2cbbb35

Event Timeline

Nylki updated the task description.

After writing the ticket, I had the idea to test whether it is indeed the User-Agent header that causes the 429, and it appears to make a difference.
I changed it from
CommonsFinder/1 (https://github.com/nylki/CommonsFinder) iOS 26.1.0
to
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:146.0) Gecko/20100101 Firefox/146.0

for testing, and that indeed made it possible to load full images again in the app.

What puzzles me is that when using curl with the original User-Agent in a terminal session, it also works:

curl -v \
        -X GET \
        -H "User-Agent: CommonsFinder/1 (https://github.com/nylki/CommonsFinder) iOS 26.1.0" \
        -H "Accept-Language: de-DE,de;q=0.9" \
        -H "Accept-Encoding: gzip, deflate, br" \
        -H "Accept: */*" \
        "https://upload.wikimedia.org/wikipedia/commons/1/18/Berlin_Mitte_June_2023_01.jpg" > Berlin_Mitte_June_2023_01.jpg

My hunch would be that it is some heuristic against bots based on multiple factors.

Can you please report the full error message you get in the response body? Thanks

Hi @Joe!
The response body is empty when this happens. So unfortunately no extra error messages to report.

I only get the header and status code.
For good measure, here is the raw print of the HTTPURLResponse object (Swift/ObjC) containing the header and status code info. It is essentially the same as the formatted log posted above (produced by the network logging library I use):

<NSHTTPURLResponse: 0x600000379e00> { URL: https://upload.wikimedia.org/wikipedia/commons/1/18/Berlin_Mitte_June_2023_01.jpg } { Status Code: 429, Headers {
    "Access-Control-Allow-Origin" =     (
        "*"
    );
    "Content-Length" =     (
        2111
    );
    "Content-Type" =     (
        "text/html; charset=utf-8"
    );
    Date =     (
        "Mon, 29 Dec 2025 16:53:23 GMT"
    );
    Server =     (
        Varnish
    );
    "Strict-Transport-Security" =     (
        "max-age=106384710; includeSubDomains; preload"
    );
    "access-control-expose-headers" =     (
        "Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache"
    );
    nel =     (
        "{ \"report_to\": \"wm_nel\", \"max_age\": 604800, \"failure_fraction\": 0.05, \"success_fraction\": 0.0}"
    );
    "report-to" =     (
        "{ \"group\": \"wm_nel\", \"max_age\": 604800, \"endpoints\": [{ \"url\": \"https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0\" }] }"
    );
    "retry-after" =     (
        1000
    );
    "server-timing" =     (
        "cache;desc=\"int-front\", host;desc=\"cp3075\""
    );
    "timing-allow-origin" =     (
        "*"
    );
    "x-analytics" =     (
        ""
    );
    "x-cache" =     (
        "cp3075 int"
    );
    "x-cache-status" =     (
        "int-front"
    );
    "x-client-ip" =     (
        redacted
    );
    "x-request-id" =     (
        "b09873b7-4248-4b0c-8203-d24b5ad3e5c9"
    );
} }

The response body is empty when this happens

That seems unlikely to me. Maybe your framework is one of those that doesn't handle response bodies when encountering errors. I remember that even 15 years ago this was already an issue with iOS apps and required custom code to be able to access the data of the error response.

Hi @TheDJ
I'll see if I can create a minimal example project to make it easier to reproduce and debug the issue.

For @Traffic, there have been quite a few 429 reports over the last two weeks: reports about WhatsApp and Discord URL previews not working, but also other reports on Discord and the Commons VP etc. from app builders like CommonsFinder and Commons Gallery.

I originally figured this was simply due to the recent changes, but I've seen so many now that it might be worth checking whether the last set of rate changes makes sense and isn't accidentally applying too widely.

Apparently people are now bypassing the API limits by switching over to /thumb.php requests, so keep an eye on that as well.

@TheDJ the 429s are probably due to the massive scraping activity SREs have had to deal with over the festivities.

@Nylki I can assure you that when you get a 429 response, the response body isn't empty. In fact, it contains information that allows Wikimedia SREs to determine why you're being blocked.

hi @Joe and @TheDJ,
Thanks for sticking with me! I was indeed wrong about the empty body.
I created a minimal iOS sample app without any 3rd-party libraries to reproduce the issue, this time logging a body.
If you like, you can clone and compile yourself: https://github.com/nylki/CommonsImage429ReproApp

Here is the body (only the lower part of the html included here):

<p>Too many requests - please contact noc@wikimedia.org to discuss a less disruptive approach (3bfa1aa)</p>
</div>
</div>
<div class="footer"><p>If you report this error to the Wikimedia System Administrators, please include the details below.</p><p class="text-muted"><code>Request served via cp3075 cp3075, Varnish XID 551224431<br>Upstream caches: cp3075 int<br>Error: 429, Too many requests - please contact noc@wikimedia.org to discuss a less disruptive approach (3bfa1aa) at Mon, 05 Jan 2026 15:15:19 GMT<br><details><summary>Sensitive client information</summary>IP address: redacted</details></code></p>

It only performs a single image request, once, which should not result in a 429.
What I observed: the very first request was successful and returned image data, but on subsequent launches of the app the request always results in a 429.

Also of interest, as observed initially:

  • thumb-urls do load fine each time
  • using a more common User-Agent (e.g. Firefox's) for debugging does not result in a 429

Hi there. Chiming in as another external tool developer who also noticed the 429 issue. I develop https://commons.gallery/, a website funded by Wikimedia CH that allows Wikimedia Commons users to easily create portfolios in a Flickr-like interface. Here is an example of a typical album (this one has 13 photos).

When a user opens an album, they see smaller thumbnailed (hotlinked) versions of images in a grid. When a user clicks on a thumbnail, a lightbox opens showing a larger version of the image (either another thumbnail or full-resolution, depending on the browser width). This is similar to the on-wiki experience with Wikipedia/MediaViewer.

Since starting development, I've been trying to minimize and rate-limit the number of API calls, since there are of course a lot of images displayed each time a user views an album. When a user adds a photo to their album, I cache the image data server-side (dimensions, author and license info, thumbnail URLs, etc.) so I don't have to repeat calls for that data. Only common thumbnail sizes are displayed, so we don't request unusual thumbnail sizes that have never been generated before. On the client side, I also implemented a "progressive loader" that limits the number of thumbnails that can load concurrently.
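For readers curious what such a progressive loader looks like, here is a generic concurrency-limiter sketch. This is illustrative only, not the actual commons.gallery code; the function names and the limit value are made up:

```javascript
// Generic concurrency limiter: runs at most `limit` async tasks at once,
// queueing the rest. Each image load would be wrapped as a task.
function createLimiter(limit) {
  let active = 0;
  const queue = [];

  const next = () => {
    // Start the next queued task only if we have spare capacity.
    if (active >= limit || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task()
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // a slot freed up; pull the next task off the queue
      });
  };

  // Returns a promise that settles when the wrapped task eventually runs.
  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}
```

Usage would be something like `const load = createLimiter(6); thumbnails.forEach(url => load(() => fetchImage(url)));`, so no more than six thumbnail requests are in flight at a time.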

Now, previously, the thumbnails and full-size images we were displaying were upload.wikimedia.org links. As mentioned above, I found that thumb.php serves as a workaround; I recognize it is not ideal, but it currently works fine without hitting any limits in my implementation.

Anyway: around the time this ticket was opened, I also started to notice 429 errors client-side when viewing full-resolution images (as upload.wikimedia.org links) in an album. Interestingly, the smaller thumbnails with a specific pixel size typically loaded fine. It's the full-size image URLs that would fail frequently.

For example, 330px thumbnails would typically load fine without issue (despite there being dozens of these on an album page):
https://upload.wikimedia.org/wikipedia/commons/thumb/c/c5/Voidar_Koldbrann.jpg/330px-Voidar_Koldbrann.jpg

But the full size (seen when opening an image in the lightbox on desktop) would frequently (almost always) get a 429, depending on the client/user:
https://upload.wikimedia.org/wikipedia/commons/c/c5/Voidar_Koldbrann.jpg

What's weird here, from my perspective: if I were loading 50 hotlinked thumbnails at once, a 429 on some of those thumbnails wouldn't surprise me. But those tend to load fine. It's the single one-off full resolution images that fail with 429.

To emphasize, the 429s in this case are all happening client-side (so the User-Agent being sent is the user's own browser's).

Example of request headers for a failed image load:

GET /wikipedia/commons/8/8d/Emperor_at_Midgardsblot_2024.jpg HTTP/2
Host: upload.wikimedia.org
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:146.0) Gecko/20100101 Firefox/146.0
Accept: image/avif,image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br, zstd
Connection: keep-alive
Sec-Fetch-Dest: image
Sec-Fetch-Mode: no-cors
Sec-Fetch-Site: cross-site
DNT: 1
Sec-GPC: 1
Priority: u=5, i
Pragma: no-cache
Cache-Control: no-cache
TE: trailers

And the response headers:

HTTP/2 429 
date: Mon, 05 Jan 2026 16:12:22 GMT
server: Varnish
x-cache: cp2034 int
x-cache-status: int-front
server-timing: cache;desc="int-front", host;desc="cp2034"
strict-transport-security: max-age=106384710; includeSubDomains; preload
report-to: { "group": "wm_nel", "max_age": 604800, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
nel: { "report_to": "wm_nel", "max_age": 604800, "failure_fraction": 0.05, "success_fraction": 0.0}
x-client-ip: <redacted>
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
timing-allow-origin: *
retry-after: 1000
content-type: text/html; charset=utf-8
content-length: 2135
x-request-id: 70e5524e-1311-4104-b0cf-505382165cdf
x-analytics: 
X-Firefox-Spdy: h2

Return body:

...
<h1>Error</h1>

<p>Too many requests - please contact noc@wikimedia.org to discuss a less disruptive approach (3bfa1aa)</p>
</div>
</div>
<div class="footer"><p>If you report this error to the Wikimedia System Administrators, please include the details below.</p><p class="text-muted"><code>Request served via cp2034 cp2034, Varnish XID 810352899<br>Upstream caches: cp2034 int<br>Error: 429, Too many requests - please contact noc@wikimedia.org to discuss a less disruptive approach (3bfa1aa) at Mon, 05 Jan 2026 16:46:32 GMT<br><details><summary>Sensitive client information</summary>IP address: redacted</details></code></p>
</div>
</html>

Also noting: so far it seems like I run into 429s on Firefox but not Chrome.

Finally, thanks for everything you all do to keep things running smoothly. I know media is a particular pain point with AI scrapers hammering them.

Did a bit more experimenting:

If I go to https://commons.gallery in my local Firefox and run this in the console:

fetch("https://upload.wikimedia.org/wikipedia/commons/8/8d/Emperor_at_Midgardsblot_2024.jpg")
  .then(r => r.text())
  .then(console.log);

I get a 429 back.

If I do that on any other non-Wikimedia website in the same browser, it works fine.
But if I do that on my local Chrome on commons.gallery, it works fine.

Just speculating of course, but maybe a mix of the referrer (commons.gallery) + my user agent (Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:146.0) Gecko/20100101 Firefox/146.0) setting off alarm bells?

Ah! I resolved the 429 issue on my end. My webapp uses Helmet.js, which defaults to hiding the Referer header when loading external files. I started allowing the Referer header to be sent when loading upload.wikimedia.org images and images are loading fine again (and I'm now back to using upload.wikimedia.org instead of thumb.php). I'm guessing with whatever recent changes were done to keep AI scrapers at bay, the Referer header became a more important heuristic.

That also explains why an iOS app like CommonsFinder would run into this.
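For reference, the Helmet.js change described above might look roughly like this. This is a sketch, not the actual commons.gallery code, and the chosen policy is an assumption; any Referrer-Policy that sends at least the origin cross-site would have the same effect:

```javascript
// Express app setup (sketch). Helmet's default Referrer-Policy is
// "no-referrer", which makes the browser strip the Referer header from
// all subresource requests, including hotlinked upload.wikimedia.org
// images. A laxer policy lets the origin through on cross-site loads.
const express = require("express");
const helmet = require("helmet");

const app = express();
app.use(
  helmet({
    // Assumed policy choice for illustration; "origin-when-cross-origin",
    // or a per-<img> referrerpolicy attribute, would also work.
    referrerPolicy: { policy: "strict-origin-when-cross-origin" },
  })
);
```

Alternatively, the policy can be relaxed only for the image tags in question via `<img referrerpolicy="strict-origin-when-cross-origin" src="…">`, leaving the stricter site-wide default in place.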

Aklapper renamed this task from HTTP 429 error on original image requests on Commons since a few days (iOS app) to HTTP 429 error on original image requests on Commons (iOS app by default hiding the Referrer header).Jan 6 2026, 11:01 AM

Thanks for experimenting, good observation! Just tested: setting a Referer does indeed resolve the issue. :)

Now, my two biggest follow-up questions would be:

  1. Is the required referrer intended, or is this still a side effect/bug? Related: what other changes might be necessary to signal to the backend that a trustworthy client is making the request?
  2. What would be the preferred Referer value for mobile apps when requesting images (or any other request, for that matter)? To experiment, I just used the description page URL of the image (e.g. "Referer": "https://commons.wikimedia.org/wiki/File:Berlin_Mitte_June_2023_01.jpg"), which worked. Or should it rather be some custom app URL scheme (e.g. CommonsFinder://File:Berlin_Mitte_June_2023_01.jpg)?
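As an aside, the description-page Referer used in that experiment can be derived mechanically from the file name. The helper below is hypothetical (it merely mirrors MediaWiki's space-to-underscore title normalization); whether this is the preferred value for apps is exactly the open question above:

```javascript
// Hypothetical helper: build the Commons file-description URL that was
// used as an experimental Referer value. Not an official recommendation.
function commonsFilePageURL(fileName) {
  // MediaWiki page titles use underscores in place of spaces.
  const title = fileName.replace(/ /g, "_");
  return "https://commons.wikimedia.org/wiki/File:" + encodeURIComponent(title);
}
```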

@SuperHamster Btw. I noticed that since about yesterday it appears to work again on my setup without setting a referrer.

It would be interesting to hear from involved teams if there was a new relevant changeset pushed; and also to get some additional info on how best to proceed (see above) :)

EDIT: Hm, not anymore. Not sure if I looked at the wrong environment or if it was indeed a backend change.

I ran into this today when right-clicking to download an original with "Download Linked File As". It seems that doesn't send a referrer either. Who knew.

It does work when you click the original link and the browser opens or downloads it, but that's less convenient if you already know you want to download the file instead of (potentially) viewing it.

I corresponded with the e-mail address listed on the error page, and I received a response from Giuseppe Lavagetto that this was a bug. It was fixed immediately, on 12 January 2026. If this issue can no longer be reproduced, it may have been fixed at that time. I would tag Giuseppe but I do not know their username here. Possibly @Joe

Hi @Jonesey95 !
Do you have a link to the commit or ticket (if it's public)? :)

I am unfortunately still experiencing the described issue on original image files when a Referer header is missing (as of 2026-01-19, yesterday).

It is not consistently visible, though: sometimes all requested images fail to load with a 429, sometimes everything just works.
Contrary to a bot that usually runs on a single machine, this is an app running on many users' devices, so the IP addresses behind the same User-Agent differ between requests and vary from day to day. Could this be a factor, and somehow explain why the behaviour is inconsistent from day to day?
(E.g. on some days the IPs for the User-Agent stay the same due to fewer test users, while on others there might be more activity.)

However, as we also found out before 2026-01-12: when a Referer is present, the issue does not happen, or at least does not surface to the same extent.
Does the fix only work in conjunction with a Referer header present in the HTTP request?

thanks!

I was not sent a link to the commit. I had written an e-mail to noc@wikimedia.org, the address shown in the error message. I had no expectation that I would get anything useful in return, but my problem was acknowledged and fixed right away.

Btw. I released a new version of CommonsFinder that sends Referer headers with all network requests: https://github.com/nylki/CommonsFinder/commit/3c1727a7768b9dc3bf76d4982219f550b649832a

Will observe whether the behaviour improves. I would still love to get some more insight into the topic and what the stance is on referrers for desktop/mobile apps in general :) If I am not mistaken, the Wikipedia app does not send Referer headers.