List of steps to reproduce (step by step, including full links if applicable):
- Go to https://commons.wikimedia.org/wiki/File:Noted_Negro_Women_(1893)_HathiTrust_scan.pdf
- Alternatively, go directly to https://upload.wikimedia.org/wikipedia/commons/thumb/c/c6/Noted_Negro_Women_%281893%29_HathiTrust_scan.pdf/page1-463px-Noted_Negro_Women_%281893%29_HathiTrust_scan.pdf.jpg
What happens?:
- On the file page, the page thumbnails fail to load
- Requesting the thumbnail URL directly shows that the request fails with an HTTP 429 error
What should have happened instead?:
The thumbnails should render, given that the PDF opens fine in both Firefox and Evince.
Software version (if not a Wikimedia wiki), browser information, screenshots, other information, etc:
There are a number of other potentially relevant bugs that share the same theme of "PDFs don't render properly and spit out 429s". T188885 is the only one I can find that's still open. I'm pretty sure this isn't a PDF/A file, however.
You'll see two revisions of the file, both with the same issue. When I ran qpdf --check on the first revision, it warned that the "reported number of objects (11411) is not one plus the highest object number (1160137)". After I ran qpdf --linearize in an attempt to fix that issue, the warning went away; the output of that process is the current file revision.
Running the following command (which, as far as I can tell, is the command Thumbor runs) on my local machine, with Ghostscript version 9.53.3, successfully produces a JPG with no visible errors:
gs -sDEVICE=jpeg -dJPEG=90 -sstdout=%stderr -sOutputFile=%stdout -dFirstPage=18 -dLastPage=18 -r150 -dBATCH -dNOPAUSE -dSAFER -q -f"Noted_Negro_Women_(1893)_HathiTrust_scan.pdf" > out.jpg
The command is taken from the current code, after the fix for T236240.
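To make it easier to try other pages locally, the Ghostscript command above could be wrapped in a small shell function. This is only a sketch: render_page and the GS override variable are names made up for illustration, and the flags are copied from the command quoted above rather than checked against Thumbor's source.

```shell
# Hypothetical helper (not part of Thumbor): render one page of a PDF to JPEG
# using the same flags as the command quoted in this report.
# Usage: render_page <pdf> <page> <output.jpg>
# GS is an assumed override for the Ghostscript binary; it defaults to "gs".
render_page() {
    pdf=$1
    page=$2
    out=$3
    "${GS:-gs}" -sDEVICE=jpeg -dJPEG=90 -sstdout=%stderr -sOutputFile=%stdout \
        -dFirstPage="$page" -dLastPage="$page" \
        -r150 -dBATCH -dNOPAUSE -dSAFER -q \
        -f"$pdf" > "$out"
}
```

For example, render_page "Noted_Negro_Women_(1893)_HathiTrust_scan.pdf" 18 out.jpg should be equivalent to the command above, and the function's exit status is whatever Ghostscript returns, so it can be called in a loop over several page numbers to look for a page that fails.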