
Proofread Page extension on Wikisource is displaying wrong pages; purge on Commons file fails
Open, Needs Triage · Public

Description

There was a missing page in this DJVU file, which I was in the process of transcribing on English Wikisource. So I repaired the file (using an image from an alternate scan of the same book) and uploaded the new version at Commons. Note this is a rather large file, 978 pages and 32 MB, which may be related to the problem.
https://commons.wikimedia.org/wiki/File:Portland,_Oregon,_its_History_and_Builders_volume_1.djvu
https://commons.wikimedia.org/w/index.php?title=File:Portland,_Oregon,_its_History_and_Builders_volume_1.djvu&page=618
https://commons.wikimedia.org/w/index.php?title=File:Portland,_Oregon,_its_History_and_Builders_volume_1.djvu&page=619

These are the two pages that are displaying incorrectly. The images should read "449" and "450" in the header of each page, respectively. But as of this writing (weeks after upload), they show "450" and <blank>, which are the previews from the earlier version of the Commons file.
https://en.wikisource.org/wiki/Page:Portland,_Oregon,_its_History_and_Builders_volume_1.djvu/618
https://en.wikisource.org/wiki/Page:Portland,_Oregon,_its_History_and_Builders_volume_1.djvu/619

I have purged all relevant pages at Wikisource, and tried to purge the main file at Commons, but after a minute or two I got an error "purge failed" (see attachment).

I believe the file on Commons needs to be purged in order to fix the problem with the preview in the Proofread Page view.

See relevant discussion here: https://en.wikisource.org/wiki/Wikisource:Scriptorium/Help/Archives/2019#How_to_update_the_display_of_a_DJVU_file?

{F28150036}

Event Timeline

Restricted Application added a subscriber: Aklapper. · Feb 7 2019, 9:51 PM
Restricted Application added a project: Multimedia. · Jun 14 2019, 4:56 PM
Xover added a subscriber: Xover. · Jun 14 2019, 4:59 PM

This looks to me like MediaWiki-DjVu is choking on the new file and thus never regenerating the thumbnails / page images. Not obvious that ProofreadPage is involved in this at all (except that this is where it becomes most visible). I downloaded the actual .DjVu from Commons and confirmed that the file indeed contains the fixed pages.

Xover updated the task description. · Jun 14 2019, 5:03 PM
This comment was removed by ShakespeareFan00.

What @Xover said makes sense -- it's not a problem with ProofreadPage, but a problem made *visible* by ProofreadPage. In that case, my initial title is incorrect, and may be preventing the right people from seeing this. Does anybody know what part of the software is supposed to generate thumbnails? Can somebody retitle this or recategorize it in a way that makes more sense? @Aklapper ?

Aklapper updated the task description. · Jul 3 2019, 9:55 PM
Aklapper added a project: Thumbor.

I'm not sure if this is also handled by Thumbor - someone please correct me if I'm wrong.

For the record, I cannot access https://phabricator.wikimedia.org/F28150036 : "You do not have permission to view this object. @Peteforsyth can take this action."

Thank you @Aklapper ! I did not realize it was possible to create private image files here. I believe I have now updated the image to allow all to view. Please let me know if you still have problems (and I will watch out for this in the future).

@Peteforsyth: Thanks, I can see it now :)
And yes, image file permissions are a bit complicated, see https://www.mediawiki.org/wiki/Phabricator/Help#Uploading_file_attachments

Ankry added a subscriber: Ankry. · Jul 4 2019, 8:16 PM

@Peteforsyth, I quite often encounter this problem when uploading a new version of a multi-page file with thousands of thumbnails (generally files with >500 pages). See e.g. T206190 or T214759.
I know three workarounds for this problem on Wikisource (once it has already happened):

  1. change the scan width in the index after uploading a new version (if the pages are 1px wider/narrower, new thumbnails must be generated)
  2. use JS to replace the thumbnails for specific pages with thumbnails under other names; see the example in pl.ws MediaWiki:Common.js:

https://pl.wikisource.org/w/index.php?title=MediaWiki:Common.js&diff=prev&oldid=2188790
(useful for few already existing pages)

  3. upload the file locally

From my experience, the outdated thumbnails seem to disappear after a few weeks to half a year.

To avoid this problem, I often try to purge the file (several times if it fails) before uploading a new version. If the purge fails, you can be almost 100% sure you will encounter the problem.
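
The purge-and-retry routine described above could be scripted roughly as follows. This is only a minimal sketch, assuming the standard MediaWiki `action=purge` API module on Commons, sent as a POST; the helper names (`build_purge_request`, `purge_with_retries`), the retry count, the delay, and the User-Agent string are made up for illustration. A purge that times out on the server side may still fail no matter how often it is retried.

```python
import json
import time
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"
FILE_TITLE = "File:Portland,_Oregon,_its_History_and_Builders_volume_1.djvu"


def build_purge_request(title, api_url=API_URL):
    """Build a POST request for the MediaWiki action=purge API module."""
    params = {"action": "purge", "titles": title, "format": "json"}
    data = urllib.parse.urlencode(params).encode("utf-8")
    # Placeholder User-Agent (assumption); use one that identifies you.
    headers = {"User-Agent": "purge-sketch/0.1 (example only)"}
    return urllib.request.Request(api_url, data=data, headers=headers, method="POST")


def _send(request):
    """Send the request and decode the JSON reply (long timeout for big files)."""
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


def purge_with_retries(title, attempts=3, delay_seconds=30, send=None, sleep=time.sleep):
    """Try the purge up to `attempts` times; return True on the first success."""
    send = send or _send
    for attempt in range(1, attempts + 1):
        try:
            reply = send(build_purge_request(title))
            pages = reply.get("purge", [])
            # Assumption: in the default JSON format a purged page carries a "purged" key.
            if pages and all("purged" in page for page in pages):
                return True
        except OSError:
            pass  # network errors and timeouts count as a failed attempt
        if attempt < attempts:
            sleep(delay_seconds)
    return False
```

Calling `purge_with_retries(FILE_TITLE)` before uploading a new version gives an early warning: if it returns `False`, the stale-thumbnail problem is likely to show up.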

Gilles added a subscriber: Gilles.

This seems to be an issue with the thumbnail purge, possibly due to the large number of thumbnails associated with a file that has hundreds of pages. Thumbor, when requested with new sizes (i.e. when the old cached versions aren't served), renders the expected pages.
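
To check this for a given page, one can request a thumbnail at a width that has probably never been cached. The sketch below only builds such URLs; it assumes the usual hashed upload layout on upload.wikimedia.org (first one and two hex digits of the MD5 of the file name) and the `page<N>-<W>px-` prefix used for multi-page files. The function name `djvu_page_thumb_url` is hypothetical.

```python
import hashlib
import urllib.parse

# Assumed base path for Commons thumbnails.
THUMB_BASE = "https://upload.wikimedia.org/wikipedia/commons/thumb"


def djvu_page_thumb_url(file_name, page, width):
    """Build the thumbnail URL for one page of a multi-page file at a given width."""
    name = file_name.replace(" ", "_")
    # The hashed directory is derived from the MD5 of the underscored file name.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    quoted = urllib.parse.quote(name)
    return (
        f"{THUMB_BASE}/{digest[0]}/{digest[:2]}/{quoted}"
        f"/page{page}-{width}px-{quoted}.jpg"
    )


name = "Portland,_Oregon,_its_History_and_Builders_volume_1.djvu"
# A width one pixel off from a common size is unlikely to have a cached copy.
for page in (618, 619):
    print(djvu_page_thumb_url(name, page, 1001))
```

Opening the printed URLs in a browser and comparing them with the same pages at a commonly used width shows whether only the cached sizes are stale.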

I tried purging the file just now with action=purge and got a "maintenance" error, which further suggests a purge problem.