
commons fails to purge large djvu files
Closed, ResolvedPublic


For a few days, attempts to purge large djvu files have been failing, so the text layer of these files is not accessible. After a successful purge, creating a page should show the text layer here:,_t4.djvu

Version: 1.16.x
Severity: normal



Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 10:50 PM
bzimport set Reference to bz21809.

thomasV1 wrote:

The bug can be seen for the following files:,_t4.djvu

The first two of them were uploaded recently, and their djvu text layer has not been successfully extracted
because of this bug; or maybe it is the text layer extraction that causes the bug.

The last file was uploaded a long time ago, and at that time the file could be purged, so
the djvu text was successfully extracted; it is thus still available in the metadata.

I tested the first file on my machine with a recent MediaWiki install, and it worked fine:
the file can be purged and the text layer is correctly extracted.

lars wrote:

When I try to "purge" the large djvu file from Commons, I get an HTTP 500 Internal Server Error response after exactly 30 seconds. Why is purging taking so long? It should just remove old stuff (supposedly a quick operation) and then schedule a queued job for reindexing (a slower operation, depending on the job queue length).
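For reference, the purge link on a file page goes through the standard MediaWiki action API (`action=purge`). A minimal sketch of the request it issues, assuming the usual Commons endpoint and a hypothetical file title:

```python
from urllib.parse import urlencode

# Endpoint for Wikimedia Commons; the purge module itself (action=purge)
# is standard MediaWiki.
API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"

def build_purge_request(title: str) -> tuple[str, str]:
    """Return (url, POST body) for a purge of the given page title."""
    body = urlencode({
        "action": "purge",   # drop the cached rendering and re-parse the page
        "titles": title,     # e.g. a File: page whose metadata should refresh
        "format": "json",
    })
    return API_ENDPOINT, body

url, body = build_purge_request("File:Example.djvu")
print(body)  # action=purge&titles=File%3AExample.djvu&format=json
```

It is this synchronous request that is hitting the 30-second limit and returning the 500 instead of completing.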

thomasV1 wrote:

Purge takes time because of the djvu text layer extraction.

The bug should be fixed in r61258.

thomasV1 wrote:

Reopening this bug because the fix is not live.

Bryan.TongMinh wrote:

The fix has been deployed, but purging still doesn't work.

Purging works fine now, Bryan. Which File: fails to purge for you?

Bryan.TongMinh wrote:

I got a 403 "Wikimedia has an error" page trying to purge. Presumably it times out, because the page takes a long while to load.