
Error when using Transkribus engine in 1.3.x
Closed, Resolved (Public)

Event Timeline


Ah, I was getting a timeout using Transkribus as well yesterday; I thought it might be a transient issue.

When it crashes, is it quick to crash or does it take some tens of seconds?

It looks like sometimes jobs can take a very long time:

[Screenshot: Transkribus.png, taken 2023-09-12 at 15:33:04]

For instance, you can see here that a job created at 02:55 PM (about half an hour before this screenshot was taken) still has the status "created" and is number 1412 in the queue.

The only failures I can see in the OCR Tool's error log are related to T345035, which appears to be a separate issue.

The above job 6158378 is now 1256 in queue (up ~150 in five minutes, at which rate it might be processed about 45 minutes from now; let's see if it is).
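
The estimate above is simple rate arithmetic, and it can be sketched as below. This is only a back-of-the-envelope illustration: the function name is made up for this comment, and it assumes the queue keeps moving at the rate observed over the last five minutes, which may well not hold.

```python
import math


def estimate_wait_minutes(position, advanced, elapsed_minutes):
    """Estimate minutes until a job at `position` reaches the front of the queue.

    Assumes (hypothetically) that the queue keeps advancing at the observed
    rate of `advanced` places per `elapsed_minutes`.
    """
    if advanced <= 0 or elapsed_minutes <= 0:
        raise ValueError("need a positive observed rate")
    rate = advanced / elapsed_minutes  # places per minute
    return position / rate


# Figures from the comment above: position 1256, moved ~150 places in 5 minutes.
eta = estimate_wait_minutes(1256, 150, 5)
print(f"about {math.ceil(eta)} minutes")
```

That comes out a little over 40 minutes, in line with the rough "45 minutes" guess.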

I wonder if the error is that at certain times the Transkribus service is overloaded? As far as I can see there's no API for showing the job queue, or else we could make this delay apparent to Wikisource users.

"When it crashes, is it quick to crash or does it take some tens of seconds?"

On Friday and during the weekend it would take minutes and the server would not respond if I tried to close and reopen the web interface. Since 1.3.1 it takes maybe 10 seconds and then I get a 500 response.

It could of course just be a coincidence that this happened around the time of the 1.3.0 release. Maybe a bunch of European users started new projects after the summer holidays and overloaded the app? I have used Transkribus successfully on dozens of pages since Wikimania, and it would always reliably transcribe the page within a few seconds.

"certain times the Transkribus service is overloaded"

I also believe this is the case, as requests seem to be taking a long time to complete. They're probably just stuck in the queue?

I got this error just now:

Message:	
Uncaught PHP Exception Symfony\Component\HttpClient\Exception\TransportException: "Idle timeout reached for "https://transkribus.eu/processing/v1/processes"." at /var/www/tool/vendor/symfony/http-client/Chunk/ErrorChunk.php line 65

Time:	
2023-09-12T14:24:18.428545+00:00

Channel:	
request

Context:	
exception:	
{
    "class": "Symfony\\Component\\HttpClient\\Exception\\TransportException",
    "message": "Idle timeout reached for \"https://transkribus.eu/processing/v1/processes\".",
    "code": 0,
    "file": "/var/www/tool/vendor/symfony/http-client/Chunk/ErrorChunk.php:65",
    "trace": [
        "/var/www/tool/vendor/symfony/http-client/Response/AsyncResponse.php:69",
        "/var/www/tool/vendor/symfony/http-client/Response/CommonResponseTrait.php:152",
        "/var/www/tool/vendor/symfony/http-client/Response/AsyncResponse.php:96",
        "/var/www/tool/src/Engine/TranskribusClient.php:174",
        "/var/www/tool/src/Engine/TranskribusClient.php:83",
        "/var/www/tool/src/Engine/TranskribusEngine.php:145",
        "/var/www/tool/src/Controller/OcrController.php:361",
        "/var/www/tool/vendor/symfony/cache/LockRegistry.php:105",
        "/var/www/tool/vendor/symfony/cache/Traits/ContractsTrait.php:88",
        "/var/www/tool/vendor/symfony/cache-contracts/CacheTrait.php:72",
        "/var/www/tool/vendor/symfony/cache/Traits/ContractsTrait.php:95",
        "/var/www/tool/vendor/symfony/cache-contracts/CacheTrait.php:35",
        "/var/www/tool/src/Controller/OcrController.php:363",
        "/var/www/tool/src/Controller/OcrController.php:260",
        "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:163",
        "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:75",
        "/var/www/tool/vendor/symfony/http-kernel/Kernel.php:202",
        "/var/www/tool/public/index.php:21"
    ],
    "previous": {
        "class": "Symfony\\Component\\HttpClient\\Exception\\TransportException",
        "message": "Idle timeout reached for \"https://transkribus.eu/processing/v1/processes\".",
        "code": 0,
        "file": "/var/www/tool/vendor/symfony/http-client/Response/AsyncResponse.php:68",
        "trace": [
            "/var/www/tool/vendor/symfony/http-client/Response/CommonResponseTrait.php:152",
            "/var/www/tool/vendor/symfony/http-client/Response/AsyncResponse.php:96",
            "/var/www/tool/src/Engine/TranskribusClient.php:174",
            "/var/www/tool/src/Engine/TranskribusClient.php:83",
            "/var/www/tool/src/Engine/TranskribusEngine.php:145",
            "/var/www/tool/src/Controller/OcrController.php:361",
            "/var/www/tool/vendor/symfony/cache/LockRegistry.php:105",
            "/var/www/tool/vendor/symfony/cache/Traits/ContractsTrait.php:88",
            "/var/www/tool/vendor/symfony/cache-contracts/CacheTrait.php:72",
            "/var/www/tool/vendor/symfony/cache/Traits/ContractsTrait.php:95",
            "/var/www/tool/vendor/symfony/cache-contracts/CacheTrait.php:35",
            "/var/www/tool/src/Controller/OcrController.php:363",
            "/var/www/tool/src/Controller/OcrController.php:260",
            "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:163",
            "/var/www/tool/vendor/symfony/http-kernel/HttpKernel.php:75",
            "/var/www/tool/vendor/symfony/http-kernel/Kernel.php:202",
            "/var/www/tool/public/index.php:21"
        ]
    }
}

Extra:	

host:	
ocr.wmcloud.org

uri:	
http://ocr.wmcloud.org/api.php?engine=transkribus&image=https%3A%2F%2Fupload.wikimedia.org%2Fwikipedia%2Fcommons%2Fthumb%2F7%2F7c%2FLetter_from_Anne_Warren_Weston_to_Sarah_Moore_Grimke_Wednesday_April_4_1838.pdf%2Fpage3-2048px-Letter_from_Anne_Warren_Weston_to_Sarah_Moore_Grimke_Wednesday_April_4_1838.pdf.jpg&langs%5B0%5D=en-b2022&line_id=&uselang=en
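
For anyone reading the log, the percent-encoded `uri` above is easier to follow once decoded. A small sketch using only the Python standard library (this is just a way of reading the log entry, not part of the OCR tool itself):

```python
from urllib.parse import urlsplit, parse_qs

# The uri value copied from the log entry above.
uri = (
    "http://ocr.wmcloud.org/api.php?engine=transkribus"
    "&image=https%3A%2F%2Fupload.wikimedia.org%2Fwikipedia%2Fcommons%2Fthumb%2F7%2F7c%2F"
    "Letter_from_Anne_Warren_Weston_to_Sarah_Moore_Grimke_Wednesday_April_4_1838.pdf%2F"
    "page3-2048px-Letter_from_Anne_Warren_Weston_to_Sarah_Moore_Grimke_Wednesday_April_4_1838.pdf.jpg"
    "&langs%5B0%5D=en-b2022&line_id=&uselang=en"
)

# keep_blank_values=True keeps the empty line_id parameter instead of dropping it.
params = parse_qs(urlsplit(uri).query, keep_blank_values=True)
for name, values in params.items():
    print(f"{name} = {values[0]}")
```

So the failing request was for page 3 of the letter PDF, with the `transkribus` engine, `langs[0]` set to `en-b2022`, and an empty `line_id`.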

I think the job queue is currently too long for this to work as an interactive process. I've emailed Transkribus to ask if anything's changed. If nothing has, and this is how it's going to be from now on, then we need to change how jobs are submitted and monitored (e.g. make it easier to submit a whole set of pages at once, from the Index page?).
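
To illustrate what "submitted and monitored" could mean in practice, here is a rough submit-then-poll sketch. It is written in Python purely for illustration and is not how the OCR tool (which is PHP) works. The `get_status` callback, the `FINISHED`/`FAILED`/`TIMED_OUT` status names, and the default intervals are all assumptions for the example; only the "created" status is something actually seen in the queue.

```python
import time


def wait_for_job(get_status, interval=5.0, max_wait=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the job ends or max_wait seconds have passed.

    get_status is a hypothetical callable returning a status string.
    Returns the final status, or "TIMED_OUT" if the job is still pending.
    """
    deadline = clock() + max_wait
    while True:
        status = get_status()
        if status in ("FINISHED", "FAILED"):
            return status
        if clock() >= deadline:
            return "TIMED_OUT"
        sleep(interval)  # wait before asking again, instead of holding a request open


# Simulated job that sits in the queue for a while before completing.
statuses = iter(["CREATED", "CREATED", "RUNNING", "FINISHED"])
result = wait_for_job(lambda: next(statuses), interval=0, sleep=lambda s: None)
print(result)
```

The point of the design is that the user's request returns straight away and the wait happens in the background, so a long queue shows up as "still pending" rather than as an idle timeout and a 500.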

Everything seems to be working smoothly for the last couple of days, so maybe indeed it was a transient issue. Don't know if you heard back from Transkribus, but hopefully these types of interruptions do not happen too often. Feel free to consider this task resolved.

Samwilson claimed this task.

Yep, looks like it's better now. We heard back from them and they said that it was because of another client using lots of resources and that they'd redistribute the load.

We've also now got a job queue page that's visible to all Wikisource users, so you can keep an eye on that and see when things are slow: https://ocr.wmcloud.org/transkribus