
Maximum execution time of 30 seconds exceeded
Closed, Resolved · Public · 5 Estimated Story Points · Bug Report

Description

Since the upgrade to 2.2.0, quite a lot of books are failing with this:

Fatal Error: Maximum execution time of 30 seconds exceeded

exception: 	
{
    "class": "Symfony\\Component\\ErrorHandler\\Error\\FatalError",
    "message": "Error: Maximum execution time of 30 seconds exceeded",
    "code": 0,
    "file": "/var/www/tool/src/PageParser.php:295"
}

http://wsexport.wmflabs.org/tool/book.php?format=epub&lang=it&page=Divina_Commedia

Event Timeline

Restricted Application added a subscriber: Aklapper.
Samwilson changed the subtype of this task from "Task" to "Bug Report".

@Samwilson Estimate for this ticket? Thanks in advance.

Samwilson set the point value for this task to 5. (Jan 21 2021, 12:31 AM)

PR: https://github.com/wsexport/tool/pull/316

Still needs a bit of work though.

Patch is merged and released to prod in 2.2.2.

We're still getting quite a few of these timeouts, but from what I've looked at it seems that they might be related to really large books, such as https://he.wikisource.org/w/index.php?title=%D7%A2%D7%A5_%D7%97%D7%99%D7%99%D7%9D&action=info (506 subpages).

I wonder if we should look at doing T222690 as a sort-of fix.

I think T222690 would be valuable regardless since we know for a fact that some books are just not gonna be exported. We could also increase the time limit to be a little larger.
I'm a bit confused about why some books take a lot longer to generate (>30sec) but they don't time out. Is it because symfony process changes the max_execution_time?

I think maybe it's that the time spent in external processes (like Calibre) doesn't count towards the PHP execution time, but the time spent parsing and building epubs does.
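
For reference, the PHP manual states that on non-Windows systems max_execution_time only counts time spent executing the script itself, and that time spent in system() calls, stream operations and similar is not included. The sketch below is a rough Python analogy of that distinction, not code from the tool: it compares wall-clock time with the CPU time of the calling process while a child process sleeps. The function name timed_external_call is made up for illustration.

```python
import subprocess
import sys
import time


def timed_external_call(seconds):
    """Run a child process that sleeps, and return (wall_time, parent_cpu_time)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of this process only, children excluded
    subprocess.run(
        [sys.executable, "-c", f"import time; time.sleep({seconds})"],
        check=True,
    )
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


wall, cpu = timed_external_call(0.3)
# The wall-clock time includes the child's sleep; the parent's CPU time barely moves.
print(f"wall={wall:.2f}s parent_cpu={cpu:.2f}s")
```

If PHP's limit behaves like the CPU-time figure here, a long Calibre run would not trip it, while parsing done inside PHP would.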

Let's increase max_execution_time to 60 seconds.

@Samwilson We still seem to be getting Maximum execution time of 30 seconds exceeded errors. Should this be 60 instead?

Oops, I forgot to actually do it. Done now (on both test and prod).

The ebook from the description now exports successfully on production.

I took a sample of 11 ebooks which have shown this error on production over the last week. 4 of them exported successfully on production. 5 failed with Maximum execution time of 60 seconds exceeded. The other 2 failed with other errors.

@ifried I am not sure what to do with this. I don't know if there is anything more we can do.

I think T222690 would be valuable regardless since we know for a fact that some books are just not gonna be exported...

Some of the successful ebooks:

Some of the ebooks still failing with Maximum execution time of 60 seconds exceeded error:

Test environment: https://ws-export.wmcloud.org/, WS Export version 2.3.0.

We're going to look into optimising the ID de-duplication process. It's currently looking at all the Parsoid-style IDs (they look something like ^mw[A-Z]+) as well as the actually useful ones, and we can probably do better.
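
A minimal sketch of the idea, in Python rather than the tool's own code: skip the Parsoid-style IDs before de-duplicating, so only the potentially useful ones are processed. The pattern is the approximate one quoted above and may not cover every Parsoid ID; the function name ids_worth_deduplicating is hypothetical.

```python
import re

# Approximate pattern from the comment above; the real shape of Parsoid IDs is an assumption.
PARSOID_ID = re.compile(r"^mw[A-Z]+")


def ids_worth_deduplicating(ids):
    """Return the IDs that do not look auto-generated by Parsoid, preserving order."""
    return [id_ for id_ in ids if not PARSOID_ID.match(id_)]


print(ids_worth_deduplicating(["mwAQ", "memo", "mwBg", "chapter-1", "memo"]))
# prints ['memo', 'chapter-1', 'memo']
```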

  1. With https://de.wikisource.org/wiki/Die_Gartenlaube the issue is that there's no ws-summary element on that page, and so it falls back to trying to figure out all subpages. Unfortunately, the system for finding subpages was checking for any link that contains the page title (e.g. .*Foo.*), rather than only links that have it as the top level (e.g. ^Foo/.*); this meant it matched far too much.

    Fix for this in this patch: https://github.com/wikimedia/ws-export/pull/333
  1. https://fr.wikisource.org/wiki/La_Chanson_des_quatre_fils_Aymon is because of this page which is nearly 4 MB and contains over 4000 occurrences of the ID memo, all of which have to be modified to be unique. This comes from {{NumVers}}, which I suspect shouldn't be using an ID at all. I've left a note on the French Scriptorium.

    Perhaps we should show an error if more than say 1000 IDs are found in any given page? Although I'm not sure how common this is.
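
To make the second point concrete, here is a rough Python sketch (not the tool's implementation) of renaming repeated IDs so each is unique, with the suggested cap on IDs per page. The limit of 1000 comes from the suggestion above; the names make_ids_unique and TooManyIdsError and the "-2", "-3" suffix scheme are assumptions for illustration.

```python
from collections import Counter

MAX_IDS_PER_PAGE = 1000  # hypothetical limit, taken from the suggestion above


class TooManyIdsError(ValueError):
    """Raised when a page has more IDs than we are willing to rewrite."""


def make_ids_unique(ids, max_ids=MAX_IDS_PER_PAGE):
    """Return a list in which repeated IDs get a numeric suffix so all are unique."""
    if len(ids) > max_ids:
        raise TooManyIdsError(f"{len(ids)} IDs found; limit is {max_ids}")
    seen = Counter()
    taken = set(ids)  # names already in use, including the originals
    result = []
    for id_ in ids:
        seen[id_] += 1
        if seen[id_] == 1:
            result.append(id_)  # first occurrence keeps its name
            continue
        n = seen[id_]
        candidate = f"{id_}-{n}"
        while candidate in taken:  # avoid colliding with an existing ID
            n += 1
            candidate = f"{id_}-{n}"
        taken.add(candidate)
        result.append(candidate)
    return result


print(make_ids_unique(["memo", "a", "memo", "memo"]))
# prints ['memo', 'a', 'memo-2', 'memo-3']
```

With over 4000 occurrences of the same ID, a page like the one above would hit the cap and fail fast instead of timing out.
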
  1. https://fr.wikisource.org/wiki/La_Chanson_des_quatre_fils_Aymon is because of this page which is nearly 4 MB and contains over 4000 occurrences of the ID memo, all of which have to be modified to be unique. This comes from {{NumVers}}, which I suspect shouldn't be using an ID at all. I've left a note on the French Scriptorium.

This looks to have been fixed now.

MusikAnimal subscribed.
In T272119#6842895, Samwilson wrote:
  1. With https://de.wikisource.org/wiki/Die_Gartenlaube the issue is that there's no ws-summary element on that page, and so it falls back to trying to figure out all subpages. Unfortunately, the system for finding subpages was checking for any link that contains the page title (e.g. .*Foo.*), rather than only links that have it as the top level (e.g. ^Foo/.*); this meant it matched far too much.

    Fix for this in this patch: https://github.com/wikimedia/ws-export/pull/333

This part has been merged.
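
For anyone following along, the difference between the two patterns in the quoted comment can be sketched like this. It is a Python illustration under assumed names (loose_match, subpage_match, and the sample links are made up), not the code from the pull request.

```python
import re


def loose_match(title, link):
    """Old behaviour as described: any link that contains the title (.*Foo.*)."""
    return re.search(re.escape(title), link) is not None


def subpage_match(title, link):
    """Stricter behaviour: only links with the title as top level (^Foo/.*)."""
    return re.match(re.escape(title) + "/", link) is not None


links = ["Foo/Chapter 1", "Foo/Chapter 2", "Author:Foo", "List of Foo editions", "Foobar/Intro"]

print([l for l in links if loose_match("Foo", l)])
# prints all five links
print([l for l in links if subpage_match("Foo", l)])
# prints ['Foo/Chapter 1', 'Foo/Chapter 2']
```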

I took a sample of 11 ebooks...5 failed with Maximum execution time of 60 seconds exceeded...

I retested these 5: 2 succeeded, 3 failed with other errors. No book failed with Maximum execution time of 60 seconds exceeded.

In total, I tested 12 ebooks which had previously shown this error. 6 succeeded, 6 failed with other errors (mostly ...exceeded the timeout of 120 seconds. T250614).

Test environment: https://ws-export-test.wmcloud.org version 2.4.0-1-g331cf87.