
Investigate error: request entity too large
Closed, Resolved (Public)

Description

From the production logs,

Error: request entity too large
at readStream (/srv/deployment/parsoid/deploy/node_modules/body-parser/node_modules/raw-body/index.js:179:15)
at getRawBody (/srv/deployment/parsoid/deploy/node_modules/body-parser/node_modules/raw-body/index.js:97:12)
at read (/srv/deployment/parsoid/deploy/node_modules/body-parser/lib/read.js:68:3)
at jsonParser (/srv/deployment/parsoid/deploy/node_modules/body-parser/lib/types/json.js:121:5)
at Layer.handle [as handle_request] (/srv/deployment/parsoid/deploy/node_modules/express/lib/router/layer.js:95:5)
at trim_prefix (/srv/deployment/parsoid/deploy/node_modules/express/lib/router/index.js:312:13)
at /srv/deployment/parsoid/deploy/node_modules/express/lib/router/index.js:280:7
at Function.process_params (/srv/deployment/parsoid/deploy/node_modules/express/lib/router/index.js:330:12)
at next (/srv/deployment/parsoid/deploy/node_modules/express/lib/router/index.js:271:10)
at compression (/srv/deployment/parsoid/deploy/node_modules/compression/index.js:205:5)

Should be investigated.

Event Timeline

Arlolra raised the priority of this task from to Medium.
Arlolra updated the task description.
Arlolra added projects: Parsoid, Parsoid-Web-API.
Arlolra subscribed.

enwiki:2015 in sports is a good page to reproduce this.

I saw a 413 in rt-testing and I can reproduce this error now locally.

Length of items in the bundle that's posted in for that page in rt,

604385  // old wt
6404170  // old html
7142749  // new html
1840814  // old dp

Give or take.

15728640  // limit
16790231  // posting

These totals exceed the 15M limit we've set for ParsoidConfig.prototype.maxFormSize.

So this is working as expected. We can bump it up if it seems reasonable?
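
The arithmetic behind that conclusion, as a rough sketch. The byte counts are the ones listed above; the variable names are made up for illustration and are not Parsoid identifiers, and the limit is written as 15 MiB on the assumption that this is how 15728640 was chosen.

```javascript
// Sizes in bytes, copied from the measurements listed above.
// The property names are hypothetical labels, not Parsoid's field names.
const parts = {
  oldWt: 604385,
  oldHtml: 6404170,
  newHtml: 7142749,
  oldDp: 1840814,
};

// 15 MiB, which matches the 15728640 quoted for maxFormSize.
const limit = 15 * 1024 * 1024;

// Size of the request body actually posted, as reported above.
const posted = 16790231;

// Raw sum of the four items, before any encoding overhead in the request.
const payload = Object.values(parts).reduce((sum, n) => sum + n, 0);

console.log('limit:       ', limit);
console.log('sum of parts:', payload, '(over by ' + (payload - limit) + ')');
console.log('posted:      ', posted, '(over by ' + (posted - limit) + ')');

// How large data-parsoid is relative to the old HTML.
const dpShare = (parts.oldDp / parts.oldHtml * 100).toFixed(1);
console.log('old dp / old html: ' + dpShare + '%');
```

Even the raw sum of the parts is already past the limit, so the 413 doesn't depend on whatever extra the request encoding adds on top.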

Change 252607 had a related patch set uploaded (by Arlolra):
T115327: Log errors passed along in express

https://gerrit.wikimedia.org/r/252607

> Length of items in the bundle that's posted in for that page in rt,
>
> 604385  // old wt
> 6404170  // old html
> 7142749  // new html

This is a tangent .. but what causes the new html to bloat so much (given that rt-selser-testing just adds a trailing comment)? This won't necessarily solve the problem.

> 1840814  // old dp

Holy crap .. that is almost 30% of the HTML size. I thought we pared down the size of data-parsoid earlier this year. Worth a separate investigation at a later point.

> Give or take.
>
> 15728640  // limit
> 16790231  // posting
>
> These totals exceed the 15M limit we've set for `ParsoidConfig.prototype.maxFormSize`.
>
> So this is working as expected. We can bump it up if it seems reasonable?

Let us see what kind of error rates we are getting with the other patch you submitted and revisit.

> This is a tangent .. but what causes the new html to bloat so much (given that rt-selser-testing just adds a trailing comment)? This won't necessarily solve the problem.

Hmm.

> Holy crap .. that is almost 30% of the HTML size. I thought we pared down the size of data-parsoid earlier this year. Worth a separate investigation at a later point.

I'm sure offsets are adding quite a bit.

> Let us see what kind of error rates we are getting with the other patch you submitted and revisit.

You can already see 413s here: https://grafana.wikimedia.org/dashboard/db/parsoid-http-status-codes
which look to be pretty non-existent in production.

Change 252607 merged by jenkins-bot:
T115327: Log errors passed along in express

https://gerrit.wikimedia.org/r/252607

> This is a tangent .. but what causes the new html to bloat so much (given that rt-selser-testing just adds a trailing comment)? This won't necessarily solve the problem.

The new output length I listed there was from doc.outerHTML, whereas the old one is the output of our xml serializer. There are some encoding differences, like `&quot;` vs `"`. In a large page like that, it adds up.
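
A small sketch of how that kind of difference can add up. It assumes the gap comes from how double quotes inside attribute values are escaped; the attribute value is invented for illustration, and neither function is Parsoid's serializer or the DOM's.

```javascript
// Hypothetical JSON attribute value, loosely shaped like data-parsoid.
// It is not taken from the real page.
const value = JSON.stringify({ dsr: [0, 10, 2, 2] });

// Double-quoted attribute: every " inside the value becomes &quot;.
function doubleQuotedAttr(name, v) {
  const escaped = v.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
  return name + '="' + escaped + '"';
}

// Single-quoted attribute: " stays literal; only & and ' need escaping.
function singleQuotedAttr(name, v) {
  const escaped = v.replace(/&/g, '&amp;').replace(/'/g, '&#39;');
  return name + "='" + escaped + "'";
}

const a = doubleQuotedAttr('data-parsoid', value);
const b = singleQuotedAttr('data-parsoid', value);
const quoteCount = (value.match(/"/g) || []).length;

console.log(a, a.length);
console.log(b, b.length);
// Each escaped quote costs 5 extra characters (&quot; is 6, " is 1).
console.log('extra characters:', a.length - b.length, 'for', quoteCount, 'quotes');
```

With JSON in attributes, nearly every key and string value carries a pair of quotes, so a few extra characters per quote is enough to account for a noticeable size gap on a multi-megabyte page.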

Anyways, it looks like if we want that page to work in rt, we can always bump maxFormSize there.