Can't load VE on Beta cluster for any pages, showing error 500
Details
Subject | Repo | Branch | Lines +/- |
---|---|---|---|
Decrease beta cluster concurrency to 1. | mediawiki/services/change-propagation/deploy | master | +1 -0 |
Event Timeline
It seems that Parsoid in the beta cluster always times out because the MW API is not configured properly: both mwApiServer and defaultAPIProxyURI are empty. However, I'm not sure that is actually incorrect, since Parsoid in beta is configured quite differently from Parsoid in production.
The last Parsoid deploy to beta was Wed or Thursday of last week. Assuming this problem started recently, it's probably not a code or configuration change on the Parsoid end...
> The last Parsoid deploy to beta was Wed or Thursday of last week.
According to the logs, it was on Wednesday.
After restarting Parsoid in beta in an attempt to switch on trace logging, it actually started to return results for test pages.
As in T198421#4326131, I think this is a case where the workers are being overwhelmed. A 503 would indicate we're hitting maxConcurrentCalls.
From https://github.com/wikimedia/parsoid/blob/master/lib/api/apiUtils.js#L179
The default is 5x the number of workers (3), i.e. 15 concurrent calls, which doesn't add up to much.
https://github.com/wikimedia/parsoid/blob/master/lib/config/ParsoidConfig.js#L57
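To make the arithmetic concrete, here is a minimal sketch of how such a limit behaves under a burst. It is not Parsoid's actual code: the `ConcurrencyGate` class and `max_concurrent_calls` helper are hypothetical, and the only values taken from this thread are the 5x multiplier, the 3 workers, and the 503 response.

```python
# Values from the discussion above: 3 workers, default factor of 5.
NUM_WORKERS = 3
CALLS_PER_WORKER = 5  # assumption: the "5x" multiplier mentioned above


def max_concurrent_calls(num_workers: int, factor: int = CALLS_PER_WORKER) -> int:
    """Upper bound on in-flight requests before new ones are rejected."""
    return num_workers * factor


class ConcurrencyGate:
    """Hypothetical gate: admits requests up to the limit, then answers 503."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.in_flight = 0

    def try_acquire(self) -> int:
        """Return an HTTP-style status: 200 if admitted, 503 if over the limit."""
        if self.in_flight >= self.limit:
            return 503
        self.in_flight += 1
        return 200

    def release(self) -> None:
        """Mark one admitted request as finished."""
        self.in_flight = max(0, self.in_flight - 1)


limit = max_concurrent_calls(NUM_WORKERS)
gate = ConcurrencyGate(limit)

# Simulate a burst of 20 reparse requests arriving before any of them finishes.
statuses = [gate.try_acquire() for _ in range(20)]
print(limit, statuses.count(200), statuses.count(503))
```

With these numbers, a burst of template reparses only needs to keep 15 requests in flight for every further request, including those from VE, to be turned away.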
Looking at the request logs around the time this was filed, I see:
{"name":"parsoid","hostname":"deployment-parsoid09","pid":24,"level":30,"logType":"info","wiki":"enwiki","title":"Template:Other_people5","oldId":51305,"reqId":"831c03bc-11db-4c1f-86cc-296ca3a0c95c","userAgent":"ChangePropagation/WMF","msg":"completed wt2html in 650ms","longMsg":"completed wt2html in 650ms","levelPath":"info","time":"2018-09-17T23:12:07.442Z","v":0}
and a lot of other reparses of templates.
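A quick way to confirm who is generating the load is to tally such entries by user agent. The sketch below assumes the log is one JSON object per line, shaped like the entry quoted above. The `summarize` helper is hypothetical, and the sample line is abbreviated to the fields it uses.

```python
import json
import re
from collections import Counter

# The log entry quoted above, abbreviated to the fields used here.
SAMPLE_LINE = (
    '{"name":"parsoid","hostname":"deployment-parsoid09","level":30,'
    '"title":"Template:Other_people5","userAgent":"ChangePropagation/WMF",'
    '"msg":"completed wt2html in 650ms","time":"2018-09-17T23:12:07.442Z"}'
)

# Matches messages such as "completed wt2html in 650ms".
DURATION_RE = re.compile(r"completed (\w+) in (\d+)ms")


def summarize(lines):
    """Count completed requests and total milliseconds per user agent."""
    counts = Counter()
    total_ms = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        match = DURATION_RE.search(entry.get("msg", ""))
        if not match:
            continue  # not a "completed ..." message
        agent = entry.get("userAgent", "unknown")
        counts[agent] += 1
        total_ms[agent] += int(match.group(2))
    return counts, total_ms


counts, total_ms = summarize([SAMPLE_LINE, "not json", ""])
for agent, n in counts.most_common():
    print(agent, n, total_ms[agent])
```

Run over the full log for the affected time window, a tally dominated by `ChangePropagation/WMF` would support the reading that template reparses crowded out other requests.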
There was this change to Template:Documentation, which probably set off ChangeProp:
https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Template%3ADocumentation&type=revision&diff=384325&oldid=350068
That flood has ended and Parsoid is responsive again.
Change 461181 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/services/change-propagation/deploy@master] Decrease beta cluster concurrency to 1.
Change 461181 merged by Ppchelko:
[mediawiki/services/change-propagation/deploy@master] Decrease beta cluster concurrency to 1.
VE works properly in Beta now. After decreasing the change-prop concurrency level, I made some edits to test templates I have there that are transcluded in a fair number of pages. As expected, the actual re-renders were paced, and VE continued to work throughout the experiment.
I consider this done and am resolving it; please reopen if it happens again.
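The pacing effect can be illustrated with a small simulation. This is only a sketch of the general idea, not change-prop's implementation: `rerender_all` is a hypothetical helper, the page titles are made up, and a short sleep stands in for the actual re-render request.

```python
import asyncio


async def rerender_all(titles, concurrency):
    """Re-render all titles with at most `concurrency` requests in flight.

    Returns the peak number of simultaneous requests that was observed.
    """
    semaphore = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def rerender(title):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the real re-render request
            in_flight -= 1

    await asyncio.gather(*(rerender(title) for title in titles))
    return peak


# Hypothetical pages transcluding an edited template.
titles = [f"Page_{i}" for i in range(10)]

peak_paced = asyncio.run(rerender_all(titles, concurrency=1))
peak_burst = asyncio.run(rerender_all(titles, concurrency=5))
print(peak_paced, peak_burst)
```

With a concurrency of 1, the re-renders are issued one at a time, so the backend never sees more than a single request from this source and keeps capacity free for other clients such as VE.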