Parsoid load went up significantly this morning in eqiad (the cluster that serves live traffic). I traced the issue to Parsoid being unable to parse most revisions of a specific talk page on cebwiki, which often results in timeouts or the worker dying.
This all seems to be caused by REST-API-Crawler-Google/1.0, which is trying to parse every revision of the page.
An extract from parsoid logs:
"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301712"
"worker 25444 died (1), restarting."
"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301924"
"worker 25274 died (1), restarting."
"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301844"
"worker 25434 died (1), restarting."
"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301924"
"worker 25464 died (1), restarting."
"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301963"
"worker 25133 died (1), restarting."
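To check whether the timeouts really concentrate on one page, the log can be tallied per page and per revision. The following is only a rough sketch: the regular expression is inferred from the five "Timed out processing" lines quoted above, not from Parsoid's logging code, and `count_timeouts` and `sample` are hypothetical names used for illustration.

```python
import re
from collections import Counter

# Assumed format, inferred from the quoted log lines only:
#   Timed out processing: <wiki>/<title>?oldid=<revision id>
TIMEOUT_RE = re.compile(
    r'Timed out processing: (?P<wiki>[^/\s"]+)/(?P<title>[^?\s"]+)\?oldid=(?P<oldid>\d+)'
)

def count_timeouts(lines):
    """Count timeout messages per (wiki, title) and per revision id."""
    by_page = Counter()
    by_rev = Counter()
    for line in lines:
        # finditer handles several messages on one physical line.
        for m in TIMEOUT_RE.finditer(line):
            by_page[(m["wiki"], m["title"])] += 1
            by_rev[int(m["oldid"])] += 1
    return by_page, by_rev

# The timeout messages from the extract above.
sample = [
    '"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301712"',
    '"worker 25444 died (1), restarting."',
    '"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301924"',
    '"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301844"',
    '"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301924"',
    '"Timed out processing: cebwiki/Gumagamit:Lsjbot/Kartrutor2?oldid=12301963"',
]

if __name__ == "__main__":
    by_page, by_rev = count_timeouts(sample)
    for (wiki, title), n in by_page.most_common():
        print(f"{n:4d}  {wiki}/{title}")
```

On the extract, all five timeouts land on the same page, and revision 12301924 appears twice, which suggests the same revisions are being requested repeatedly.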
The page is long and makes extensive use of a Lua module, https://ceb.wikipedia.org/wiki/Module:KML
While the load on the cluster is more or less under control, workers are still dying, so requests from real users that are in flight on those workers may fail.
This should therefore be treated with the highest priority.