
Quarry server errors caused by Cloud VPS shared proxy failures
Closed, Resolved (Public)

Description

https://quarry.wmflabs.org/query/18892

Clicked the "Submit Query" button and nothing happened, not even an error message.

Event Timeline

This is caused by a full disk on the proxy. Working on it.

Bstorm triaged this task as High priority.

Verified that queries in Quarry work again after clearing logs. The access logs are filling up because of some OSM tile traffic.
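For anyone hitting a similar failure, here is a minimal sketch of the kind of check that points to a full disk and to the log files responsible. It is not the tooling used on the proxy; the function names are hypothetical, and the log directory you pass in (for example an nginx log directory) is an assumption about where the access logs live.

```python
import os
import shutil


def disk_used_fraction(path):
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def largest_files(directory, top_n=5):
    """Return up to `top_n` (size_in_bytes, name) pairs for regular files, largest first."""
    sizes = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only count regular files.
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                sizes.append((size, entry.name))
    return sorted(sizes, reverse=True)[:top_n]
```

Calling `disk_used_fraction` on the log directory and then `largest_files` on the same directory would show whether the filesystem is nearly full and which files dominate it.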

bd808 renamed this task from User interface or query "jammed" on Quarry.... to Quarry server erros caused by Cloud VPS shared proxy failures. Mar 20 2018, 10:19 PM
bd808 renamed this task from Quarry server erros caused by Cloud VPS shared proxy failures to Quarry server errors caused by Cloud VPS shared proxy failures.

Change 421070 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] cloud novaproxy: Set a custom logrotate config for nginx

https://gerrit.wikimedia.org/r/421070

Change 421070 merged by Bstorm:
[operations/puppet@production] cloud novaproxy: Set a custom logrotate config for nginx

https://gerrit.wikimedia.org/r/421070

Change 421321 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] dynamicproxy: Restrict log size to 2GB

https://gerrit.wikimedia.org/r/421321

Change 421321 merged by Bstorm:
[operations/puppet@production] dynamicproxy: Restrict log size to 2GB

https://gerrit.wikimedia.org/r/421321

So, I'm still manually truncating this log because our logrotate cron job runs from cron.daily :-p
I plan to puppetize moving that cron file to cron.hourly so that logrotate actually checks the size of the file in time. I don't see a way to do that via our module, so it'll be more like dropping the file in there and removing it from the original spot (or just using crontab).
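To illustrate why the check frequency matters, here is a rough simulation. It assumes a log that grows at a steady rate and is emptied whenever a periodic check finds it over the limit. The function name and the growth rate are hypothetical and were not measured on the proxy; only the 2GB limit and the daily versus hourly schedule come from this task.

```python
GIB = 1024 ** 3


def peak_log_size(growth_bytes_per_hour, check_interval_hours, max_bytes, hours=48):
    """Simulate steady log growth with a size check every `check_interval_hours`.

    The log is reset to zero when a check finds it larger than `max_bytes`.
    Returns the largest size the log reached during the simulation.
    """
    size = 0
    peak = 0
    for hour in range(1, hours + 1):
        size += growth_bytes_per_hour
        peak = max(peak, size)
        # The size limit is only enforced when the check actually runs.
        if hour % check_interval_hours == 0 and size > max_bytes:
            size = 0
    return peak


# Hypothetical growth rate of 1 GiB per hour, with a 2 GiB limit.
daily_peak = peak_log_size(1 * GIB, check_interval_hours=24, max_bytes=2 * GIB)
hourly_peak = peak_log_size(1 * GIB, check_interval_hours=1, max_bytes=2 * GIB)
```

Under these assumed numbers the log reaches 24 GiB between daily checks but only 3 GiB with hourly checks, which is the gap the move to cron.hourly is meant to close.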

Change 422197 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] dynamicproxy: run logrotate hourly

https://gerrit.wikimedia.org/r/422197

Change 422197 merged by Bstorm:
[operations/puppet@production] dynamicproxy: run logrotate hourly

https://gerrit.wikimedia.org/r/422197

Ok, I've confirmed that the logs are rotated either daily or on the hour if they are above 2GB in size. That kept the main proxy healthy without manual attention yesterday. This should be good now.
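The resulting behaviour can be summarised as a small decision rule, sketched below. This is an illustration of the policy described above, not the merged Puppet or logrotate configuration; `should_rotate` and its parameters are hypothetical names, and treating 2GB as 2 * 1024**3 bytes is an assumption. (logrotate's `maxsize` directive gives this kind of size-or-schedule behaviour, though this task does not say which directive the patch uses.)

```python
from datetime import datetime, timedelta

MAX_BYTES = 2 * 1024 ** 3  # assumed byte value for the 2GB limit


def should_rotate(size_bytes, last_rotated, now,
                  max_bytes=MAX_BYTES, max_age=timedelta(days=1)):
    """Decide at an hourly check whether a log should be rotated.

    Rotate when the file is over the size limit, or when at least
    `max_age` has passed since the last rotation.
    """
    too_big = size_bytes > max_bytes
    too_old = (now - last_rotated) >= max_age
    return too_big or too_old
```

For example, a 3 GiB log rotated two hours ago would be rotated at the next hourly run, while a small log is left alone until a full day has passed.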