
thumbor1003 behaves differently than other thumbor hosts
Closed, Resolved, Public

Description

It gets far more process restarts than the other hosts, shows what looks like twice the load and CPU usage, and has spiky IOPS. I wonder if something is up with its hardware.

Event Timeline

For some reason, the MemoryLimit=15% change from https://gerrit.wikimedia.org/r/#/c/367373/ doesn't seem to be applied on thumbor1003, which causes I/O spikes and additional latency.

I suspected it was something like that :) Should be an easy fix, then!

Change 377264 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] thumbor: use memorysize_mb fact for unit MemoryLimit

https://gerrit.wikimedia.org/r/377264

Change 377264 merged by Filippo Giunchedi:
[operations/puppet@production] thumbor: use memorysize_mb fact for unit MemoryLimit

https://gerrit.wikimedia.org/r/377264

Indeed, latency is now the same across all hosts. I've deployed a fix so that MemoryLimit actually does the right thing with jessie's systemd. Resolving.
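
For readers following along, here is a rough sketch of the idea behind the patch title ("use memorysize_mb fact for unit MemoryLimit"): turn a percentage of host memory into an absolute value for the unit, rather than relying on systemd to interpret "15%" itself. The actual change is a Puppet patch and is not reproduced here. The Python below is only an illustration, and the function names, the assumption that memorysize_mb is expressed in MiB, and the choice to emit the limit in bytes are all hypothetical.

```python
def memory_limit_bytes(memorysize_mb: float, percent: float) -> int:
    """Return `percent` of the host's memory as an absolute byte count.

    `memorysize_mb` stands in for the Facter fact of the same name and is
    assumed here to be in MiB (an assumption, not taken from the patch).
    """
    if memorysize_mb <= 0:
        raise ValueError("memorysize_mb must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    # MiB -> bytes, then take the requested share; truncate to a whole byte.
    return int(memorysize_mb * 1024 * 1024 * percent / 100)


def memory_limit_directive(memorysize_mb: float, percent: float = 15.0) -> str:
    """Render a hypothetical unit-file line with an absolute limit in bytes."""
    return f"MemoryLimit={memory_limit_bytes(memorysize_mb, percent)}"


if __name__ == "__main__":
    # Example input only; not the real memory size of any thumbor host.
    print(memory_limit_directive(1000))
```

Because the limit is computed from each host's own memory size, hosts with different amounts of RAM would each get a proportionate absolute limit, independent of whether the installed systemd understands percentage values.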