Status | Subtype | Assigned | Task
---|---|---|---
Declined | | Gilles | T148962 Record OOM kills as a metric with mtail
Declined | | fgiunchedi | T149980 Investigate why oom_kill mtail program doesn't work properly
Event Timeline
Some analysis of OOM kills from syslog on the Thumbor machines (the leading number on each output line is the count of oom-killer invocations for that date):
thumbor1001# zgrep -h 'thumbor invoked oom-killer' syslog.7.gz syslog.6.gz syslog.5.gz syslog.4.gz syslog.3.gz syslog.2.gz syslog.1 syslog | uniq -w6 -c | less -FSRX
    117 Nov 3 06:59:03 thumbor1001 kernel: [1098077.719957] thumbor invoked oom-killer:
    126 Nov 4 00:12:34 thumbor1001 kernel: [1160089.127066] thumbor invoked oom-killer:
    278 Nov 5 00:01:16 thumbor1001 kernel: [1245811.288420] thumbor invoked oom-killer:
    357 Nov 6 00:15:37 thumbor1001 kernel: [1333072.682151] thumbor invoked oom-killer:
    130 Nov 7 00:00:11 thumbor1001 kernel: [1418547.649780] thumbor invoked oom-killer:
    214 Nov 8 00:16:16 thumbor1001 kernel: [1505913.256708] thumbor invoked oom-killer:
    154 Nov 9 00:04:21 thumbor1001 kernel: [1591598.241859] thumbor invoked oom-killer:
    156 Nov 10 00:01:10 thumbor1001 kernel: [1677808.183941] thumbor invoked oom-killer:

thumbor1002# zgrep -h 'thumbor invoked oom-killer' syslog.7.gz syslog.6.gz syslog.5.gz syslog.4.gz syslog.3.gz syslog.2.gz syslog.1 syslog | uniq -w6 -c | less -FSRX
    127 Nov 3 06:56:34 thumbor1002 kernel: [1097581.656841] thumbor invoked oom-killer:
    121 Nov 4 00:06:51 thumbor1002 kernel: [1159393.526557] thumbor invoked oom-killer:
    240 Nov 5 00:05:39 thumbor1002 kernel: [1245714.152180] thumbor invoked oom-killer:
    342 Nov 6 00:23:37 thumbor1002 kernel: [1333185.459516] thumbor invoked oom-killer:
    129 Nov 7 00:08:29 thumbor1002 kernel: [1418670.596597] thumbor invoked oom-killer:
    192 Nov 8 00:15:38 thumbor1002 kernel: [1505492.501601] thumbor invoked oom-killer:
    153 Nov 9 00:06:34 thumbor1002 kernel: [1591341.463592] thumbor invoked oom-killer:
    151 Nov 10 00:00:13 thumbor1002 kernel: [1677353.633742] thumbor invoked oom-killer:
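For anyone who wants the same per-day counts in a form that is easier to feed into a graph, here is a rough Python sketch of what the zgrep/uniq pipeline does. The function names (`read_lines`, `count_oom_kills_per_day`) are made up for illustration, and the regex assumes the traditional syslog timestamp format seen in the output above.

```python
import gzip
import re
from collections import Counter

# Assumed line shape: "Nov  3 06:59:03 host kernel: [...] thumbor invoked oom-killer: ..."
OOM_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+\d{2}:\d{2}:\d{2}\s+\S+\s+kernel:.*thumbor invoked oom-killer"
)


def read_lines(paths):
    """Yield lines from plain or gzip-compressed log files, in the order given."""
    for path in paths:
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", errors="replace") as handle:
            yield from handle


def count_oom_kills_per_day(lines):
    """Count 'thumbor invoked oom-killer' lines per (month, day)."""
    counts = Counter()
    for line in lines:
        match = OOM_RE.match(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Something like `count_oom_kills_per_day(read_lines(["syslog.1", "syslog"]))` would then give a mapping from date to count. Unlike `uniq -w6 -c`, this does not depend on the files being passed in chronological order.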
So, no change after the IM limits were introduced? Maybe the difference between 900M and 1G isn't enough. I should check how much memory Thumbor consumes when it's idle.
There's still the possibility of significant memory leaks. I guess it would be nice to be able to graph the memory consumption of each Thumbor process over time.
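As a starting point for that kind of graph, a minimal sketch that samples the resident set size of each Thumbor process from `/proc` could look like the following. It assumes Linux and that the processes report `thumbor` as their command name (which the "thumbor invoked oom-killer" kernel lines suggest); the function names are hypothetical, and how the samples get shipped to a metrics backend is left open.

```python
import os


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def sample_rss(comm="thumbor", proc_root="/proc"):
    """Return {pid: rss_kb} for every process whose command name equals `comm`."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "comm")) as handle:
                if handle.read().strip() != comm:
                    continue
            with open(os.path.join(base, "status")) as handle:
                rss_kb = parse_vmrss_kb(handle.read())
        except OSError:
            continue  # the process exited between listing and reading
        if rss_kb is not None:
            result[int(entry)] = rss_kb
    return result
```

Calling `sample_rss()` once a minute and recording the result with a timestamp would give one time series per pid. A steadily climbing series would point towards a leak, while a flat baseline close to the limit would suggest the limit is simply too tight.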