
fifo-log-tailer: evergrowing memory usage
Closed, Resolved · Public

Description

On various cache_upload nodes, metrics collected by atsmtail@backend.service stopped flowing in. See for example cp1078.

The issue is that fifo-log-tailer allocates memory without ever freeing it, causing atsmtail-backend to fail as follows after several days of uptime:

Jul 29 23:32:18 cp1078 atsmtail-backend[18673]: fatal error: runtime: out of memory
[...]
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]: runtime.makeslice(0x4f6380, 0x1ffffffe00, 0x1ffffffe00, 0x0, 0xc0000a3da8, 0x46a1b1)
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]:         /usr/lib/go-1.11/src/runtime/slice.go:70 +0x77 fp=0xc0000a3d28 sp=0xc0000a3cf8 pc=0x440437
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]: bytes.makeSlice(0x1ffffffe00, 0x0, 0x0, 0x0)
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]:         /usr/lib/go-1.11/src/bytes/buffer.go:231 +0x6d fp=0xc0000a3d68 sp=0xc0000a3d28 pc=0x4a918d
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]: bytes.(*Buffer).grow(0xc0000f6000, 0x200, 0x1f95)
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]:         /usr/lib/go-1.11/src/bytes/buffer.go:144 +0x16c fp=0xc0000a3db8 sp=0xc0000a3d68 pc=0x4a8c0c
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]: bytes.(*Buffer).ReadFrom(0xc0000f6000, 0x540ae0, 0xc00000c100, 0x608900, 0x0, 0x7f58dc96c000)
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]:         /usr/lib/go-1.11/src/bytes/buffer.go:204 +0x4b fp=0xc0000a3e28 sp=0xc0000a3db8 pc=0x4a8f7b
Jul 29 23:32:19 cp1078 atsmtail-backend[18673]: io/ioutil.readAll(0x540ae0, 0xc00000c100, 0x200, 0x0, 0x0, 0x0, 0x0, 0x0)

I've added pprof to fifo-log-tailer as follows and tried it out on cp4021.

import _ "net/http/pprof"
[...] further down in main()
go func() {
 log.Println(http.ListenAndServe("localhost:6060", nil))
}()

After a few minutes of runtime, pprof showed this:

$ curl -s http://localhost:6060/debug/pprof/heap > prof; go tool pprof -top prof
3.19MB of 3.19MB total (  100%)
Dropped 15 nodes (cum <= 0.02MB)
      flat  flat%   sum%        cum   cum%
    3.19MB   100%   100%     3.19MB   100%  bytes.makeSlice
         0     0%   100%     3.19MB   100%  bytes.(*Buffer).ReadFrom
         0     0%   100%     3.19MB   100%  bytes.(*Buffer).grow
         0     0%   100%     3.19MB   100%  io/ioutil.ReadAll
         0     0%   100%     3.19MB   100%  io/ioutil.readAll
         0     0%   100%     3.19MB   100%  main.main
         0     0%   100%     3.19MB   100%  runtime.main

Later on:

12MB of 12MB total (  100%)
Dropped 22 nodes (cum <= 0.06MB)
      flat  flat%   sum%        cum   cum%
      12MB   100%   100%       12MB   100%  bytes.makeSlice
         0     0%   100%       12MB   100%  bytes.(*Buffer).ReadFrom
         0     0%   100%       12MB   100%  bytes.(*Buffer).grow
         0     0%   100%       12MB   100%  io/ioutil.ReadAll
         0     0%   100%       12MB   100%  io/ioutil.readAll
         0     0%   100%       12MB   100%  main.main
         0     0%   100%       12MB   100%  runtime.main

Over the following minutes I confirmed that memory usage kept increasing.

fifo-log-tailer reads from the unix domain socket to which fifo-log-demux writes, and its implementation boils down to:

c, err := net.Dial("unix", *socketPath)
[...]
// Everything read from the socket is teed to stdout...
tee := io.TeeReader(c, io.Writer(os.Stdout))
// ...but ReadAll also accumulates it all in an ever-growing buffer.
_, err = ioutil.ReadAll(tee)

It turns out that ioutil.ReadAll uses bytes.Buffer.ReadFrom(), which appends everything it reads to a buffer, growing it as needed. It is thus a poor choice for a program that reads from a socket forever.

We need to patch fifo-log-tailer to avoid this behavior.
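
Something along these lines should do. This is only a sketch of the approach (stream the socket to stdout through a fixed-size buffer with io.CopyBuffer), not the actual patch: the hardcoded socket path, the 32KiB buffer size and the error handling are illustrative, and the real tailer takes its arguments via -socket and -regexp.

// Sketch only: copy from the demux socket to stdout with a reusable buffer.
package main

import (
        "io"
        "log"
        "net"
        "os"
)

func main() {
        // Illustrative path; the real tailer gets this from the -socket flag.
        c, err := net.Dial("unix", "/var/run/trafficserver/notpurge.sock")
        if err != nil {
                log.Fatal(err)
        }
        defer c.Close()

        // io.CopyBuffer reuses this single slice for every read, so memory
        // usage stays constant no matter how much data flows through.
        buf := make([]byte, 32*1024)
        if _, err := io.CopyBuffer(os.Stdout, c, buf); err != nil {
                log.Fatal(err)
        }
}

With this approach the only long-lived allocation is the buffer itself, which is what the flat heap profiles further down show.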

Event Timeline

ema created this task. · Wed, Jul 31, 1:28 PM
ema triaged this task as Normal priority. · Wed, Jul 31, 1:28 PM
ema updated the task description.

Change 526668 had a related patch set uploaded (by Ema; owner: Ema):
[operations/software/fifo-log-demux@master] 0.4: do not use ioutil.ReadAll() in fifo-log-tailer

https://gerrit.wikimedia.org/r/526668

Mentioned in SAL (#wikimedia-operations) [2019-07-31T13:46:37Z] <ema> cp4021: test fifo-log-demux 0.4 T229414

ema added a comment. · Wed, Jul 31, 2:51 PM

The new implementation using io.CopyBuffer instead of ioutil.ReadAll has been tested successfully on cp4021. Memory usage looks sane and does not increase forever:

$ while true; do curl -s http://localhost:6060/debug/pprof/heap > prof; go tool pprof -top prof | grep total | ts; sleep 60;  done
Jul 31 14:40:28 1485.59kB of 1485.59kB total (  100%)
Jul 31 14:41:28 1485.59kB of 1485.59kB total (  100%)
Jul 31 14:42:28 1553.21kB of 1553.21kB total (  100%)
Jul 31 14:43:28 902.59kB of 902.59kB total (  100%)
Jul 31 14:44:28 902.59kB of 902.59kB total (  100%)

For comparison, the old ioutil.ReadAll implementation:

Jul 31 14:46:53 1712.56kB of 1712.56kB total (  100%)
Jul 31 14:47:53 3.19MB of 3.19MB total (  100%)
Jul 31 14:48:53 6.04MB of 6.04MB total (  100%)
Jul 31 14:49:53 6.04MB of 6.04MB total (  100%)
Jul 31 14:50:53 12MB of 12MB total (  100%)

Change 526668 merged by Ema:
[operations/software/fifo-log-demux@master] 0.4: do not use ioutil.ReadAll() in fifo-log-tailer

https://gerrit.wikimedia.org/r/526668

Mentioned in SAL (#wikimedia-operations) [2019-07-31T15:15:31Z] <ema> upload fifo-log-demux 0.4 to stretch-wikimedia T229414

Mentioned in SAL (#wikimedia-operations) [2019-07-31T15:22:44Z] <ema> cp-ats: upgrade fifo-log-demux to 0.4 and restart atsmtail@backend.service T229414

ema moved this task from Triage to Caching on the Traffic board. · Thu, Aug 1, 7:26 AM
ema added a comment (edited). · Thu, Aug 1, 9:15 AM

I've been digging a bit further and reproduced this on my workstation with the following ulimit settings and program:

ema@ariel:~$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 62488
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 95
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 62488
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
// growmem.go
package main

import (
        "io/ioutil"
        "os"
)

func main() {
        // Read from stdin forever: ioutil.ReadAll keeps growing its buffer
        // until the runtime can no longer allocate memory.
        ioutil.ReadAll(os.Stdin)
}

The Go runtime throws the fatal error with the string "runtime: out of memory" in src/runtime/mem_linux.go:

func sysMap(v unsafe.Pointer, n uintptr, sysStat *uint64) {
    mSysStatInc(sysStat, n)

    p, err := mmap(v, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE, -1, 0)
    if err == _ENOMEM {
        throw("runtime: out of memory")
    }
    if p != v || err != 0 {
        throw("runtime: cannot map pages in arena address space")
    }
}

Searching for that function among SystemTap's probe points:

ema@ariel:~$ stap -L 'process("growmem").function("*sysMap*")'
process("/home/ema/growmem").function("runtime.sysMap@/usr/lib/go-1.12/src/runtime/mem_linux.go:165") $v:void* $n:uintptr $sysStat:uint64*

I then started the test program with yes | ./growmem and traced calls to sysMap, printing the value of n:

ema@ariel:~$ stap -ve 'probe process("growmem").function("runtime.sysMap") { printf("size=%d\n", $n) }'
Pass 1: parsed user script and 480 library scripts using 101672virt/87980res/7332shr/80632data kb, in 180usr/30sys/211real ms.
Pass 2: analyzed script: 1 probe, 1 function, 0 embeds, 0 globals using 107528virt/94928res/8384shr/86488data kb, in 30usr/0sys/29real ms.
Pass 3: translated to C into "/tmp/stap7qbPco/stap_b83a7f23dca772de5e7f05ae504efa3d_1231_src.c" using 107528virt/95120res/8576shr/86488data kb, in 20usr/100sys/549real ms.
Pass 4: compiled C into "stap_b83a7f23dca772de5e7f05ae504efa3d_1231.ko" in 2720usr/780sys/3461real ms.
Pass 5: starting run.
size=67108864
size=67108864
size=67108864
size=134217728
size=268435456
size=536870912
size=1073741824
size=2147483648
size=4294967296
size=8589934592
size=17179869184

At which point growmem panics (I've got 16G of memory on my system):

ema@ariel:~$ yes | ./growmem
fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x4a6d29, 0x16)
        /usr/lib/go-1.12/src/runtime/panic.go:617 +0x72
runtime.sysMap(0xc404000000, 0x400000000, 0x550278)
        /usr/lib/go-1.12/src/runtime/mem_linux.go:170 +0xc7
runtime.(*mheap).sysAlloc(0x538be0, 0x400000000, 0x538bf0, 0x200000)
        /usr/lib/go-1.12/src/runtime/malloc.go:633 +0x1cd
runtime.(*mheap).grow(0x538be0, 0x200000, 0x0)
        /usr/lib/go-1.12/src/runtime/mheap.go:1222 +0x42
[...]

bytes.(*Buffer).grow, which ioutil.ReadAll relies on, does indeed allocate twice the previously allocated space when it runs out of room, explaining why the sizes printed above double:

// Not enough space anywhere, we need to allocate.
buf := makeSlice(2*c + n)

This explains why the program was killed by the Go runtime's own out-of-memory check before the kernel's OOM killer had to start shooting processes around.
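
The doubling is also easy to observe directly on a bytes.Buffer. This small standalone experiment (mine, not part of the tailer) appends 512-byte chunks and prints the capacity whenever it changes:

package main

import (
        "bytes"
        "fmt"
)

func main() {
        var buf bytes.Buffer
        chunk := make([]byte, 512)

        last := 0
        for i := 0; i < 1<<16; i++ {
                buf.Write(chunk)
                if c := buf.Cap(); c != last {
                        // Capacity roughly doubles on every reallocation.
                        fmt.Printf("len=%d cap=%d\n", buf.Len(), c)
                        last = c
                }
        }
}

The printed capacities follow the same roughly-doubling progression as the sysMap sizes traced above.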

ema closed this task as Resolved. · Thu, Aug 1, 2:41 PM

The new fifo-log-tailer has now been running for one day and shows reasonable memory usage:

14:39:53 ema@cp1080.eqiad.wmnet:~
$ ps u -q `pidof fifo-log-tailer` 
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     212791  0.1  0.0 216104  3076 ?        Sl   Jul31   2:19 fifo-log-tailer -socket /var/run/trafficserver/notpurge.sock -regexp

Closing.