Current working theory
this seems to be related to jemalloc and THP usage and looks quite similar to https://github.com/varnishcache/varnish-cache/issues/3511
We are running the following experiments:
host | THP | jemalloc version | used memory |
cp4045 | disabled | 5.1.0 (buster) | 346.767 GB |
cp4046 | disabled | 5.1.0 (buster) | 351.353 GB |
cp4047 | disabled | 5.2.1 (backported from bullseye) | 339.32 GB |
cp4048 | disabled | 5.2.1 (backported from bullseye) | 341.492 GB |
cp4049 | enabled | 5.2.1 (backported from bullseye) | 393.756 GB |
cp4050 | enabled | 5.2.1 (backported from bullseye) | 442.626 GB |
cp4051 | enabled | 5.1.0 (unmodified host) | 417.07 GB |
cp4052 | enabled | 5.1.0 (unmodified host) | N/A |
Nov 10 23:05:17 cp4047 kernel: [798350.776001] Mem-Info: Nov 10 23:05:17 cp4047 kernel: [798350.776011] active_anon:1747161 inactive_anon:128275203 isolated_anon:0 Nov 10 23:05:17 cp4047 kernel: [798350.776011] active_file:141 inactive_file:385 isolated_file:2 Nov 10 23:05:17 cp4047 kernel: [798350.776011] unevictable:27 dirty:0 writeback:0 Nov 10 23:05:17 cp4047 kernel: [798350.776011] slab_reclaimable:34705 slab_unreclaimable:157087 Nov 10 23:05:17 cp4047 kernel: [798350.776011] mapped:79836 shmem:284420 pagetables:515480 bounce:0 Nov 10 23:05:17 cp4047 kernel: [798350.776011] free:781027 free_pcp:1846 free_cma:0 Nov 10 23:05:17 cp4047 kernel: [798350.776014] Node 0 active_anon:1803668kB inactive_anon:257732840kB active_file:0kB inactive_file:404kB unevictable:24kB isolated(anon):0kB isolated(file):0kB mapped:30528kB dirty:0kB writeback:0kB shmem:134364kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 19023872kB writeback_tmp:0kB kernel_stack:20960kB all_unreclaimable? no Nov 10 23:05:17 cp4047 kernel: [798350.776017] Node 1 active_anon:5184976kB inactive_anon:255367972kB active_file:736kB inactive_file:1136kB unevictable:84kB isolated(anon):0kB isolated(file):8kB mapped:288816kB dirty:0kB writeback:0kB shmem:1003316kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 23343104kB writeback_tmp:0kB kernel_stack:21136kB all_unreclaimable? no Nov 10 23:05:17 cp4047 kernel: [798350.776019] Node 0 DMA free:15896kB min:60kB low:72kB high:84kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15980kB managed:15896kB mlocked:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Nov 10 23:05:17 cp4047 kernel: [798350.776023] lowmem_reserve[]: 0 1388 257351 257351 257351 Nov 10 23:05:17 cp4047 kernel: [798350.776028] Node 0 DMA32 free:1029468kB min:5784kB low:7240kB high:8696kB reserved_highatomic:0KB active_anon:3376kB inactive_anon:420640kB active_file:24kB inactive_file:0kB unevictable:0kB writepending:0kB present:1521600kB managed:1456064kB mlocked:0kB pagetables:1112kB bounce:0kB free_pcp:504kB local_pcp:0kB free_cma:0kB Nov 10 23:05:17 cp4047 kernel: [798350.776031] lowmem_reserve[]: 0 0 255962 255962 255962 Nov 10 23:05:17 cp4047 kernel: [798350.776035] Node 0 Normal free:1028416kB min:1041448kB low:1303560kB high:1565672kB reserved_highatomic:2048KB active_anon:1800292kB inactive_anon:257312200kB active_file:0kB inactive_file:408kB unevictable:24kB writepending:0kB present:266338304kB managed:262112564kB mlocked:24kB pagetables:556408kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB Nov 10 23:05:17 cp4047 kernel: [798350.776039] lowmem_reserve[]: 0 0 0 0 0 Nov 10 23:05:17 cp4047 kernel: [798350.776043] Node 1 Normal free:1050328kB min:1049852kB low:1314076kB high:1578300kB reserved_highatomic:0KB active_anon:5184976kB inactive_anon:255367972kB active_file:736kB inactive_file:1136kB unevictable:84kB writepending:0kB present:268435456kB managed:264226884kB mlocked:84kB pagetables:1504400kB bounce:0kB free_pcp:6880kB local_pcp:0kB free_cma:0kB Nov 10 23:05:17 cp4047 kernel: [798350.776046] lowmem_reserve[]: 0 0 0 0 0 Nov 10 23:05:17 cp4047 kernel: [798350.776049] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15896kB Nov 10 23:05:17 cp4047 kernel: [798350.776062] Node 0 DMA32: 204*4kB (U) 2914*8kB (UM) 5525*16kB (UM) 3885*32kB (UME) 2523*64kB (UME) 1315*128kB (UME) 648*256kB (UME) 302*512kB (UME) 139*1024kB (UME) 0*2048kB 0*4096kB = 1029488kB Nov 10 23:05:17 cp4047 kernel: [798350.776074] Node 0 Normal: 5148*4kB (ME) 47786*8kB (UMEH) 22964*16kB (UMEH) 3538*32kB (UMEH) 1192*64kB (UMEH) 240*128kB (UME) 70*256kB (UME) 27*512kB (UME) 6*1024kB (U) 0*2048kB 0*4096kB = 1028416kB Nov 10 23:05:17 cp4047 kernel: [798350.776086] Node 1 Normal: 25231*4kB (UE) 7221*8kB (UME) 2304*16kB (UME) 1387*32kB (UME) 519*64kB (UME) 208*128kB (UME) 80*256kB (UME) 21*512kB (UME) 139*1024kB (UM) 27*2048kB (ME) 127*4096kB (M) = 1048836kB Nov 10 23:05:17 cp4047 kernel: [798350.776100] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB Nov 10 23:05:17 cp4047 kernel: [798350.776102] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Nov 10 23:05:17 cp4047 kernel: [798350.776102] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB Nov 10 23:05:17 cp4047 kernel: [798350.776103] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Nov 10 23:05:17 cp4047 kernel: [798350.776104] 285095 total pagecache pages Nov 10 23:05:17 cp4047 kernel: [798350.776105] 0 pages in swap cache Nov 10 23:05:17 cp4047 kernel: [798350.776107] Swap cache stats: add 0, delete 0, find 0/0 Nov 10 23:05:17 cp4047 kernel: [798350.776107] Free swap = 0kB Nov 10 23:05:17 cp4047 kernel: [798350.776108] Total swap = 0kB Nov 10 23:05:17 cp4047 kernel: [798350.776109] 134077835 pages RAM Nov 10 23:05:17 cp4047 kernel: [798350.776109] 0 pages HighMem/MovableOnly Nov 10 23:05:17 cp4047 kernel: [798350.776110] 2124983 pages reserved Nov 10 23:05:17 cp4047 kernel: [798350.776111] 0 pages hwpoisoned Nov 10 23:05:17 cp4047 kernel: [798350.776111] Tasks state (memory values in pages): Nov 10 23:05:17 cp4047 kernel: [798350.776112] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name Nov 10 23:05:17 cp4047 kernel: [798350.776145] [ 1071] 0 1071 80287 54869 663552 0 0 systemd-journal Nov 10 23:05:17 cp4047 kernel: [798350.776147] [ 1090] 0 1090 5813 484 73728 0 -1000 systemd-udevd Nov 10 23:05:17 cp4047 kernel: [798350.776150] [ 1442] 0 1442 771 54 45056 0 0 mdadm Nov 10 23:05:17 cp4047 kernel: [798350.776152] [ 1458] 101 1458 23275 214 81920 0 0 systemd-timesyn Nov 10 23:05:17 cp4047 kernel: [798350.776155] [ 1631] 0 1631 4905 288 73728 0 0 systemd-logind Nov 10 23:05:17 cp4047 kernel: [798350.776156] [ 1641] 109 1641 43439 2648 110592 0 0 python3 Nov 10 23:05:17 cp4047 kernel: [798350.776158] [ 1648] 0 1648 1359 61 49152 0 0 rasdaemon Nov 10 23:05:17 cp4047 kernel: [798350.776161] [ 1652] 105 1652 2313 203 57344 0 -900 dbus-daemon Nov 10 23:05:17 cp4047 kernel: [798350.776163] [ 1653] 0 1653 2249075 32665 1384448 0 0 cadvisor Nov 10 23:05:17 cp4047 kernel: [798350.776165] [ 1655] 0 1655 2126 65 57344 0 0 cron Nov 10 23:05:17 cp4047 kernel: [798350.776167] [ 1664] 106 1664 7054 273 94208 0 0 lldpd Nov 10 23:05:17 cp4047 kernel: [798350.776168] [ 1665] 0 1665 61504 855 98304 0 0 traffic_manager Nov 10 23:05:17 cp4047 kernel: [798350.776170] [ 1666] 0 1666 2524 365 57344 0 0 smartd Nov 10 23:05:17 cp4047 kernel: [798350.776172] [ 1679] 106 1679 7055 296 90112 0 0 lldpd Nov 10 23:05:17 cp4047 kernel: [798350.776174] [ 1680] 109 1680 880477 8034 479232 0 0 prometheus-ipmi Nov 10 23:05:17 cp4047 kernel: [798350.776176] [ 1685] 109 1685 1710277 17126 946176 0 0 prometheus-node Nov 10 23:05:17 cp4047 kernel: [798350.776181] [ 1697] 0 1697 1287574 12020 745472 0 0 confd Nov 10 23:05:17 cp4047 kernel: [798350.776184] [ 1717] 116 1717 1169878 6535 606208 0 0 fifo-log-demux Nov 10 23:05:17 cp4047 kernel: [798350.776186] [ 1741] 0 1741 21243 322 65536 0 0 ipmiseld Nov 10 23:05:17 cp4047 kernel: [798350.776188] [ 1743] 0 1743 464764 4526 319488 0 0 rsyslogd Nov 10 23:05:17 cp4047 kernel: [798350.776190] [ 1744] 0 1744 4706 1638 65536 0 0 varnishd Nov 10 23:05:17 cp4047 kernel: [798350.776191] [ 1750] 0 1750 1523416 11577 815104 0 0 mtail Nov 10 23:05:17 cp4047 kernel: [798350.776208] [ 1785] 0 1785 3454 217 61440 0 -1000 sshd Nov 10 23:05:17 cp4047 kernel: [798350.776210] [ 1789] 0 1789 1693 159 53248 0 -500 nrpe Nov 10 23:05:17 cp4047 kernel: [798350.776213] [ 1885] 0 1885 1403 28 49152 0 0 agetty Nov 10 23:05:17 cp4047 kernel: [798350.776215] [ 1905] 0 1905 1346 29 49152 0 0 agetty Nov 10 23:05:17 cp4047 kernel: [798350.776217] [ 1973] 116 1973 7811978 5313115 43347968 0 0 [TS_MAIN] Nov 10 23:05:17 cp4047 kernel: [798350.776227] [ 2031] 108 2031 5078 210 77824 0 0 exim4 Nov 10 23:05:17 cp4047 kernel: [798350.776229] [ 2052] 0 2052 71496 40370 409600 0 0 haproxy Nov 10 23:05:17 cp4047 kernel: [798350.776236] [ 2215] 116 2215 3014 234 61440 0 0 traffic_crashlo Nov 10 23:05:17 cp4047 kernel: [798350.776241] [ 2421] 112 2421 320525176 124017210 2050834432 0 0 cache-main Nov 10 23:05:17 cp4047 kernel: [798350.776244] [ 2955] 111 2955 1153736 7288 622592 0 0 prometheus-varn Nov 10 23:05:17 cp4047 kernel: [798350.776246] [ 2956] 0 2956 4571 1552 73728 0 0 python3 Nov 10 23:05:17 cp4047 kernel: [798350.776247] [ 2957] 0 2957 4539 1512 73728 0 0 python3 Nov 10 23:05:17 cp4047 kernel: [798350.776251] [ 2959] 0 2959 227372 23776 327680 0 0 varnishkafka Nov 10 23:05:17 cp4047 kernel: [798350.776253] [ 2964] 0 2964 1682 57 49152 0 0 varnishmtail-de Nov 10 23:05:17 cp4047 kernel: [798350.776256] [ 2971] 0 2971 1682 65 49152 0 0 varnishmtail-wr Nov 10 23:05:17 cp4047 kernel: [798350.776257] [ 2973] 0 2973 22074 20971 212992 0 0 varnishncsa Nov 10 23:05:17 cp4047 kernel: [798350.776260] [ 2975] 0 2975 1707362 11795 892928 0 0 mtail Nov 10 23:05:17 cp4047 kernel: [798350.776263] [ 3014] 0 3014 22303 21206 217088 0 0 varnishlog Nov 10 23:05:17 cp4047 kernel: [798350.776264] [ 3015] 0 3015 22235 21138 221184 0 0 varnishlog Nov 10 23:05:17 cp4047 kernel: [798350.776267] [ 5431] 64368 5431 2962777 150470 2551808 0 0 purged Nov 10 23:05:17 cp4047 kernel: [798350.776272] [ 32789] 0 32789 1682 57 49152 0 0 varnishmtail-in Nov 10 23:05:17 cp4047 kernel: [798350.776274] [ 32790] 0 32790 1682 68 45056 0 0 varnishmtail-wr Nov 10 23:05:17 cp4047 kernel: [798350.776275] [ 32791] 0 32791 21944 20841 212992 0 0 varnishncsa Nov 10 23:05:17 cp4047 kernel: [798350.776277] [ 32792] 0 32792 1486038 8828 798720 0 0 mtail Nov 10 23:05:17 cp4047 kernel: [798350.776280] [ 85660] 109 85660 27923 5036 118784 0 0 prometheus-traf Nov 10 23:05:17 cp4047 kernel: [798350.776282] [ 65867] 0 65867 597 16 45056 0 0 sh Nov 10 23:05:17 cp4047 kernel: [798350.776285] [ 65868] 0 65868 597 17 49152 0 0 atslog-backend Nov 10 23:05:17 cp4047 kernel: [798350.776291] [ 65869] 0 65869 1283339 7899 692224 0 0 mtail Nov 10 23:05:17 cp4047 kernel: [798350.776293] [ 65870] 0 65870 118583 517 118784 0 0 fifo-log-tailer Nov 10 23:05:17 cp4047 kernel: [798350.776295] [ 58134] 0 58134 769134 4423 413696 0 0 prometheus-rsys Nov 10 23:05:17 cp4047 kernel: [798350.776300] [ 72675] 0 72675 4538 1498 69632 0 0 python3 Nov 10 23:05:17 cp4047 kernel: [798350.776302] [ 72685] 0 72685 21661 20530 204800 0 0 varnishlog Nov 10 23:05:17 cp4047 kernel: [798350.776311] [ 64646] 114 64646 939674 125872 1585152 0 0 haproxy Nov 10 23:05:17 cp4047 kernel: [798350.776317] [ 10713] 109 10713 45345 4099 118784 0 0 prometheus-varn Nov 10 23:05:17 cp4047 kernel: [798350.776324] [ 26889] 0 26889 1682 67 45056 0 0 varnishmtail-wr Nov 10 23:05:17 cp4047 kernel: [798350.776326] [ 26890] 0 26890 1682 65 45056 0 0 varnishmtail-wr Nov 10 23:05:17 cp4047 kernel: [798350.776329] [ 26891] 0 26891 1693 159 53248 0 -500 nrpe Nov 10 23:05:17 cp4047 kernel: [798350.776331] [ 26892] 0 26892 1693 159 53248 0 -500 nrpe Nov 10 23:05:17 cp4047 kernel: [798350.776333] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/system.slice/varnish-frontend. service,task=cache-main,pid=2421,uid=112 Nov 10 23:05:17 cp4047 kernel: [798350.777680] Out of memory: Killed process 2421 (cache-main) total-vm:1282100704kB, anon-rss:495986376kB, file-rss:0kB, shmem-rss:82464kB, UID:1 12 pgtables:2002768kB oom_score_adj:0 Nov 10 23:05:42 cp4047 kernel: [798376.572445] oom_reaper: reaped process 2421 (cache-main), now anon-rss:0kB, file-rss:0kB, shmem-rss:82100kB
This could be potentially related with how we configure memory thresholds for varnish + the fact that in the new upload hardware featuring 12TB of caching storage ATS requires 3x memory than in the old hardware:
https://grafana.wikimedia.org/goto/uip5tYD4z?orgId=1 (cp4047, 12Tb of storage, ATS requires ~20GB of memory):
https://grafana.wikimedia.org/goto/P33TtYv4z?orgId=1 (cp3051, 1.6Tb of storage, ATS requires ~6.5GB of memory: