Servers freezing across the caching cluster (November 2019)
Open, High, Public

Description

During the last 9 days three caching nodes went down with the same symptoms:

  • Nothing in the SEL
  • KVM unresponsive
  • Network down
  • Nothing in the logs

A power cycle fixed them.

So far the affected systems are PowerEdge R440:

  • cp3053 - T239041
  • cp1077 - T238289
  • cp3057 - T237348 T239502 T244127
  • cp3065 - T238032 and 2020-01-05
  • db2125 - T239042 Kernel at the time of the crash: Linux db2125 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20) x86_64 GNU/Linux
  • cp3063 - T239310
  • cp1087 - T239449
  • cp3055 - T240425 (twice, same task, I think the firmware has not yet been updated)
  • backup2001 - T240177 T237730 T240177#5773711 (crashed 3 times, the second crash happened with the firmware running the latest version)
  • cp3051 - T241306
  • cp3061 - crashed 2019-12-28T23:36

Maybe a kernel upgrade or a CPU microcode update is messing with them?

Event Timeline

elukey added a subscriber: elukey. Nov 25 2019, 8:04 AM
Marostegui updated the task description. Nov 25 2019, 8:39 AM
Marostegui updated the task description. Nov 25 2019, 8:56 AM
Volans added a subscriber: Volans. Nov 25 2019, 1:52 PM

If needed, full list of R440 available here: https://puppetboard.wikimedia.org/fact/productname/PowerEdge+R440 (intentionally not mentioning their count here)

BBlack added a comment. Edited Nov 25 2019, 5:43 PM

It was observed earlier in the traffic meeting that we're fairly certain that none of our R440 hosts have had this problem more than once, so this may be a "once per server" phenomenon, in which case it's also quite likely this can be pre-empted on the ones that haven't crashed yet by giving them a reboot (e.g. something deep has changed while the servers are live, and they stabilize once they've done a fresh boot with it, possibly a live update of some microcode or firmware?)

Vgutierrez updated the task description. Nov 27 2019, 7:05 AM
06:13:59 <+icinga-wm> PROBLEM - Host cp3057 is DOWN: PING CRITICAL - Packet loss = 100

Could be another case of R440 going down?

Vgutierrez updated the task description.

It was observed earlier in the traffic meeting that we're fairly certain that none of our R440 hosts have had this problem more than once, so this may be a "once per server" phenomenon, in which case it's also quite likely this can be pre-empted on the ones that haven't crashed yet by giving them a reboot (e.g. something deep has changed while the servers are live, and they stabilize once they've done a fresh boot with it, possibly a live update of some microcode or firmware?)

Please note that this is no longer the case, cp3057 has been affected twice already in less than a month, see T237348 and T239502

And cp3053, which already failed once (T239041), is down again: [10:23:27] <+icinga-wm> PROBLEM - Host cp3053 is DOWN: PING CRITICAL - Packet loss = 100%

ema added a comment. Dec 11 2019, 9:12 AM

On 2019-12-10 cp3055 went down too:

19:33 <+icinga-wm> PROBLEM - Host cp3055 is DOWN: PING CRITICAL - Packet loss = 100%

Depooled and power-cycled by @elukey on 2019-12-11T08:04.

ema updated the task description. Dec 11 2019, 9:13 AM

Mentioned in SAL (#wikimedia-operations) [2019-12-11T09:14:47Z] <ema> repool cp3055 T238305

ema updated the task description. Dec 11 2019, 9:19 AM

See: T240177 T237730 backup2001 was updated to new bios last time it crashed.

Do we have somewhere to collect the kernel versions of the hosts and whether they were upgraded before/after the crash?
I upgraded db2125's kernel when it crashed to:

root@db2125:~# uname -a
Linux db2125 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux

And as I listed on the task description, the running kernel at the time of the crash: Linux db2125 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20) x86_64 GNU/Linux
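One way to collect this per host is to parse the `uname -a` output into fields. The following is only a sketch: the regular expression is an assumption fitted to the two Debian `uname -a` lines quoted in this task, and `parse_uname` is a hypothetical helper name, not an existing tool.

```python
import re

# Assumption: input looks like the Debian `uname -a` lines quoted in this task.
UNAME_RE = re.compile(
    r"^Linux (?P<host>\S+) (?P<abi>\S+) #\d+ SMP Debian (?P<pkg>\S+) "
    r"\((?P<date>\d{4}-\d{2}-\d{2})\)"
)


def parse_uname(line):
    """Return host, kernel ABI name, Debian package version and build date."""
    match = UNAME_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised uname line: %r" % line)
    return match.groupdict()


at_crash = parse_uname(
    "Linux db2125 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20) x86_64 GNU/Linux"
)
after_upgrade = parse_uname(
    "Linux db2125 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux"
)

# The ABI name is identical before and after; only the package version differs.
print(at_crash["abi"], at_crash["pkg"], "->", after_upgrade["pkg"])
```

Note that the ABI name (`4.9.0-11-amd64`) is the same in both lines, so recording only `uname -r` would not distinguish the kernel running at crash time from the upgraded one.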

Marostegui updated the task description. Dec 11 2019, 9:50 AM

Some observations:

  • I'm pretty sure this is unrelated to the kernel, we've seen these crashes with both 4.9 and 4.19
  • backup2001 had latest firmware when it crashed
  • backup2001 had almost the latest CPU microcode when it crashed (the 2019-11-12 release). There is a 2019-11-15 release, but some CPUs are failing to reboot with that microcode and there are reports of overheating. Last night there was an update in Debian unstable which rolled back the microcode for one CPU type, but 2019-11-15 could still be an option to test. Intel doesn't really say what they changed between 2019-11-12 and 2019-11-15, but they went through the hassle of doing a new release, so there must be some reason.

I also found this, which sounds very similar: https://www.dell.com/community/PowerEdge-OS-Forum/Random-Reboot-R740/td-p/5169703/page/3

I think we could try two different things independent of each other (to see whether each is effective by itself):

  • Disable the C / C1E states in Performance settings on 2-3 affected servers
  • Upgrade 2-3 affected servers to the 2019-11-15 microcode
ema added a comment. Dec 11 2019, 11:06 AM

See: T240177 T237730 backup2001 was updated to new bios last time it crashed.

cp3053 too (T239041) and has been running fine since, FWIW.

To clarify the "too": backup2001 crashed before being upgraded, and again after.

faidon added a subscriber: faidon. Dec 13 2019, 3:55 PM

Note that R440s comprise 23.5% of the whole fleet, 84.1% of all servers purchased in the last 12 months, and 67.5% of all servers purchased in the last 24 months (I wish I had a graph!). Given this sample size, this may be just correlated to R440s and not specifically tied to them.

jcrespo added a comment. Edited Dec 13 2019, 4:44 PM

I believe the main concern is that it seems to happen only on recent batches, if I am not mistaken. Indeed the correlation could be based on CPU models or something else (kernel version).

Volans updated the task description. Dec 21 2019, 11:27 PM
ema updated the task description. Dec 29 2019, 10:55 AM

Mentioned in SAL (#wikimedia-operations) [2019-12-29T10:57:03Z] <ema> repool cp3061 T238305

ema added a comment. Dec 29 2019, 11:08 AM

cp3061 crashed today, yet another cache_upload node in esams, continuing the trend mentioned in T241306#5759233. DC-Ops: is there anything you can think of that differentiates esams upload hosts, cp30(5[13579]|6[135]), from text cp30(5[02468]|6[024])? An obvious one is network utilization, significantly higher on upload hosts, but maybe there's something else hardware-related that we're overlooking?
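The two hostname patterns above can be applied directly to the esams cp hosts listed in the task description. This is a minimal sketch: `esams_cache_role` is a hypothetical helper, and the `crashed` list is copied by hand from the description rather than pulled from any inventory.

```python
import re

# Patterns taken verbatim from the comment above, anchored to the short hostname.
UPLOAD_RE = re.compile(r"^cp30(5[13579]|6[135])$")
TEXT_RE = re.compile(r"^cp30(5[02468]|6[024])$")


def esams_cache_role(hostname):
    """Classify an esams cache host as 'upload', 'text' or 'other'."""
    short = hostname.split(".", 1)[0]  # tolerate FQDNs such as cp3065.esams.wmnet
    if UPLOAD_RE.match(short):
        return "upload"
    if TEXT_RE.match(short):
        return "text"
    return "other"


# esams cp hosts from the task description (hand-copied).
crashed = ["cp3053", "cp3057", "cp3065", "cp3063", "cp3055", "cp3051", "cp3061"]
roles = {host: esams_cache_role(host) for host in crashed}
print(roles)
```

Every esams host in that list matches the upload pattern, which is the trend the comment refers to.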

ema updated the task description. Dec 29 2019, 11:14 AM
Marostegui updated the task description. Jan 3 2020, 1:14 PM

In going through all the affected systems in this task, I'd like to treat db2125 and backup2001 separately, since they seem like one-offs and could very well be hardware issues (db2125 was 1 of 10 systems in the batch and backup2001 was ordered 1.5 years ago). Even the two eqiad systems, cp1077 and cp1087, have been around since May 2018, so those may or may not be related to the cp crashes in esams.

But if we were to focus on just the cp machines in esams, there's nothing from a racking/cabling perspective that we did differently between the odd-numbered (upload hosts) and even-numbered (text hosts) hostnames. They both share the same racks and the same asw switches. So if only the upload hosts are seeing issues, it seems like it might be something from that side. I can definitely reach out to Dell as well, to see if there are any known firmware issues, etc., though I can tell you the first thing they're going to want us to do beforehand is upgrade firmware on everything, then send them the logs from the diagnostics testing. Let me know if you guys want us to go that route. We could focus on just cp3055 for now too, since that one looks like its firmware has already been upgraded and is still seeing issues.

Thanks,
Willy

Mentioned in SAL (#wikimedia-operations) [2020-01-05T23:56:00Z] <effie> powecycle cp3065.esams.wmnet T238305

Mentioned in SAL (#wikimedia-operations) [2020-01-06T00:06:11Z] <effie> pool cp3065 T238305

ema updated the task description. Jan 6 2020, 1:24 PM

In going through all the affected systems in this task, I'd like to treat db2125 and backup2001 separately, since they seem like one-offs and could very well be hardware issues (db2125 was 1 of 10 systems in the batch and backup2001 was ordered 1.5 years ago). Even the two eqiad systems, cp1077 and cp1087, have been around since May 2018, so those may or may not be related to the cp crashes in esams.

backup2001 has crashed 3 times already (even with up-to-date BIOS and firmware), so I am not fully sure it should be treated separately. The crashes follow the exact same pattern we've seen so far (no OS logs and no HW logs either).
I don't think there is much else we can do without Dell's assistance (T240177#5727654), as there are no logs to provide or send.

Papaul added a subscriber: Papaul. Jan 7 2020, 11:50 PM

backup2001 is at BIOS version 1.3.7; last time we only did the IDRAC upgrade, since when the IDRAC version is not up to date we sometimes might not see any log at system crash. So I think we should start by getting all those servers to the latest BIOS and IDRAC firmware and go from there (see comment on T237730).

Thanks for the clarification. I thought we had also upgraded the BIOS. Let's start with that indeed.

ema added a comment. Jan 8 2020, 7:41 AM

sometimes when the IDRAC version is not up to date we might not see any log at system crash

Interesting!

so i think let us start by getting all those servers at the latest firmware BIOS and IDRAC and go from there (see comment on T237730 )

+1, thanks @Papaul

Jan 12 22:51:15 <icinga-wm>	PROBLEM - Host cp3065 is DOWN: PING CRITICAL - Packet loss = 100%
Jan 12 22:53:51 <icinga-wm>	PROBLEM - Host cp3061 is DOWN: PING CRITICAL - Packet loss = 100%

Perhaps a little close together in timing?
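To put a number on "close together", the two alert lines above can be parsed and subtracted. This is a rough sketch: `parse_down` is a hypothetical helper, the regular expression is fitted only to the two lines quoted here, and the year is an assumption because the log lines do not carry one.

```python
import re
from datetime import datetime

# Assumption: lines look like the two icinga-wm alerts quoted above.
DOWN_RE = re.compile(
    r"(?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}).*Host (?P<host>\S+) is DOWN"
)


def parse_down(line, year=2020):
    """Return (host, timestamp) for a 'Host X is DOWN' alert, or None."""
    match = DOWN_RE.search(line)
    if match is None:
        return None
    # The year is not in the log line; 2020 is assumed from the thread context.
    ts = datetime.strptime("%d %s" % (year, match.group("ts")), "%Y %b %d %H:%M:%S")
    return match.group("host"), ts


first = parse_down(
    "Jan 12 22:51:15 <icinga-wm>\tPROBLEM - Host cp3065 is DOWN: PING CRITICAL - Packet loss = 100%"
)
second = parse_down(
    "Jan 12 22:53:51 <icinga-wm>\tPROBLEM - Host cp3061 is DOWN: PING CRITICAL - Packet loss = 100%"
)

gap_seconds = (second[1] - first[1]).total_seconds()
print(first[0], second[0], gap_seconds)  # gap between the two alerts, in seconds
```

For these two lines the gap is 156 seconds. Bear in mind these are alert timestamps, so they only bound when each host actually stopped responding.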

Mentioned in SAL (#wikimedia-operations) [2020-01-13T00:22:18Z] <effie> depool and restart cp3065 cp3061 - T238305

jijiki added a subscriber: jijiki. Jan 13 2020, 12:59 AM

prometheus-trafficserver-tls-exporter.service initially failed to start on both cp3065 and cp3061 after reboot

Mentioned in SAL (#wikimedia-operations) [2020-01-18T04:15:53Z] <cdanis> cp3065.mgmt: /admin1-> racadm serveraction hardreset T238305

03:16:58	<+icinga-wm>	PROBLEM - Host cp3065 is DOWN: PING CRITICAL - Packet loss = 100%

Nothing in racadm getsel or racadm lclog view (latter just has me logging in over SSH).

Mentioned in SAL (#wikimedia-operations) [2020-01-19T00:46:30Z] <cdanis> T238305 cp3053.mgmt /admin1-> racadm serveraction hardreset

CDanis added a comment. Edited Jan 19 2020, 12:46 AM
22:22:06	<+icinga-wm>	PROBLEM - Host cp3053 is DOWN: PING CRITICAL - Packet loss = 100%

nothing in logs as usual

Is there any action plan to investigate these issues?

Currently T242579 is our only hope of getting more information about this issue

18:17:56 <+icinga-wm> PROBLEM - Host cp3061 is DOWN: PING CRITICAL - Packet loss = 100%
Might be another case...

Mentioned in SAL (#wikimedia-traffic) [2020-01-20T08:07:05Z] <ema> powercycle cp3061 T238305

Mentioned in SAL (#wikimedia-operations) [2020-01-26T21:38:22Z] <vgutierrez> powercycling cp3051 - T238305

ema added a comment. Feb 3 2020, 9:25 AM

Thanks to netconsole (T242579) we finally managed to get the kernel oops of two upload@esams crashes.
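Since netconsole interleaves the original BUG with the warnings triggered by netconsole's own transmit path, it can help to filter a capture down to the `kernel BUG at` line and the faulting `RIP` symbol. The sketch below is only an illustration: `summarise_oops` is a hypothetical helper and both regular expressions are assumptions fitted to the 4.9-era oops format seen in this task.

```python
import re

# Assumption: oops format as emitted by the 4.9 kernels in this task.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
RIP_RE = re.compile(r"RIP: .*\] (?P<sym>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarise_oops(lines):
    """Return the first BUG location and first RIP symbol found in a capture."""
    summary = {"bug_file": None, "bug_line": None, "rip_symbol": None}
    for line in lines:
        bug = BUG_RE.search(line)
        if bug and summary["bug_file"] is None:
            summary["bug_file"] = bug.group("file")
            summary["bug_line"] = int(bug.group("line"))
        rip = RIP_RE.search(line)
        if rip and summary["rip_symbol"] is None:
            summary["rip_symbol"] = rip.group("sym")
    return summary


capture = [
    "Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.541394] kernel BUG at /build/linux-sdMcHj/linux-4.9.189/net/core/skbuff.c:1212!",
    "Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.727239] RIP: 0010:[<ffffffffb14fef9b>]  [<ffffffffb14fef9b>] pskb_expand_head+0x22b/0x230",
]
summary = summarise_oops(capture)
print(summary)
```

Run against the two lines taken from the cp3051 capture, this reports `net/core/skbuff.c` line 1212 and `pskb_expand_head`, leaving out the secondary `WARNING` traces from `netpoll` and `softirq`.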

cp3051 crashing:

Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.536600] ------------[ cut here ]------------
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.541394] kernel BUG at /build/linux-sdMcHj/linux-4.9.189/net/core/skbuff.c:1212!
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.549208] invalid opcode: 0000 [#1] SMP
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.553392] Modules linked in: netconsole configfs sctp_diag sctp tcp_diag udp_diag inet_diag binfmt_misc unix_diag cpufreq_conservative cpufreq_powersave cpufre
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579725] WARNING: CPU: 8 PID: 49103 at /build/linux-sdMcHj/linux-4.9.189/net/core/netpoll.c:171 netpoll_poll_dev+0x197/0x1a0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579730] bnxt_poll+0x0/0xd0 [bnxt_en] exceeded budget in poll
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579731] Modules linked in:
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579732]  netconsole configfs sctp_diag sctp tcp_diag udp_diag inet_diag binfmt_misc unix_diag cpufreq_conservative cpufreq_powersave cpufreq_userspace intel_
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579782] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 2.2.11 06/14/2019
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579783]  0000000000000000
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579784]  ffffffffb13353d4 ffff994cbf303738 0000000000000000 ffffffffb107a83b
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579785]  ffff994c9f908060 ffff994cbf303790 ffff997be9fe79c8 0000000000000001
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579787]  ffff997c89056368 ffff997c9f119c00 ffffffffb107a8bfCall Trace:
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579790]  <IRQ>
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579796]  [<ffffffffb13353d4>] ? dump_stack+0x5c/0x78
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579798]  [<ffffffffb107a83b>] ? __warn+0xcb/0xf0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579799]  [<ffffffffb107a8bf>] ? warn_slowpath_fmt+0x5f/0x80
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579802]  [<ffffffffc031dfdf>] ? bnxt_poll+0x7f/0xd0 [bnxt_en]
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579804]  [<ffffffffc031df60>] ? bnxt_poll_work+0x520/0x520 [bnxt_en]
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579805]  [<ffffffffb1533af7>] ? netpoll_poll_dev+0x197/0x1a0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579806]  [<ffffffffb1533c05>] ? netpoll_send_skb_on_dev+0x105/0x270
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579808]  [<ffffffffb153405c>] ? netpoll_send_udp+0x2ec/0x450
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579812]  [<ffffffffc0508bb5>] ? write_msg+0xb5/0xf0 [netconsole]
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579817]  [<ffffffffb10d2081>] ? call_console_drivers.isra.18.constprop.25+0xf1/0x100
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579819]  [<ffffffffb10d2a54>] ? console_unlock+0x404/0x610
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579821]  [<ffffffffb10d2f76>] ? vprintk_emit+0x316/0x4d0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579826]  [<ffffffffb1181e25>] ? printk+0x5a/0x76
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579829]  [<ffffffffb109c237>] ? notifier_call_chain+0x47/0x70
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579832]  [<ffffffffb11061a7>] ? print_modules+0x97/0xc0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579837]  [<ffffffffb1029991>] ? __die+0x91/0xe0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579839]  [<ffffffffb1029d43>] ? die+0x33/0x60
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579841]  [<ffffffffb10274a6>] ? do_error_trap+0x86/0x100
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579845]  [<ffffffffb14fef9b>] ? pskb_expand_head+0x22b/0x230
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579848]  [<ffffffffb15d106d>] ? ip6_pol_route+0x39d/0x730
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579849]  [<ffffffffb15d1420>] ? ip6_pol_route_input+0x20/0x20
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579854]  [<ffffffffb161d17e>] ? invalid_op+0x1e/0x30
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579856]  [<ffffffffb14fef9b>] ? pskb_expand_head+0x22b/0x230
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579858]  [<ffffffffb14ffa5d>] ? __pskb_pull_tail+0x4d/0x3f0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579861]  [<ffffffffb15bf059>] ? ip6_dst_lookup_tail+0x309/0x440
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579864]  [<ffffffffb15f8673>] ? _decode_session6+0x243/0x3d0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579867]  [<ffffffffb15aa704>] ? __xfrm_decode_session+0x34/0x50
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579868]  [<ffffffffb15e274d>] ? icmpv6_route_lookup+0xed/0x1d0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579869]  [<ffffffffb15e3232>] ? icmp6_send+0x672/0xa00
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579872]  [<ffffffffb118a002>] ? free_one_page+0x2a2/0x370
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579874]  [<ffffffffb15fd1d0>] ? icmpv6_send+0x20/0x30
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579877]  [<ffffffffb15e9fd2>] ? ip6_frag_expire+0x112/0x120
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579878]  [<ffffffffb15e9ec0>] ? ip6frag_obj_hashfn+0xb0/0xb0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579880]  [<ffffffffb10e9262>] ? call_timer_fn+0x32/0x120
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579882]  [<ffffffffb10e95d7>] ? run_timer_softirq+0x1d7/0x430
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579884]  [<ffffffffb133e534>] ? timerqueue_add+0x54/0xa0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579886]  [<ffffffffb10eb2c8>] ? enqueue_hrtimer+0x38/0x80
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579887]  [<ffffffffb16200ad>] ? __do_softirq+0x10d/0x2b0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579889]  [<ffffffffb1080e22>] ? irq_exit+0xc2/0xd0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579890]  [<ffffffffb161fb2c>] ? smp_apic_timer_interrupt+0x4c/0x60
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579892]  [<ffffffffb161e25e>] ? apic_timer_interrupt+0x9e/0xb0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579893]  <EOI>
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579896]  [<ffffffffb122b719>] ? __fget+0x59/0x90
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579897]  [<ffffffffb122bc81>] ? __fget_light+0x21/0x60
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579899]  [<ffffffffb122c553>] ? __fdget_pos+0x13/0x50
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579901]  [<ffffffffb120d94a>] ? SyS_write+0x2a/0xd0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579904]  [<ffffffffb1003b7d>] ? do_syscall_64+0x8d/0x100
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579905]  [<ffffffffb161c3ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.579907] ---[ end trace 9267c0b147a37015 ]---
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616709] ------------[ cut here ]------------
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616712] WARNING: CPU: 8 PID: 49103 at /build/linux-sdMcHj/linux-4.9.189/kernel/softirq.c:165 __local_bh_enable_ip+0x6d/0xa0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616712] Modules linked in:
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616713]  netconsole configfs sctp_diag sctp tcp_diag udp_diag inet_diag binfmt_misc unix_diag cpufreq_conservative cpufreq_powersave cpufreq_userspace intel_
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616744] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 2.2.11 06/14/2019
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616744]  0000000000000000
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616745]  ffffffffb13353d4 0000000000000000 0000000000000000 ffffffffb107a83b
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616747]  0000000000000200 0000000000000000 0000000000000000 ffff994ca0be909c
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616748]  ffff9923295372e8 0000000000000000 ffffffffb108066dCall Trace:
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616750]  <IRQ>
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616752]  [<ffffffffb13353d4>] ? dump_stack+0x5c/0x78
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616753]  [<ffffffffb107a83b>] ? __warn+0xcb/0xf0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616755]  [<ffffffffb108066d>] ? __local_bh_enable_ip+0x6d/0xa0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616759]  [<ffffffffb1514098>] ? __dev_queue_xmit+0x2e8/0x790
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616762]  [<ffffffffb1557c3b>] ? ip_finish_output2+0x2cb/0x430
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616764]  [<ffffffffb14faa71>] ? skb_gso_validate_mtu+0x11/0x80
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616766]  [<ffffffffb15589da>] ? ip_output+0x6a/0x100
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616767]  [<ffffffffb10ea5c7>] ? mod_timer+0x177/0x3b0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616768]  [<ffffffffb1558157>] ? ip_local_out+0x17/0x40
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616770]  [<ffffffffb155848f>] ? ip_queue_xmit+0x13f/0x360
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616773]  [<ffffffffb1570e7a>] ? __tcp_transmit_skb+0x52a/0x9b0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616775]  [<ffffffffb15716ca>] ? tcp_write_xmit+0x3ca/0xf90
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616776]  [<ffffffffb15704a3>] ? tcp_current_mss+0x63/0xa0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616778]  [<ffffffffb15722bd>] ? __tcp_push_pending_frames+0x2d/0xd0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616779]  [<ffffffffb156d7cd>] ? tcp_rcv_established+0x24d/0x6c0
Jan 26 21:20:27 ganeti3002 nc.openbsd[14771]: [3097828.616781]  [<ffffffffb152a02d>] ? sk_filter_trim_cap+0x2d/0x290
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616783]  [<ffffffffb1578933>] ? tcp_v4_do_rcv+0x133/0x200
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616784]  [<ffffffffb1579fd9>] ? tcp_v4_rcv+0x889/0x980
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616786]  [<ffffffffb1553007>] ? ip_local_deliver_finish+0x97/0x1c0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616787]  [<ffffffffb15532cb>] ? ip_local_deliver+0x6b/0xf0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616789]  [<ffffffffb154d270>] ? rt_cpu_seq_stop+0x10/0x10
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616790]  [<ffffffffb1579746>] ? tcp_v4_early_demux+0x136/0x140
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616792]  [<ffffffffb1552cd6>] ? ip_rcv_finish+0x176/0x410
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616793]  [<ffffffffb15535e4>] ? ip_rcv+0x294/0x380
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616796]  [<ffffffffb1603e80>] ? packet_rcv+0x40/0x430
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616798]  [<ffffffffb15114ad>] ? __netif_receive_skb_core+0x51d/0xa40
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616801]  [<ffffffffb158f134>] ? inet_gro_receive+0x234/0x270
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616803]  [<ffffffffb1511a4f>] ? netif_receive_skb_internal+0x2f/0xa0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616805]  [<ffffffffb1512898>] ? napi_gro_receive+0xb8/0xe0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616808]  [<ffffffffc031cccd>] ? bnxt_rx_pkt+0x60d/0x1140 [bnxt_en]
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616810]  [<ffffffffc031dbc5>] ? bnxt_poll_work+0x185/0x520 [bnxt_en]
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616813]  [<ffffffffb10d606d>] ? __synchronize_hardirq+0x3d/0x50
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616815]  [<ffffffffc031dfdf>] ? bnxt_poll+0x7f/0xd0 [bnxt_en]
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616816]  [<ffffffffb1533a74>] ? netpoll_poll_dev+0x114/0x1a0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616817]  [<ffffffffb1533c05>] ? netpoll_send_skb_on_dev+0x105/0x270
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616819]  [<ffffffffb153405c>] ? netpoll_send_udp+0x2ec/0x450
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616821]  [<ffffffffc0508bb5>] ? write_msg+0xb5/0xf0 [netconsole]
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616823]  [<ffffffffb10d2081>] ? call_console_drivers.isra.18.constprop.25+0xf1/0x100
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616825]  [<ffffffffb10d2890>] ? console_unlock+0x240/0x610
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616827]  [<ffffffffb10d2f76>] ? vprintk_emit+0x316/0x4d0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616829]  [<ffffffffb1181e25>] ? printk+0x5a/0x76
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616830]  [<ffffffffb109c237>] ? notifier_call_chain+0x47/0x70
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616832]  [<ffffffffb11061a7>] ? print_modules+0x97/0xc0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616834]  [<ffffffffb1029991>] ? __die+0x91/0xe0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616836]  [<ffffffffb1029d43>] ? die+0x33/0x60
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616837]  [<ffffffffb10274a6>] ? do_error_trap+0x86/0x100
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616839]  [<ffffffffb14fef9b>] ? pskb_expand_head+0x22b/0x230
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616841]  [<ffffffffb15d106d>] ? ip6_pol_route+0x39d/0x730
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616842]  [<ffffffffb15d1420>] ? ip6_pol_route_input+0x20/0x20
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616844]  [<ffffffffb161d17e>] ? invalid_op+0x1e/0x30
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616846]  [<ffffffffb14fef9b>] ? pskb_expand_head+0x22b/0x230
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616847]  [<ffffffffb14ffa5d>] ? __pskb_pull_tail+0x4d/0x3f0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616849]  [<ffffffffb15bf059>] ? ip6_dst_lookup_tail+0x309/0x440
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616850]  [<ffffffffb15f8673>] ? _decode_session6+0x243/0x3d0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616852]  [<ffffffffb15aa704>] ? __xfrm_decode_session+0x34/0x50
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616853]  [<ffffffffb15e274d>] ? icmpv6_route_lookup+0xed/0x1d0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616854]  [<ffffffffb15e3232>] ? icmp6_send+0x672/0xa00
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616855]  [<ffffffffb118a002>] ? free_one_page+0x2a2/0x370
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616857]  [<ffffffffb15fd1d0>] ? icmpv6_send+0x20/0x30
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616859]  [<ffffffffb15e9fd2>] ? ip6_frag_expire+0x112/0x120
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616860]  [<ffffffffb15e9ec0>] ? ip6frag_obj_hashfn+0xb0/0xb0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616861]  [<ffffffffb10e9262>] ? call_timer_fn+0x32/0x120
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616862]  [<ffffffffb10e95d7>] ? run_timer_softirq+0x1d7/0x430
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616864]  [<ffffffffb133e534>] ? timerqueue_add+0x54/0xa0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616865]  [<ffffffffb10eb2c8>] ? enqueue_hrtimer+0x38/0x80
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616866]  [<ffffffffb16200ad>] ? __do_softirq+0x10d/0x2b0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616868]  [<ffffffffb1080e22>] ? irq_exit+0xc2/0xd0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616869]  [<ffffffffb161fb2c>] ? smp_apic_timer_interrupt+0x4c/0x60
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616871]  [<ffffffffb161e25e>] ? apic_timer_interrupt+0x9e/0xb0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616871]  <EOI>
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616873]  [<ffffffffb122b719>] ? __fget+0x59/0x90
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616874]  [<ffffffffb122bc81>] ? __fget_light+0x21/0x60
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616876]  [<ffffffffb122c553>] ? __fdget_pos+0x13/0x50
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616877]  [<ffffffffb120d94a>] ? SyS_write+0x2a/0xd0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616879]  [<ffffffffb1003b7d>] ? do_syscall_64+0x8d/0x100
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616881]  [<ffffffffb161c3ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097828.616882] ---[ end trace 9267c0b147a37016 ]---
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.642876]  evdev irqbypass crct10dif_pclmul crc32_pclmul mgag200 ghash_clmulni_intel ttm drm_kms_helper drm i2c_algo_bit mei_me pcspkr iTCO_wdt sg lpc_ich iTCO
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.702769] CPU: 8 PID: 49103 Comm: [ET_NET 60] Tainted: G        W       4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u1
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.713397] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 2.2.11 06/14/2019
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.721132] task: ffff997be94e62c0 task.stack: ffffb6f6f4f0c000
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.727239] RIP: 0010:[<ffffffffb14fef9b>]  [<ffffffffb14fef9b>] pskb_expand_head+0x22b/0x230
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.735978] RSP: 0018:ffff994cbf303bf0  EFLAGS: 00010202
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.741486] RAX: 0000000000000002 RBX: 0000000000000541 RCX: 0000000002080020
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.748815] RDX: 000000000000063f RSI: 0000000000000000 RDI: ffff994be29b0300
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.756150] RBP: ffff994cbf303c60 R08: ffff992165f9d14e R09: 0000000000000000
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.763482] R10: 0000000000000000 R11: 0000000000000000 R12: ffff994be29b0300
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.770813] R13: 0000000000000541 R14: ffff994be29b0300 R15: 0000000000000570
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.778150] FS:  00002ad3f5f17700(0000) GS:ffff994cbf300000(0000) knlGS:0000000000000000
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.786440] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.792378] CR2: 00007fadbbd2e000 CR3: 0000005f5a4f2000 CR4: 0000000000760670
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.799708] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.807016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.814332] PKRU: 55555554
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.817225] Stack:
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.819432]  0000000000000541 ffff994cbf303c60 ffff994be29b0300 0000000000000541
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.827179]  0000000000000570 0000000000000570 ffffffffb14ffa5d 0000000000000000
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.834920]  ffffffffb15bf059 ffff994be29b0300 ffff994cbf303cd8 0000000000000006
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.842691] Call Trace:
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.845338]  <IRQ> [3097829.847458]  [<ffffffffb14ffa5d>] ? __pskb_pull_tail+0x4d/0x3f0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.853590]  [<ffffffffb15bf059>] ? ip6_dst_lookup_tail+0x309/0x440
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.860035]  [<ffffffffb15f8673>] ? _decode_session6+0x243/0x3d0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.866248]  [<ffffffffb15aa704>] ? __xfrm_decode_session+0x34/0x50
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.872696]  [<ffffffffb15e274d>] ? icmpv6_route_lookup+0xed/0x1d0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.879099]  [<ffffffffb15e3232>] ? icmp6_send+0x672/0xa00
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.884784]  [<ffffffffb118a002>] ? free_one_page+0x2a2/0x370
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.890708]  [<ffffffffb15fd1d0>] ? icmpv6_send+0x20/0x30
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.896310]  [<ffffffffb15e9fd2>] ? ip6_frag_expire+0x112/0x120
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.902420]  [<ffffffffb15e9ec0>] ? ip6frag_obj_hashfn+0xb0/0xb0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.908603]  [<ffffffffb10e9262>] ? call_timer_fn+0x32/0x120
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.914438]  [<ffffffffb10e95d7>] ? run_timer_softirq+0x1d7/0x430
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.920736]  [<ffffffffb133e534>] ? timerqueue_add+0x54/0xa0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.926569]  [<ffffffffb10eb2c8>] ? enqueue_hrtimer+0x38/0x80
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.932517]  [<ffffffffb16200ad>] ? __do_softirq+0x10d/0x2b0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.938383]  [<ffffffffb1080e22>] ? irq_exit+0xc2/0xd0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.943714]  [<ffffffffb161fb2c>] ? smp_apic_timer_interrupt+0x4c/0x60
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.950443]  [<ffffffffb161e25e>] ? apic_timer_interrupt+0x9e/0xb0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.956839]  <EOI> [3097829.958979]  [<ffffffffb122b719>] ? __fget+0x59/0x90
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.964139]  [<ffffffffb122bc81>] ? __fget_light+0x21/0x60
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.969823]  [<ffffffffb122c553>] ? __fdget_pos+0x13/0x50
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.975410]  [<ffffffffb120d94a>] ? SyS_write+0x2a/0xd0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.980826]  [<ffffffffb1003b7d>] ? do_syscall_64+0x8d/0x100
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.986669]  [<ffffffffb161c3ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097829.993840] Code: 00 00 00 49 03 96 d0 00 00 00 e9 5c ff ff ff b8 f4 ff ff ff e9 22 ff ff ff 4c 89 e7 e8 bf c4 ce ff b8 f4 ff ff ff e9 10 ff ff ff <0f> 0b 0f 1f 
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.015607] RIP  [<ffffffffb14fef9b>] pskb_expand_head+0x22b/0x230
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.022016]  RSP <ffff994cbf303bf0>
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.028201] ---[ end trace 9267c0b147a37017 ]---
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.058638] Kernel panic - not syncing: Fatal exception in interrupt
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.065349] Kernel Offset: 0x30000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.101647] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.109091] ------------[ cut here ]------------
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.113897] WARNING: CPU: 8 PID: 49103 at /build/linux-sdMcHj/linux-4.9.189/arch/x86/kernel/smp.c:128 check_preempt_curr+0x4e/0x90
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.125790] Modules linked in: netconsole configfs sctp_diag sctp tcp_diag udp_diag inet_diag binfmt_misc unix_diag cpufreq_conservative cpufreq_powersave cpufre
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.210699] CPU: 8 PID: 49103 Comm: [ET_NET 60] Tainted: G      D W       4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u1
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.221287] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 2.2.11 06/14/2019
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.229025]  0000000000000000 ffffffffb13353d4 0000000000000000 0000000000000000
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.236764]  ffffffffb107a83b ffff994cbf218980 ffff997c87472780 ffff994cbf218980
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.244503]  0000000000000004 0000000000000046 ffff994cbf218980 ffffffffb10a506e
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.252240] Call Trace:
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.254879]  <IRQ> [3097830.257000]  [<ffffffffb13353d4>] ? dump_stack+0x5c/0x78
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.262497]  [<ffffffffb107a83b>] ? __warn+0xcb/0xf0
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.267641]  [<ffffffffb10a506e>] ? check_preempt_curr+0x4e/0x90
Jan 26 21:20:28 ganeti3002 nc.openbsd[14771]: [3097830.273821]  [<ffffffffb10a50c4>] ? ttwu_do_wakeup+0x14/0xe0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.279659]  [<ffffffffb10a5e1a>] ? try_to_wake_up+0x18a/0x3c0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.285670]  [<ffffffffb10bdb13>] ? autoremove_wake_function+0x13/0x40
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.292368]  [<ffffffffb10bd53f>] ? __wake_up_common+0x4f/0x90
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.298378]  [<ffffffffb10bd5b4>] ? __wake_up+0x34/0x50
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.303786]  [<ffffffffb1160759>] ? irq_work_run_list+0x49/0x70
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.309882]  [<ffffffffb10fa740>] ? tick_sched_do_timer+0x30/0x30
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.316148]  [<ffffffffb10ead8b>] ? update_process_times+0x3b/0x50
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.322501]  [<ffffffffb10fa140>] ? tick_sched_handle.isra.12+0x20/0x50
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.329289]  [<ffffffffb10fa778>] ? tick_sched_timer+0x38/0x70
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.335299]  [<ffffffffb10eb84e>] ? __hrtimer_run_queues+0xde/0x250
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.341739]  [<ffffffffb10ebf2c>] ? hrtimer_interrupt+0x9c/0x1a0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.347925]  [<ffffffffb161fb27>] ? smp_apic_timer_interrupt+0x47/0x60
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.354624]  [<ffffffffb161e25e>] ? apic_timer_interrupt+0x9e/0xb0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.360978]  [<ffffffffb1181c04>] ? panic+0x1fc/0x242
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.366207]  [<ffffffffb1181bfd>] ? panic+0x1f5/0x242
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.371442]  [<ffffffffb10298f2>] ? oops_end+0xc2/0xd0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.376757]  [<ffffffffb10274a6>] ? do_error_trap+0x86/0x100
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.382596]  [<ffffffffb14fef9b>] ? pskb_expand_head+0x22b/0x230
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.388780]  [<ffffffffb15d106d>] ? ip6_pol_route+0x39d/0x730
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.394697]  [<ffffffffb15d1420>] ? ip6_pol_route_input+0x20/0x20
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.400967]  [<ffffffffb161d17e>] ? invalid_op+0x1e/0x30
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.406458]  [<ffffffffb14fef9b>] ? pskb_expand_head+0x22b/0x230
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.412640]  [<ffffffffb14ffa5d>] ? __pskb_pull_tail+0x4d/0x3f0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.418736]  [<ffffffffb15bf059>] ? ip6_dst_lookup_tail+0x309/0x440
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.425178]  [<ffffffffb15f8673>] ? _decode_session6+0x243/0x3d0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.431360]  [<ffffffffb15aa704>] ? __xfrm_decode_session+0x34/0x50
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.437801]  [<ffffffffb15e274d>] ? icmpv6_route_lookup+0xed/0x1d0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.444155]  [<ffffffffb15e3232>] ? icmp6_send+0x672/0xa00
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.449821]  [<ffffffffb118a002>] ? free_one_page+0x2a2/0x370
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.455742]  [<ffffffffb15fd1d0>] ? icmpv6_send+0x20/0x30
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.461319]  [<ffffffffb15e9fd2>] ? ip6_frag_expire+0x112/0x120
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.467415]  [<ffffffffb15e9ec0>] ? ip6frag_obj_hashfn+0xb0/0xb0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.473596]  [<ffffffffb10e9262>] ? call_timer_fn+0x32/0x120
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.479431]  [<ffffffffb10e95d7>] ? run_timer_softirq+0x1d7/0x430
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.485703]  [<ffffffffb133e534>] ? timerqueue_add+0x54/0xa0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.491536]  [<ffffffffb10eb2c8>] ? enqueue_hrtimer+0x38/0x80
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.497458]  [<ffffffffb16200ad>] ? __do_softirq+0x10d/0x2b0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.503296]  [<ffffffffb1080e22>] ? irq_exit+0xc2/0xd0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.508613]  [<ffffffffb161fb2c>] ? smp_apic_timer_interrupt+0x4c/0x60
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.515316]  [<ffffffffb161e25e>] ? apic_timer_interrupt+0x9e/0xb0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.521668]  <EOI> [3097830.523788]  [<ffffffffb122b719>] ? __fget+0x59/0x90
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.528940]  [<ffffffffb122bc81>] ? __fget_light+0x21/0x60
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.534605]  [<ffffffffb122c553>] ? __fdget_pos+0x13/0x50
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.540183]  [<ffffffffb120d94a>] ? SyS_write+0x2a/0xd0
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.545589]  [<ffffffffb1003b7d>] ? do_syscall_64+0x8d/0x100
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.551421]  [<ffffffffb161c3ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Jan 26 21:20:29 ganeti3002 nc.openbsd[14771]: [3097830.558554] ---[ end trace 9267c0b147a37018 ]---

cp3063 crashing:

Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.696503] ------------[ cut here ]------------
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.701313] kernel BUG at /build/linux-sdMcHj/linux-4.9.189/net/core/skbuff.c:1212!
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.709166] invalid opcode: 0000 [#1] SMP
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.713351] Modules linked in: netconsole configfs binfmt_misc unix_diag cpufreq_conservative cpufreq_userspace cpufreq_powersave intel_rapl skx_edac edac_core s
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.792897] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u1
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.801797] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 2.2.11 06/14/2019
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.809534] task: ffff902aa457a040 task.stack: ffffa19e0025c000
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.815611] RIP: 0010:[<ffffffffa6afef9b>]  [<ffffffffa6afef9b>] pskb_expand_head+0x22b/0x230
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.824344] RSP: 0018:ffff902abd403bf0  EFLAGS: 00010202
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.829818] RAX: 0000000000000002 RBX: 0000000000000271 RCX: 0000000002080020
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.837136] RDX: 000000000000036f RSI: 0000000000000000 RDI: ffff8ff694102e00
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.844433] RBP: ffff902abd403c60 R08: ffff8fd3325159ce R09: 0000000000000000
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.851747] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8ff694102e00
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.859059] R13: 0000000000000271 R14: ffff8ff694102e00 R15: 00000000000002a0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.866359] FS:  0000000000000000(0000) GS:ffff902abd400000(0000) knlGS:0000000000000000
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.874609] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.880547] CR2: 00007f87062ff000 CR3: 00000030fd008000 CR4: 0000000000760670
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.887860] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.895183] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.902484] PKRU: 55555554
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.905370] Stack:
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.907556]  0000000000000271 ffff902abd403c60 ffff8ff694102e00 0000000000000271
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.915214]  00000000000002a0 00000000000002a0 ffffffffa6affa5d 0000000000000000
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.922868]  ffffffffa6bbf059 ffff8ff694102e00 ffff902abd403cd8 0000000000000006
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.930520] Call Trace:
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.933157]  <IRQ> [5583162.935286]  [<ffffffffa6affa5d>] ? __pskb_pull_tail+0x4d/0x3f0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.941384]  [<ffffffffa6bbf059>] ? ip6_dst_lookup_tail+0x309/0x440
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.947851]  [<ffffffffa6bf8673>] ? _decode_session6+0x243/0x3d0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.954049]  [<ffffffffa6baa704>] ? __xfrm_decode_session+0x34/0x50
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.960474]  [<ffffffffa6be274d>] ? icmpv6_route_lookup+0xed/0x1d0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.966811]  [<ffffffffa6be3232>] ? icmp6_send+0x672/0xa00
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.972474]  [<ffffffffa66b772c>] ? load_balance+0x1cc/0xa00
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.978297]  [<ffffffffa6bfd1d0>] ? icmpv6_send+0x20/0x30
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.983907]  [<ffffffffa6be9fd2>] ? ip6_frag_expire+0x112/0x120
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.989994]  [<ffffffffa6be9ec0>] ? ip6frag_obj_hashfn+0xb0/0xb0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583162.996217]  [<ffffffffa66e9262>] ? call_timer_fn+0x32/0x120
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.002037]  [<ffffffffa66e95d7>] ? run_timer_softirq+0x1d7/0x430
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.008325]  [<ffffffffa66fa740>] ? tick_sched_do_timer+0x30/0x30
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.014594]  [<ffffffffa693e534>] ? timerqueue_add+0x54/0xa0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.020426]  [<ffffffffa66eb2c8>] ? enqueue_hrtimer+0x38/0x80
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.026341]  [<ffffffffa6c200ad>] ? __do_softirq+0x10d/0x2b0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.032173]  [<ffffffffa6680e22>] ? irq_exit+0xc2/0xd0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.037477]  [<ffffffffa6c1fb2c>] ? smp_apic_timer_interrupt+0x4c/0x60
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.044188]  [<ffffffffa6c1e25e>] ? apic_timer_interrupt+0x9e/0xb0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.050542]  <EOI> [5583163.052645]  [<ffffffffa6adcba2>] ? cpuidle_enter_state+0xa2/0x2d0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.059031]  [<ffffffffa6adcb90>] ? cpuidle_enter_state+0x90/0x2d0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.065375]  [<ffffffffa66be294>] ? cpu_startup_entry+0x154/0x240
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.071650]  [<ffffffffa664aa50>] ? start_secondary+0x170/0x1b0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.077759] Code: 00 00 00 49 03 96 d0 00 00 00 e9 5c ff ff ff b8 f4 ff ff ff e9 22 ff ff ff 4c 89 e7 e8 bf c4 ce ff b8 f4 ff ff ff e9 10 ff ff ff <0f> 0b 0f 1f 
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.098432] RIP  [<ffffffffa6afef9b>] pskb_expand_head+0x22b/0x230
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.104804]  RSP <ffff902abd403bf0>
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.110961] ---[ end trace cda9b97aff5419be ]---
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.139307] Kernel panic - not syncing: Fatal exception in interrupt
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.145971] Kernel Offset: 0x25600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.180567] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.187864] ------------[ cut here ]------------
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.192652] WARNING: CPU: 1 PID: 0 at /build/linux-sdMcHj/linux-4.9.189/arch/x86/kernel/smp.c:128 update_process_times+0x40/0x50
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.204347] Modules linked in: netconsole configfs binfmt_misc unix_diag cpufreq_conservative cpufreq_userspace cpufreq_powersave intel_rapl skx_edac edac_core s
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.283749] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D         4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u1
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.293801] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 2.2.11 06/14/2019
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.301523]  0000000000000000 ffffffffa69353d4 0000000000000000 0000000000000000
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.309174]  ffffffffa667a83b ffff902aa457a040 0000000000000000 ffff902abd403938
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.316825]  ffffffffa66fa740 0000000000000003 ffff902abd414ce8 ffffffffa66ead90
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.324480] Call Trace:
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.327101]  <IRQ> [5583163.329205]  [<ffffffffa69353d4>] ? dump_stack+0x5c/0x78
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.334682]  [<ffffffffa667a83b>] ? __warn+0xcb/0xf0
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.339811]  [<ffffffffa66fa740>] ? tick_sched_do_timer+0x30/0x30
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.346061]  [<ffffffffa66ead90>] ? update_process_times+0x40/0x50
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.352400]  [<ffffffffa66fa140>] ? tick_sched_handle.isra.12+0x20/0x50
Jan 31 01:40:23 ganeti3002 nc.openbsd[14771]: [5583163.359168]  [<ffffffffa66fa778>] ? tick_sched_timer+0x38/0x70
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.365161]  [<ffffffffa66eb84e>] ? __hrtimer_run_queues+0xde/0x250
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.371586]  [<ffffffffa66ebf2c>] ? hrtimer_interrupt+0x9c/0x1a0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.377752]  [<ffffffffa6c1fb27>] ? smp_apic_timer_interrupt+0x47/0x60
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.384434]  [<ffffffffa6c1e25e>] ? apic_timer_interrupt+0x9e/0xb0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.390772]  [<ffffffffa6781c04>] ? panic+0x1fc/0x242
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.395984]  [<ffffffffa6781bfd>] ? panic+0x1f5/0x242
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.401202]  [<ffffffffa66298f2>] ? oops_end+0xc2/0xd0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.406498]  [<ffffffffa66274a6>] ? do_error_trap+0x86/0x100
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.412318]  [<ffffffffa6afef9b>] ? pskb_expand_head+0x22b/0x230
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.418483]  [<ffffffffa6bd106d>] ? ip6_pol_route+0x39d/0x730
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.424387]  [<ffffffffa6bd1420>] ? ip6_pol_route_input+0x20/0x20
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.430641]  [<ffffffffa6c1d17e>] ? invalid_op+0x1e/0x30
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.436113]  [<ffffffffa6afef9b>] ? pskb_expand_head+0x22b/0x230
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.442277]  [<ffffffffa6affa5d>] ? __pskb_pull_tail+0x4d/0x3f0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.448358]  [<ffffffffa6bbf059>] ? ip6_dst_lookup_tail+0x309/0x440
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.454782]  [<ffffffffa6bf8673>] ? _decode_session6+0x243/0x3d0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.460947]  [<ffffffffa6baa704>] ? __xfrm_decode_session+0x34/0x50
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.467368]  [<ffffffffa6be274d>] ? icmpv6_route_lookup+0xed/0x1d0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.473708]  [<ffffffffa6be3232>] ? icmp6_send+0x672/0xa00
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.479354]  [<ffffffffa66b772c>] ? load_balance+0x1cc/0xa00
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.485173]  [<ffffffffa6bfd1d0>] ? icmpv6_send+0x20/0x30
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.490732]  [<ffffffffa6be9fd2>] ? ip6_frag_expire+0x112/0x120
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.496809]  [<ffffffffa6be9ec0>] ? ip6frag_obj_hashfn+0xb0/0xb0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.502975]  [<ffffffffa66e9262>] ? call_timer_fn+0x32/0x120
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.508793]  [<ffffffffa66e95d7>] ? run_timer_softirq+0x1d7/0x430
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.515044]  [<ffffffffa66fa740>] ? tick_sched_do_timer+0x30/0x30
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.521297]  [<ffffffffa693e534>] ? timerqueue_add+0x54/0xa0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.527117]  [<ffffffffa66eb2c8>] ? enqueue_hrtimer+0x38/0x80
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.533021]  [<ffffffffa6c200ad>] ? __do_softirq+0x10d/0x2b0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.538841]  [<ffffffffa6680e22>] ? irq_exit+0xc2/0xd0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.544141]  [<ffffffffa6c1fb2c>] ? smp_apic_timer_interrupt+0x4c/0x60
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.550825]  [<ffffffffa6c1e25e>] ? apic_timer_interrupt+0x9e/0xb0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.557161]  <EOI> [5583163.559264]  [<ffffffffa6adcba2>] ? cpuidle_enter_state+0xa2/0x2d0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.565610]  [<ffffffffa6adcb90>] ? cpuidle_enter_state+0x90/0x2d0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.571948]  [<ffffffffa66be294>] ? cpu_startup_entry+0x154/0x240
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.578200]  [<ffffffffa664aa50>] ? start_secondary+0x170/0x1b0
Jan 31 01:40:24 ganeti3002 nc.openbsd[14771]: [5583163.584276] ---[ end trace cda9b97aff5419bf ]---
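
Both crash captures above stop at the same place (RIP at pskb_expand_head+0x22b/0x230, reached from ip6_frag_expire). To make that kind of comparison quicker for future captures, here is a minimal sketch that reduces a netconsole log to its BUG/WARNING events. The `summarize` helper and its regexes are illustrative assumptions based only on the line shapes pasted in this task, not an existing tool. Note that the syslog hostname in these lines is the netconsole receiver (ganeti3002), so the crashing host still has to be tracked separately.

```python
import os
import re

# Patterns written against the 4.9 kernel output pasted in this task;
# other kernel versions may format these lines differently.
BUG = re.compile(r"kernel BUG at (\S+?):(\d+)!")
WARN = re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+?):(\d+) (\S+)")
RIP = re.compile(r"RIP: [0-9a-f]{4}:\[<[0-9a-f]+>\]\s+\[<[0-9a-f]+>\] (\S+)")


def summarize(lines):
    """Return one dict per BUG/WARNING event found in netconsole log lines."""
    events = []
    for line in lines:
        m = BUG.search(line)
        if m:
            # The faulting symbol is only known once the RIP line shows up.
            events.append({"kind": "BUG",
                           "file": os.path.basename(m.group(1)),
                           "line": int(m.group(2)),
                           "symbol": None})
            continue
        m = WARN.search(line)
        if m:
            events.append({"kind": "WARNING",
                           "file": os.path.basename(m.group(1)),
                           "line": int(m.group(2)),
                           "symbol": m.group(3).split("+")[0]})
            continue
        m = RIP.search(line)
        if m and events and events[-1]["kind"] == "BUG" and events[-1]["symbol"] is None:
            # Attach the symbol (without the +offset/size suffix) to the pending BUG.
            events[-1]["symbol"] = m.group(1).split("+")[0]
    return events
```

Feeding it the lines of a saved capture (for example `summarize(open(path))`, with `path` being wherever the netconsole output is stored) gives a short list that can be compared across hosts.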

We also captured two warnings. Neither was followed by a crash; both hosts are still running.

cp3061:

Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696388] ------------[ cut here ]------------
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696395] WARNING: CPU: 16 PID: 1 at /build/linux-sdMcHj/linux-4.9.189/net/core/netpoll.c:171 netpoll_poll_dev+0x197/0x1a0
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696400] bnxt_poll+0x0/0xd0 [bnxt_en] exceeded budget in poll
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696438] Modules linked in: nfnetlink_queue nfnetlink_log nfnetlink netconsole configfs unix_diag binfmt_misc cpufreq_userspace cpufreq_powersave cpufreq_conse
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696446]  lrw gf128mul ablk_helper xhci_hcd cryptd nvme libata nvme_core bnxt_en i2c_i801 usbcore i2c_smbus scsi_mod usb_common [last unloaded: netconsole]
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696449] CPU: 16 PID: 1 Comm: systemd Not tainted 4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u1
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696450] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 2.2.11 06/14/2019
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696452]  0000000000000000 ffffffff8b1353d4 ffffabbfc01ffa90 0000000000000000
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696454]  ffffffff8ae7a83b ffff92bbdf94a060 ffffabbfc01ffae8 ffff92dc38755848
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696455]  0000000000000001 ffff92b1da8b3168 ffff92bbe14a8600 ffffffff8ae7a8bf
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696456] Call Trace:
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696463]  [<ffffffff8b1353d4>] ? dump_stack+0x5c/0x78
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696465]  [<ffffffff8ae7a83b>] ? __warn+0xcb/0xf0
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696467]  [<ffffffff8ae7a8bf>] ? warn_slowpath_fmt+0x5f/0x80
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696469]  [<ffffffffc027bfdf>] ? bnxt_poll+0x7f/0xd0 [bnxt_en]
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696471]  [<ffffffffc027bf60>] ? bnxt_poll_work+0x520/0x520 [bnxt_en]
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696472]  [<ffffffff8b333af7>] ? netpoll_poll_dev+0x197/0x1a0
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696474]  [<ffffffff8b333c05>] ? netpoll_send_skb_on_dev+0x105/0x270
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696475]  [<ffffffff8b33405c>] ? netpoll_send_udp+0x2ec/0x450
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696480]  [<ffffffffc03e8bb5>] ? write_msg+0xb5/0xf0 [netconsole]
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696486]  [<ffffffff8aed2081>] ? call_console_drivers.isra.18.constprop.25+0xf1/0x100
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696488]  [<ffffffff8aed2890>] ? console_unlock+0x240/0x610
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696489]  [<ffffffff8aed2f76>] ? vprintk_emit+0x316/0x4d0
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696493]  [<ffffffff8af81e83>] ? printk_emit+0x42/0x5e
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696496]  [<ffffffff8b13ec8b>] ? simple_strtoull+0x3b/0x70
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696498]  [<ffffffff8aed3244>] ? devkmsg_write+0x114/0x170
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696502]  [<ffffffff8b00b1cb>] ? do_iter_readv_writev+0xbb/0x140
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696504]  [<ffffffff8b00c75e>] ? do_readv_writev+0x19e/0x240
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696506]  [<ffffffff8b00cab6>] ? do_writev+0x66/0x110
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696509]  [<ffffffff8ae03b7d>] ? do_syscall_64+0x8d/0x100
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696513]  [<ffffffff8b41c3ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Jan 22 17:07:32 ganeti3002 nc.openbsd[14771]: [204596.696514] ---[ end trace a1ec19133ac0df5c ]---

And on cp3059:

Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758692] ------------[ cut here ]------------
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758698] WARNING: CPU: 22 PID: 0 at /build/linux-sdMcHj/linux-4.9.189/net/core/netpoll.c:171 netpoll_poll_dev+0x197/0x1a0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758704] bnxt_poll+0x0/0xd0 [bnxt_en] exceeded budget in poll
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758745] Modules linked in: netconsole configfs sctp_diag sctp tcp_diag udp_diag inet_diag unix_diag binfmt_misc intel_rapl cpufreq_conservative cpufreq_users
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758751]  ablk_helper xhci_pci cryptd nvme libata xhci_hcd i2c_i801 nvme_core bnxt_en i2c_smbus usbcore scsi_mod usb_common
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758754] CPU: 22 PID: 0 Comm: swapper/22 Not tainted 4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u1
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758755] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 2.2.11 06/14/2019
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758758]  0000000000000000 ffffffffbe1353d4 ffff932b7f4c37d8 0000000000000000
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758759]  ffffffffbde7a83b ffff932b64076060 ffff932b7f4c3830 ffff9342eddcab48
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758761]  000000000000000c ffff935b5ead3a68 ffff932b54443d00 ffffffffbde7a8bf
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758761] Call Trace:
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758767]  <IRQ>
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758767]  [<ffffffffbe1353d4>] ? dump_stack+0x5c/0x78
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758769]  [<ffffffffbde7a83b>] ? __warn+0xcb/0xf0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758771]  [<ffffffffbde7a8bf>] ? warn_slowpath_fmt+0x5f/0x80
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758773]  [<ffffffffc038ffdf>] ? bnxt_poll+0x7f/0xd0 [bnxt_en]
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758775]  [<ffffffffc038ff60>] ? bnxt_poll_work+0x520/0x520 [bnxt_en]
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758776]  [<ffffffffbe333af7>] ? netpoll_poll_dev+0x197/0x1a0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758778]  [<ffffffffbe333c05>] ? netpoll_send_skb_on_dev+0x105/0x270
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758779]  [<ffffffffbe33405c>] ? netpoll_send_udp+0x2ec/0x450
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758782]  [<ffffffffc0607bb5>] ? write_msg+0xb5/0xf0 [netconsole]
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758787]  [<ffffffffbded2081>] ? call_console_drivers.isra.18.constprop.25+0xf1/0x100
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758789]  [<ffffffffbded2890>] ? console_unlock+0x240/0x610
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758791]  [<ffffffffbded2f76>] ? vprintk_emit+0x316/0x4d0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758794]  [<ffffffffbdf81e25>] ? printk+0x5a/0x76
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758796]  [<ffffffffbdea4921>] ? get_nohz_timer_target+0x91/0xf0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758798]  [<ffffffffbe365d2f>] ? tcp_parse_options+0x2ff/0x320
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758799]  [<ffffffffbe368d98>] ? tcp_conn_request+0x1f8/0xb50
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758803]  [<ffffffffbdf61f36>] ? __bpf_prog_run+0xa76/0x1110
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758805]  [<ffffffffbdf61f36>] ? __bpf_prog_run+0xa76/0x1110
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758807]  [<ffffffffbe36e07b>] ? tcp_rcv_state_process+0x1cb/0xe30
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758809]  [<ffffffffbe32a02d>] ? sk_filter_trim_cap+0x2d/0x290
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758811]  [<ffffffffbe3788a7>] ? tcp_v4_do_rcv+0xa7/0x200
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758813]  [<ffffffffbe37a099>] ? tcp_v4_rcv+0x949/0x980
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758815]  [<ffffffffbe353007>] ? ip_local_deliver_finish+0x97/0x1c0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758817]  [<ffffffffbe3532cb>] ? ip_local_deliver+0x6b/0xf0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758818]  [<ffffffffbe352c09>] ? ip_rcv_finish+0xa9/0x410
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758819]  [<ffffffffbe3535e4>] ? ip_rcv+0x294/0x380
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758822]  [<ffffffffbe403e80>] ? packet_rcv+0x40/0x430
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758824]  [<ffffffffbe3114ad>] ? __netif_receive_skb_core+0x51d/0xa40
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758826]  [<ffffffffc038fbc5>] ? bnxt_poll_work+0x185/0x520 [bnxt_en]
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758828]  [<ffffffffbe370637>] ? tcp_schedule_loss_probe+0x17/0x1a0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758830]  [<ffffffffbe312b54>] ? process_backlog+0x84/0x130
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758831]  [<ffffffffbe3122c6>] ? net_rx_action+0x246/0x380
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758833]  [<ffffffffbe4200ad>] ? __do_softirq+0x10d/0x2b0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758835]  [<ffffffffbde80e22>] ? irq_exit+0xc2/0xd0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758837]  [<ffffffffbe41f137>] ? do_IRQ+0x57/0xe0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758838]  [<ffffffffbe41ccde>] ? common_interrupt+0x9e/0x9e
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758841]  <EOI>
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758841]  [<ffffffffbe2dcba2>] ? cpuidle_enter_state+0xa2/0x2d0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758843]  [<ffffffffbe2dcb90>] ? cpuidle_enter_state+0x90/0x2d0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758845]  [<ffffffffbdebe294>] ? cpu_startup_entry+0x154/0x240
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758847]  [<ffffffffbde4aa50>] ? start_secondary+0x170/0x1b0
Jan 27 05:50:49 ganeti3002 nc.openbsd[14771]: [8168053.758848] ---[ end trace 827763d59c726a09 ]---
ema added a comment. Edited Feb 3 2020, 9:41 AM

Looking at the source code from linux-source-4.9 4.9.189-3+deb9u2, the crash is at net/core/skbuff.c:1212 (see ema@boron.eqiad.wmnet:~/linux-source-4.9):

1185 /**
1186  *  pskb_expand_head - reallocate header of &sk_buff
1187  *  @skb: buffer to reallocate
1188  *  @nhead: room to add at head
1189  *  @ntail: room to add at tail
1190  *  @gfp_mask: allocation priority
1191  *
1192  *  Expands (or creates identical copy, if @nhead and @ntail are zero)
1193  *  header of @skb. &sk_buff itself is not changed. &sk_buff MUST have
1194  *  reference count of 1. Returns zero in the case of success or error,
1195  *  if expansion failed. In the last case, &sk_buff is not changed.
1196  *
1197  *  All the pointers pointing into skb header may change and must be
1198  *  reloaded after call to this function.
1199  */
1200 
1201 int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
1202              gfp_t gfp_mask)
1203 {
1204     int i;
1205     u8 *data;
1206     int size = nhead + skb_end_offset(skb) + ntail;
1207     long off;
1208 
1209     BUG_ON(nhead < 0);
1210 
1211     if (skb_shared(skb))
1212         BUG();

In turn, skb_shared() looks like this:

/**
 *  skb_shared - is the buffer shared
 *  @skb: buffer to check
 *
 *  Returns true if more than one person has a reference to this
 *  buffer.
 */
static inline int skb_shared(const struct sk_buff *skb)
{
    return atomic_read(&skb->users) != 1;
}
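
To make the failing condition concrete, here is a minimal userspace sketch of the check above. It is not kernel code: `struct mock_skb`, `mock_skb_shared()`, `mock_skb_get()`, `mock_expand_head_guard()` and `mock_demo()` are hypothetical stand-ins, and only the reference-count guard at the top of pskb_expand_head() is modelled, assuming the guard is the only part that matters for this BUG().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct sk_buff: only the refcount. */
struct mock_skb {
    atomic_int users;
};

/* Mirrors skb_shared(): true when the reference count is not exactly 1. */
static inline bool mock_skb_shared(struct mock_skb *skb)
{
    return atomic_load(&skb->users) != 1;
}

/* Models taking an extra reference, as skb_get() does in the kernel. */
static inline void mock_skb_get(struct mock_skb *skb)
{
    atomic_fetch_add(&skb->users, 1);
}

/*
 * Models only the guards at the top of pskb_expand_head(): returns -1 where
 * the kernel would hit BUG_ON()/BUG(), 0 where it would go on to reallocate.
 */
static inline int mock_expand_head_guard(struct mock_skb *skb, int nhead)
{
    if (nhead < 0)
        return -1;
    if (mock_skb_shared(skb))
        return -1;
    return 0;
}

/* Walks through both cases; returns 0 if the model behaves as described. */
static inline int mock_demo(void)
{
    struct mock_skb skb;

    atomic_init(&skb.users, 1);
    assert(mock_expand_head_guard(&skb, 0) == 0);   /* sole owner: allowed */

    mock_skb_get(&skb);                             /* second holder appears */
    assert(mock_expand_head_guard(&skb, 0) == -1);  /* shared: kernel would BUG() */
    return 0;
}
```

In other words, the BUG() at line 1212 fires whenever something else still holds a reference to the buffer at the moment its header is being reallocated.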

Mentioned in SAL (#wikimedia-operations) [2020-02-03T11:38:00Z] <ema> powercycle cp3057 T244127 T238305

ema updated the task description. Feb 3 2020, 11:50 AM

Mentioned in SAL (#wikimedia-operations) [2020-02-09T05:11:02Z] <cdanis> T238305 hardreset cp3051

faidon added a comment. Wed, Apr 1, 9:39 PM

What's the latest here? I haven't heard about these crashes lately but it may just be that I missed it. Do we know more about this now?

Also, it's great to hear that we have a traceback now! So it looks like it's something in the NIC driver, and most likely a kernel bug rather than a firmware bug. It looks like these are all from 4.9 kernels, which are a bit dated by now. Perhaps we're lucky and whatever this is, it's fixed in buster's 4.19?

Krinkle renamed this task from servers freeze across the caching cluster to Servers freezing across the caching cluster (November 2019). Wed, Apr 1, 10:52 PM

@faidon actually the cp hosts have been running buster (T242093) since February 13th. I believe we haven't seen any more occurrences of this issue on the cache cluster since the upgrade.

faidon added a comment. Thu, Apr 2, 7:35 PM

Ah! That's awesome to hear. May I suggest resolving this task (and the associated "upgrade firmware" one?) then, and reopening it if we see another one of these?