Paste P9042

cloudvirt1015 crashing after a new mainboard on 2019-09-04
Active, Public

Authored by Andrew on Sep 4 2019, 8:20 PM.
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.324098] INFO: rcu_sched detected stalls on CPUs/tasks:
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.330239] 39-...: (6 GPs behind) idle=5f1/140000000000000/0 softirq=35610/35612 fqs=2510
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.339659] (detected by 20, t=5256 jiffies, g=36760, c=36759, q=20573)
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347149] Task dump for CPU 39:
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347151] qemu-system-x86 R running task 0 9835 1 0x00000008
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347157] ffff9495e5826080 ffff9495ff0d8980 ffff9455dedd1100 0000000000000000
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347160] ffff9495e3744c00 ffffb091bd4339f0 ffffffff8b416a61 ffffffff8aeeb6f4
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347162] 0000000000000040 ffff9495ff0d8980 00ff9455dedd18d8 ffff9455dedd1100
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347165] Call Trace:
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347173] [<ffffffff8b416a61>] ? __schedule+0x241/0x6f0
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347177] [<ffffffff8aeeb6f4>] ? hrtimer_start_range_ns+0x194/0x360
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347179] [<ffffffff8aeeb117>] ? hrtimer_try_to_cancel+0x27/0x110
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347181] [<ffffffff8b41a795>] ? schedule_hrtimeout_range_clock+0xc5/0x1a0
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347183] [<ffffffff8aeeaee0>] ? __hrtimer_init+0xa0/0xa0
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347185] [<ffffffff8b41b542>] ? _raw_spin_lock_irqsave+0x32/0x40
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347189] [<ffffffff8aebd474>] ? remove_wait_queue+0x14/0x30
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347192] [<ffffffff8b02133d>] ? poll_freewait+0x3d/0xa0
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347194] [<ffffffff8b02293d>] ? do_sys_poll+0x37d/0x560
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347197] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347199] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347201] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347203] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347205] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347206] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347208] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347210] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347212] [<ffffffff8b0215b0>] ? poll_select_copy_remaining+0x150/0x150
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347213] [<ffffffff8b022e28>] ? SyS_ppoll+0x168/0x190
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347217] [<ffffffff8ae03b7d>] ? do_syscall_64+0x8d/0x100
Sep 4 20:11:50 cloudvirt1015 kernel: [ 1028.347219] [<ffffffff8b41b80e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Sep 4 20:12:01 cloudvirt1015 CRON[39719]: (prometheus) CMD (/usr/local/bin/prometheus-puppet-agent-stats --outfile /var/lib/prometheus/node.d/puppet_agent.prom)
Sep 4 20:12:19 cloudvirt1015 nova-compute[1997]: 2019-09-04 20:12:19.113 1997 INFO nova.compute.resource_tracker [req-a34525e4-366a-456d-b9e2-00af9318be44 - - - - -] Auditing locally available compute resources for node cloudvirt1015.eqiad.wmnet
Message from syslogd@cloudvirt1015 at Sep 4 20:12:19 ...
kernel:[ 1057.561642] NMI watchdog: BUG: soft lockup - CPU#69 stuck for 22s! [sshd:39668]
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.561642] NMI watchdog: BUG: soft lockup - CPU#69 stuck for 22s! [sshd:39668]
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569806] Modules linked in: ebt_arp ebt_among ip6table_raw ip6table_mangle nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat xt_connmark iptable_mangle xt_mac xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_comment xt_physdev xt_set xt_conntrack nf_conntrack ip_set_hash_net ip_set nfnetlink vhost_net vhost macvtap macvlan tun ip6table_filter ip6_tables iptable_filter ebtable_filter ebtables binfmt_misc iptable_raw 8021q garp mrp xfs intel_rapl sb_edac edac_core x86_pkg_temp_thermal iTCO_wdt mgag200 iTCO_vendor_support mxm_wmi dcdbas ttm intel_powerclamp drm_kms_helper coretemp drm kvm_intel i2c_algo_bit sg kvm irqbypass pcspkr crct10dif_pclmul evdev mei_me crc32_pclmul lpc_ich ghash_clmulni_intel mfd_core mei shpchp ipmi_si wmi button ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569858] libiscsi_tcp libiscsi scsi_transport_iscsi nbd ipmi_devintf ipmi_msghandler br_netfilter bridge stp llc ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto ecb mbcache btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 raid0 multipath linear md_mod dm_mod sd_mod ahci libahci ehci_pci aesni_intel aes_x86_64 glue_helper bnx2x tg3 lrw gf128mul ehci_hcd ablk_helper ptp cryptd pps_core libata megaraid_sas mdio libcrc32c usbcore crc32c_generic usb_common libphy scsi_mod crc32c_intel
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569899] CPU: 69 PID: 39668 Comm: sshd Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1+deb9u5
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569900] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.8.0 005/17/2018
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569902] task: ffff9449256f0100 task.stack: ffffb0919b7d8000
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569903] RIP: 0010:[<ffffffff8aefee20>] [<ffffffff8aefee20>] smp_call_function_many+0x1f0/0x250
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569913] RSP: 0018:ffffb0919b7dbbe0 EFLAGS: 00000202
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569914] RAX: 0000000000000003 RBX: 0000000000000200 RCX: 0000000000000027
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569915] RDX: ffffd0913f8c3da0 RSI: 0000000000000200 RDI: ffff9495ff4998c8
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569916] RBP: ffff9495ff4998c8 R08: ffffff8000000000 R09: ffffffffffffffff
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569917] R10: 0000000000000008 R11: 0000000000000010 R12: ffff9495ff4998c0
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569918] R13: ffffffff8ae696f0 R14: 0000000000000000 R15: 0000000000000001
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569921] FS: 00007f4bb07818c0(0000) GS:ffff9495ff480000(0000) knlGS:0000000000000000
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569922] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569923] CR2: 00007ffd9ece5f0e CR3: 0000003847f9c000 CR4: 0000000000362670
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569924] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569925] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569925] Stack:
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569927] 0000000000019880 000000019b7dbc88 ffff945478341900 ffffffff8ae696f0
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569929] 0000000000000000 ffffb0919b7dbcd8 ffffb0919b7dbce0 ffff9495ff4d6d08
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569932] ffffffff8aeff0b8 ffff945478341900 ffffb0919b7dbc88 ffff9495ff4d6cd8
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569934] Call Trace:
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569942] [<ffffffff8ae696f0>] ? leave_mm+0xa0/0xa0
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569944] [<ffffffff8aeff0b8>] ? on_each_cpu+0x28/0x60
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569946] [<ffffffff8ae69e38>] ? flush_tlb_kernel_range+0x48/0x90
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569949] [<ffffffff8afca516>] ? __purge_vmap_area_lazy+0x66/0x300
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569950] [<ffffffff8afca764>] ? __purge_vmap_area_lazy+0x2b4/0x300
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569952] [<ffffffff8afca8c4>] ? vm_unmap_aliases+0x114/0x140
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569954] [<ffffffff8ae65313>] ? change_page_attr_set_clr+0xe3/0x4a0
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569957] [<ffffffff8b32ab99>] ? bpf_convert_filter+0x9a9/0xa00
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569959] [<ffffffff8ae6623d>] ? set_memory_ro+0x2d/0x40
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569964] [<ffffffff8af60e35>] ? bpf_prog_select_runtime+0x25/0xd0
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569966] [<ffffffff8b32ceae>] ? bpf_prepare_filter+0x38e/0x410
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569972] [<ffffffff8afa5272>] ? kmemdup+0x32/0x40
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569974] [<ffffffff8b32d08c>] ? bpf_prog_create_from_user+0xbc/0x110
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569977] [<ffffffff8af2bfc0>] ? watchdog_nmi_disable+0x60/0x60
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569979] [<ffffffff8af2c571>] ? do_seccomp+0x121/0x620
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569981] [<ffffffff8ae902bc>] ? SyS_prctl+0x15c/0x490
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569984] [<ffffffff8ae03b7d>] ? do_syscall_64+0x8d/0x100
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569989] [<ffffffff8b41b80e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
Sep 4 20:12:19 cloudvirt1015 kernel: [ 1057.569990] Code: 48 63 d2 e8 53 ca 24 00 3b 05 61 ac c1 00 89 c1 0f 8d 93 fe ff ff 48 98 49 8b 14 24 48 03 14 c5 00 f4 86 8b 8b 42 18 a8 01 74 09 <f3> 90 8b 42 18 a8 01 75 f7 eb bf 0f b6 4c 24 0c 48 83 c4 10 4c
Message from syslogd@cloudvirt1015 at Sep 4 20:12:27 ...
kernel:[ 1065.166255] NMI watchdog: BUG: soft lockup - CPU#20 stuck for 23s! [kworker/20:1:499]
Timeout, server cloudvirt1015.eqiad.wmnet not responding.