
ms-be1062 fell off the network, causing swift timeouts
Open, High, Public

Description

ms-be1062 could no longer be reached over the network. I found the following backtrace on the serial console:

[3948961.254544] ------------[ cut here ]------------
[3948961.259334] kernel BUG at /build/linux-oA5nb9/linux-4.9.258/lib/swiotlb.c:470!
[3948961.266734] invalid opcode: 0000 [#1] SMP
[3948961.270908] Modules linked in: binfmt_misc ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_CT iptable_raw xt_NFLOG xt_limit xt_tcpudp xt_pkttype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack iptable_filter nfnetlink_log nfnetlink xfs intel_rapl skx_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp sparse_keymap dell_smbios kvm video dcdbas irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mgag200 ttm drm_kms_helper sg drm i2c_algo_bit pcspkr lpc_ich mei_me iTCO_wdt evdev iTCO_vendor_support mfd_core shpchp mei wmi ipmi_si button ipmi_devintf ipmi_msghandler nf_conntrack ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto ecb mbcache sd_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic
[3948961.343165]  raid1 raid0 multipath linear md_mod ahci crc32c_intel libahci xhci_pci aesni_intel aes_x86_64 glue_helper lrw gf128mul tg3 megaraid_sas ablk_helper xhci_hcd ptp libata cryptd bnxt_en pps_core i2c_i801 usbcore i2c_smbus libphy scsi_mod usb_common
[3948961.366775] CPU: 2 PID: 224147 Comm: ethtool Not tainted 4.9.0-15-amd64 #1 Debian 4.9.258-1
[3948961.375271] Hardware name: Dell Inc. PowerEdge R740xd2/0C2PJH, BIOS 2.9.3 09/23/2020
[3948961.383165] task: ffff97ea03ec6500 task.stack: ffffbd8eff238000
[3948961.389243] RIP: 0010:[<ffffffffa1b60ee0>]  [<ffffffffa1b60ee0>] swiotlb_tbl_map_single+0x290/0x2a0
[3948961.398459] RSP: 0018:ffffbd8eff23ba10  EFLAGS: 00010246
[3948961.403934] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[3948961.411223] RDX: 0000000000000000 RSI: 00000000643ff000 RDI: 0000000000000000
[3948961.418511] RBP: 00000000000c87fe R08: 0000000000000002 R09: fffffffffff22a14
[3948961.425802] R10: 0000000000000000 R11: 0000000000000002 R12: ffffffffffffffff
[3948961.433089] R13: 0000000000000001 R14: 0000000000200000 R15: ffffffffffffffff
[3948961.440378] FS:  00007fc7fc01c700(0000) GS:ffff97eebe040000(0000) knlGS:0000000000000000
[3948961.448617] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3948961.454523] CR2: 00007fc7fb8578a0 CR3: 0000001abe95c000 CR4: 0000000000760670
[3948961.461812] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[3948961.469100] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[3948961.476389] PKRU: 55555554
[3948961.479268] Stack:
[3948961.481456]  0000000000000000 ffff97eea23b70a0 ffff97ea03ec6500 0000000000000000
[3948961.489083]  0000000000000000 00000002a19dfff1 ffff97eea23b70a0 0000000000000000
[3948961.496709]  ffffffffffffffff 0000000000000000 ffff97eea23b70a0 ffffffffffffffff
[3948961.504324] Call Trace:
[3948961.506941]  [<ffffffffa1b610ad>] ? map_single+0x2d/0x80
[3948961.512414]  [<ffffffffa1b61932>] ? swiotlb_alloc_coherent+0xd2/0x150
[3948961.519021]  [<ffffffffc01e11da>] ? bnxt_get_nvram_item+0xda/0x220 [bnxt_en]
[3948961.526237]  [<ffffffffc01e1769>] ? bnxt_get_drvinfo+0x109/0x2f0 [bnxt_en]
[3948961.533268]  [<ffffffffa1d17db3>] ? ethtool_get_drvinfo+0x83/0x1c0
[3948961.539621]  [<ffffffffa1d1af00>] ? dev_ethtool+0x16c0/0x21d0
[3948961.545527]  [<ffffffffa1a02bfe>] ? memcg_kmem_charge_memcg+0x8e/0xc0
[3948961.552134]  [<ffffffffa1d2f656>] ? dev_ioctl+0x186/0x5d0
[3948961.557692]  [<ffffffffa19e7d57>] ? cache_grow_end+0xa7/0xc0
[3948961.563512]  [<ffffffffa19ea6fc>] ? kmem_cache_alloc+0x11c/0x530
[3948961.569678]  [<ffffffffa1cf24c1>] ? sock_do_ioctl+0x41/0x50
[3948961.575408]  [<ffffffffa1cf29bb>] ? sock_ioctl+0x1cb/0x290
[3948961.581056]  [<ffffffffa1a21832>] ? do_vfs_ioctl+0xa2/0x620
[3948961.586797]  [<ffffffffa1a2cf0c>] ? __fd_install+0x2c/0xc0
[3948961.592449]  [<ffffffffa1cf2367>] ? sock_alloc_file+0xa7/0x140
[3948961.598443]  [<ffffffffa1a21e24>] ? SyS_ioctl+0x74/0x80
[3948961.603831]  [<ffffffffa1803b7d>] ? do_syscall_64+0x8d/0x100
[3948961.609652]  [<ffffffffa1e2104e>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6
[3948961.616762] Code: 48 8d 3c 03 48 01 c6 e8 cf 3b fe ff e9 c4 fe ff ff 48 8b 14 24 48 8b 7c 24 08 48 c7 c6 88 97 22 a2 e8 05 18 12 00 e9 aa fe ff ff <0f> 0b 48 c7 c7 f8 96 22 a2 e8 67 fc 2a 00 66 90 41 54 55 48 8d 
[3948961.637229] RIP  [<ffffffffa1b60ee0>] swiotlb_tbl_map_single+0x290/0x2a0
[3948961.644113]  RSP <ffffbd8eff23ba10>
[3948961.652129] ---[ end trace 7077c758da3eff7a ]---
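For triage, the call path can be pulled out of console output like the above with a short script. This is a minimal sketch and was not run on the host: `FRAME_RE`, `parse_frames` and `SAMPLE` are hypothetical names, and the regular expression assumes the frame format seen in this 4.9 trace (`[<address>] ? symbol+0xoff/0xlen [module]`). `SAMPLE` holds a few lines copied from the backtrace.

```python
import re

# Matches call-trace frames such as:
#   [3948961.519021]  [<ffffffffc01e11da>] ? bnxt_get_nvram_item+0xda/0x220 [bnxt_en]
# The leading "?" marks a frame the kernel's unwinder could not verify.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?(?P<symbol>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(log_text):
    """Return (symbol, module, unverified) tuples for frames after 'Call Trace:'."""
    frames = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first non-frame line (e.g. "Code: ...") ends the trace
        frames.append(
            (match.group("symbol"), match.group("module"), match.group("guess") is not None)
        )
    return frames


# A few lines copied from the backtrace in this task.
SAMPLE = """\
[3948961.504324] Call Trace:
[3948961.506941]  [<ffffffffa1b610ad>] ? map_single+0x2d/0x80
[3948961.512414]  [<ffffffffa1b61932>] ? swiotlb_alloc_coherent+0xd2/0x150
[3948961.519021]  [<ffffffffc01e11da>] ? bnxt_get_nvram_item+0xda/0x220 [bnxt_en]
[3948961.526237]  [<ffffffffc01e1769>] ? bnxt_get_drvinfo+0x109/0x2f0 [bnxt_en]
[3948961.533268]  [<ffffffffa1d17db3>] ? ethtool_get_drvinfo+0x83/0x1c0
[3948961.616762] Code: 48 8d 3c 03
"""

if __name__ == "__main__":
    for symbol, module, unverified in parse_frames(SAMPLE):
        print(symbol, module or "(core kernel)", "unverified" if unverified else "")
```

Filtering the result on the module column shows which loadable driver appears in the path; in the trace above the only module frames belong to `bnxt_en`, reached from the `ethtool` drvinfo ioctl.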