At around 2023-12-14T07:39:02, the Cassandra instance restbase2029-a was killed by the kernel (OOM). It was eventually restarted by Puppet and returned to service at approximately 2023-12-14T07:56:54.
This host was recently added as part of a (currently ongoing) refresh; it has only been online a few days.
1 | [Dec14 07:36] ReadStage-2 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0 |
---|---|
2 | [ +0.000007] CPU: 45 PID: 1947633 Comm: ReadStage-2 Not tainted 5.10.0-26-amd64 #1 Debian 5.10.197-1 |
3 | [ +0.000002] Hardware name: Dell Inc. PowerEdge R450/073H50, BIOS 1.11.2 08/10/2023 |
4 | [ +0.000001] Call Trace: |
5 | [ +0.000012] dump_stack+0x6b/0x83 |
6 | [ +0.000005] dump_header+0x4a/0x1f4 |
7 | [ +0.000002] oom_kill_process.cold+0xb/0x10 |
8 | [ +0.000009] out_of_memory+0x1bd/0x4e0 |
9 | [ +0.000006] __alloc_pages_slowpath.constprop.0+0xbcc/0xc90 |
10 | [ +0.000003] __alloc_pages_nodemask+0x2de/0x310 |
11 | [ +0.000005] alloc_page_interleave+0x13/0x70 |
12 | [ +0.000004] pagecache_get_page+0x175/0x390 |
13 | [ +0.000002] filemap_fault+0x6a2/0x900 |
14 | [ +0.000006] ? xas_load+0x5/0x80 |
15 | [ +0.000049] ext4_filemap_fault+0x2d/0x50 [ext4] |
16 | [ +0.000004] __do_fault+0x34/0x170 |
17 | [ +0.000002] handle_mm_fault+0x124d/0x1c00 |
18 | [ +0.000007] do_user_addr_fault+0x1b8/0x400 |
19 | [ +0.000006] exc_page_fault+0x78/0x160 |
20 | [ +0.000007] ? asm_exc_page_fault+0x8/0x30 |
21 | [ +0.000002] asm_exc_page_fault+0x1e/0x30 |
22 | [ +0.000004] RIP: 0033:0x7f4173fe6228 |
23 | [ +0.000004] Code: 66 90 89 84 24 00 c0 fe ff 55 48 83 ec 30 44 8b 56 18 44 8b 46 1c 45 2b c2 41 83 f8 02 7c 2d 4c 8b 5e 10 0f b6 6e 2a 4d 63 c2 <43> 0f bf 04 03 41 83 c2 02 44 89 56 18 85 ed 75 34 0f c8 c1 f8 10 |
24 | [ +0.000001] RSP: 002b:00007f415befbba0 EFLAGS: 00010206 |
25 | [ +0.000003] RAX: 00000007c04b7fa0 RBX: 00007f40fa09b1a0 RCX: 000000000000003c |
26 | [ +0.000001] RDX: 00000000ffffffe0 RSI: 000000068c7955f0 RDI: 00007f414aa63caa |
27 | [ +0.000001] RBP: 0000000000000000 R08: 00000000001bea73 R09: 000000068c7955f0 |
28 | [ +0.000001] R10: 00000000001bea73 R11: 00007f3e6753a593 R12: 0000000000000000 |
29 | [ +0.000002] R13: 00000000ffffffe0 R14: 000000068c7955b8 R15: 00007f40ebe78000 |
30 | [ +0.000002] Mem-Info: |
31 | [ +0.000013] active_anon:8992390 inactive_anon:11254972 isolated_anon:0 |
32 | active_file:7441 inactive_file:648 isolated_file:0 |
33 | unevictable:10658092 dirty:0 writeback:52 |
34 | slab_reclaimable:639587 slab_unreclaimable:168059 |
35 | mapped:14226 shmem:42840 pagetables:792145 bounce:0 |
36 | free:189277 free_pcp:393 free_cma:0 |
37 | [ +0.000003] Node 0 active_anon:6454492kB inactive_anon:33721376kB active_file:11844kB inactive_file:2600kB unevictable:21380872kB isolated(anon):0kB isolated(file):0kB mapped:31048kB dirty:0kB writeback:124kB shmem:93424kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 12144640kB writeback_tmp:0kB kernel_stack:35136kB all_unreclaimable? no |
38 | [ +0.000003] Node 1 active_anon:29515068kB inactive_anon:11298512kB active_file:17920kB inactive_file:0kB unevictable:21251496kB isolated(anon):0kB isolated(file):0kB mapped:25856kB dirty:0kB writeback:84kB shmem:77936kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 9302016kB writeback_tmp:0kB kernel_stack:26880kB all_unreclaimable? no |
39 | [ +0.000003] Node 0 DMA free:11800kB min:8kB low:20kB high:32kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15980kB managed:15896kB mlocked:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB |
40 | [ +0.000003] lowmem_reserve[]: 0 1384 63846 63846 63846 |
41 | [ +0.000005] Node 0 DMA32 free:252856kB min:972kB low:2388kB high:3804kB reserved_highatomic:2048KB active_anon:11660kB inactive_anon:877124kB active_file:8kB inactive_file:0kB unevictable:197368kB writepending:0kB present:1519588kB managed:1454052kB mlocked:197368kB pagetables:1916kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB |
42 | [ +0.000003] lowmem_reserve[]: 0 0 62461 62461 62461 |
43 | [ +0.000005] Node 0 Normal free:197508kB min:240472kB low:304432kB high:368392kB reserved_highatomic:2048KB active_anon:6442832kB inactive_anon:32844252kB active_file:11840kB inactive_file:3112kB unevictable:21183504kB writepending:0kB present:65011712kB managed:63960996kB mlocked:21183504kB pagetables:1500212kB bounce:0kB free_pcp:792kB local_pcp:144kB free_cma:0kB |
44 | [ +0.000004] lowmem_reserve[]: 0 0 0 0 0 |
45 | [ +0.000008] Node 1 Normal free:294944kB min:45260kB low:111260kB high:177260kB reserved_highatomic:2048KB active_anon:29515068kB inactive_anon:11298512kB active_file:17920kB inactive_file:0kB unevictable:21251496kB writepending:36kB present:67108864kB managed:66007432kB mlocked:21251496kB pagetables:1666452kB bounce:0kB free_pcp:864kB local_pcp:60kB free_cma:0kB |
46 | [ +0.000004] lowmem_reserve[]: 0 0 0 0 0 |
47 | [ +0.000003] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11800kB |
48 | [ +0.000013] Node 0 DMA32: 1731*4kB (UMEH) 1777*8kB (UMEH) 669*16kB (UMEH) 642*32kB (UMEH) 372*64kB (UMEH) 187*128kB (UMEH) 121*256kB (UMEH) 70*512kB (UME) 84*1024kB (UM) 0*2048kB 0*4096kB = 252964kB |
49 | [ +0.000013] Node 0 Normal: 1249*4kB (UMEH) 11110*8kB (UMEH) 4343*16kB (UEH) 1287*32kB (UEH) 1*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 204612kB |
50 | [ +0.000007] Node 1 Normal: 6274*4kB (UMEH) 16185*8kB (UMEH) 6598*16kB (UEH) 1205*32kB (UEH) 3*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 298896kB |
51 | [ +0.000022] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB |
52 | [ +0.000001] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB |
53 | [ +0.000001] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB |
54 | [ +0.000001] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB |
55 | [ +0.000000] 65063 total pagecache pages |
56 | [ +0.000003] 3881 pages in swap cache |
57 | [ +0.000002] Swap cache stats: add 1024128, delete 1020326, find 528702/716598 |
58 | [ +0.000001] Free swap = 0kB |
59 | [ +0.000001] Total swap = 975868kB |
60 | [ +0.000001] 33414036 pages RAM |
61 | [ +0.000000] 0 pages HighMem/MovableOnly |
62 | [ +0.000001] 554442 pages reserved |
63 | [ +0.000001] 0 pages hwpoisoned |
64 | [ +0.000001] Tasks state (memory values in pages): |
65 | [ +0.000001] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name |
66 | [ +0.000058] [ 827] 0 827 20850 827 184320 120 -250 systemd-journal |
67 | [ +0.000004] [ 849] 0 849 5563 757 65536 76 -1000 systemd-udevd |
68 | [ +0.000005] [ 1074] 0 1074 842 205 45056 17 0 mdadm |
69 | [ +0.000004] [ 1116] 105 1116 22109 256 77824 2 0 systemd-timesyn |
70 | [ +0.000002] [ 1162] 0 1162 1496216 12353 888832 3477 0 cadvisor |
71 | [ +0.000002] [ 1169] 0 1169 1686 565 49152 1 0 cron |
72 | [ +0.000003] [ 1175] 104 1175 2098 496 53248 6 -900 dbus-daemon |
73 | [ +0.000003] [ 1189] 0 1189 21215 189 61440 237 0 ipmiseld |
74 | [ +0.000002] [ 1191] 106 1191 4651 776 73728 4 0 lldpd |
75 | [ +0.000003] [ 1194] 110 1194 43741 3522 102400 48 0 python3 |
76 | [ +0.000002] [ 1196] 106 1196 4651 241 69632 4 0 lldpd |
77 | [ +0.000003] [ 1197] 110 1197 955619 4184 528384 3604 0 prometheus-ipmi |
78 | [ +0.000002] [ 1200] 0 1200 1369 461 53248 1 0 rasdaemon |
79 | [ +0.000003] [ 1203] 0 1203 2953 759 65536 7 0 smartd |
80 | [ +0.000003] [ 1213] 0 1213 3634 561 69632 10 0 systemd-logind |
81 | [ +0.000003] [ 1218] 113 1218 1486 516 53248 1 0 ulogd |
82 | [ +0.000002] [ 1235] 0 1235 3338 674 65536 3 -1000 sshd |
83 | [ +0.000003] [ 1255] 0 1255 1461 391 53248 0 0 agetty |
84 | [ +0.000002] [ 1304] 0 1304 1369 504 49152 32 0 agetty |
85 | [ +0.000003] [ 1349] 109 1349 4650 301 77824 3 0 exim4 |
86 | [ +0.000002] [ 1352] 0 1352 3854 521 73728 7 0 systemd |
87 | [ +0.000002] [ 1355] 0 1355 41818 619 98304 178 0 (sd-pam) |
88 | [ +0.000003] [ 1479] 0 1479 1418533 7042 761856 100 0 confd |
89 | [ +0.000002] [ 2263] 499 2263 3855 542 65536 2 0 systemd |
90 | [ +0.000003] [ 2264] 499 2264 41818 598 102400 199 0 (sd-pam) |
91 | [ +0.000004] [ 215141] 11774 215141 3856 135 65536 221 0 systemd |
92 | [ +0.000002] [ 215143] 11774 215143 41856 332 102400 524 0 (sd-pam) |
93 | [ +0.000003] [1103991] 110 1103991 2432582 13081 1323008 3327 0 prometheus-node |
94 | [ +0.000005] [1523632] 114 1523632 4102 940 61440 184 0 python3 |
95 | [ +0.000003] [1523638] 114 1523638 836258 53113 1163264 380 0 envoy |
96 | [ +0.000003] [1523654] 0 1523654 867249 4036 487424 191 0 rsyslogd |
97 | [ +0.000002] [1527089] 0 1527089 2161 600 57344 3 -500 nrpe |
98 | [ +0.000027] [1530808] 498 1530808 1387 439 49152 104 0 firejail |
99 | [ +0.000003] [1530810] 498 1530810 1390 453 49152 116 0 firejail |
100 | [ +0.000004] [1530830] 498 1530830 230376 11076 1785856 1507 0 nodejs |
101 | [ +0.000003] [1535456] 115 1535456 187368136 5349014 1055092736 112166 0 java |
102 | [ +0.000003] [1947254] 115 1947254 165026957 4999859 743841792 32206 0 java |
103 | [ +0.000010] [2167712] 498 2167712 466424 245266 10825728 733 0 node |
104 | [ +0.000003] [2167728] 498 2167728 471943 256217 11067392 1133 0 node |
105 | [ +0.000002] [2167825] 498 2167825 466250 246813 11055104 498 0 node |
106 | [ +0.000002] [2167829] 498 2167829 491933 263580 11350016 7403 0 node |
107 | [ +0.000003] [2167845] 498 2167845 464195 246454 10776576 210 0 node |
108 | [ +0.000009] [2168000] 498 2168000 471585 248828 11059200 715 0 node |
109 | [ +0.000015] [2168010] 498 2168010 433672 213794 10686464 865 0 node |
110 | [ +0.000009] [2168022] 498 2168022 517528 287282 11493376 635 0 node |
111 | [ +0.000003] [2168131] 498 2168131 500395 283896 11231232 712 0 node |
112 | [ +0.000002] [2168143] 498 2168143 469079 248697 11071488 1471 0 node |
113 | [ +0.000002] [2168158] 498 2168158 507745 280501 11272192 233 0 node |
114 | [ +0.000002] [2168219] 498 2168219 472406 253401 11157504 984 0 node |
115 | [ +0.000006] [2168226] 498 2168226 507261 290541 11296768 1221 0 node |
116 | [ +0.000004] [2168243] 498 2168243 506479 276106 11288576 1691 0 node |
117 | [ +0.000006] [2168297] 498 2168297 487157 257163 11251712 1668 0 node |
118 | [ +0.000005] [2168350] 498 2168350 485062 260971 10866688 326 0 node |
119 | [ +0.000009] [2168380] 498 2168380 481668 256102 10903552 3101 0 node |
120 | [ +0.000009] [2169963] 115 2169963 149513155 4867441 767066112 15938 0 java |
121 | [ +0.000009] [2171928] 498 2171928 504180 272276 11350016 4197 0 node |
122 | [ +0.000010] [2177957] 498 2177957 477712 250209 10760192 2352 0 node |
123 | [ +0.000009] [2197191] 498 2197191 449542 229138 10465280 2455 0 node |
124 | [ +0.000011] [2201841] 498 2201841 513776 293486 11198464 1005 0 node |
125 | [ +0.000011] [2202814] 498 2202814 459962 239748 10698752 1618 0 node |
126 | [ +0.000011] [2637305] 498 2637305 482167 262073 10911744 296 0 node |
127 | [ +0.000009] [2637306] 498 2637306 466387 238854 10727424 2777 0 node |
128 | [ +0.000007] [2637325] 498 2637325 456391 233586 10719232 1151 0 node |
129 | [ +0.000003] [2637327] 498 2637327 475114 244669 10993664 440 0 node |
130 | [ +0.000003] [2637340] 498 2637340 464500 244244 10395648 1763 0 node |
131 | [ +0.000003] [2637358] 498 2637358 480203 255528 10993664 284 0 node |
132 | [ +0.000002] [2637368] 498 2637368 492600 266495 11018240 1419 0 node |
133 | [ +0.000003] [2637376] 498 2637376 722106 481947 20185088 1706 0 node |
134 | [ +0.000002] [2637386] 498 2637386 460792 238764 10747904 1087 0 node |
135 | [ +0.000003] [2637388] 498 2637388 510913 286403 11264000 722 0 node |
136 | [ +0.000002] [2637401] 498 2637401 485055 262631 11202560 2135 0 node |
137 | [ +0.000003] [2637418] 498 2637418 454320 241039 10993664 126 0 node |
138 | [ +0.000003] [2637423] 498 2637423 527441 304036 11612160 63 0 node |
139 | [ +0.000003] [2637449] 498 2637449 485716 268196 11272192 868 0 node |
140 | [ +0.000010] [2637486] 498 2637486 473869 257401 11182080 537 0 node |
141 | [ +0.000009] [2637507] 498 2637507 468617 243185 10973184 749 0 node |
142 | [ +0.000008] [2637549] 498 2637549 466817 239098 11083776 694 0 node |
143 | [ +0.000008] [2637592] 498 2637592 466898 246853 10907648 220 0 node |
144 | [ +0.000007] [2637605] 498 2637605 469777 245670 11014144 1515 0 node |
145 | [ +0.000009] [2637607] 498 2637607 473757 256467 11001856 285 0 node |
146 | [ +0.000004] [2854469] 498 2854469 492895 274333 11292672 70 0 node |
147 | [ +0.000003] [2904938] 498 2904938 479045 255341 11194368 244 0 node |
148 | [ +0.000005] [2988539] 498 2988539 440872 217748 10661888 540 0 node |
149 | [ +0.000007] [3042631] 498 3042631 464942 240412 10903552 218 0 node |
150 | [ +0.000009] [3066068] 498 3066068 435791 212620 10817536 352 0 node |
151 | [ +0.000003] [3093511] 498 3093511 469193 250839 10895360 968 0 node |
152 | [ +0.000009] [3100079] 498 3100079 452255 241082 10543104 309 0 node |
153 | [ +0.000007] [3103226] 498 3103226 423421 210322 10588160 154 0 node |
154 | [ +0.000009] [3276729] 498 3276729 436932 215825 10498048 357 0 node |
155 | [ +0.000008] [3311694] 498 3311694 430201 209912 10661888 88 0 node |
156 | [ +0.000009] [3343248] 498 3343248 451057 232825 10330112 33 0 node |
157 | [ +0.000004] [3365411] 498 3365411 438962 215973 10571776 17 0 node |
158 | [ +0.000003] [3365422] 498 3365422 443413 214794 10661888 6359 0 node |
159 | [ +0.000003] [3439458] 498 3439458 412234 197818 10629120 185 0 node |
160 | [ +0.000003] [3688560] 498 3688560 472065 258495 10797056 119 0 node |
161 | [ +0.000002] [3743542] 498 3743542 395875 179137 10059776 90 0 node |
162 | [ +0.000003] [3822927] 498 3822927 372708 153158 10166272 46 0 node |
163 | [ +0.000002] [3822959] 498 3822959 371737 150981 10051584 448 0 node |
164 | [ +0.000008] [3823017] 498 3823017 375957 157795 10375168 644 0 node |
165 | [ +0.000009] [3832928] 498 3832928 412391 193376 10780672 770 0 node |
166 | [ +0.000002] [3833199] 498 3833199 363063 143985 9474048 762 0 node |
167 | [ +0.000004] [3966752] 498 3966752 415068 194437 10231808 14 0 node |
168 | [ +0.000008] [4019421] 0 4019421 1156890 2906 606208 50 0 prometheus-rsys |
169 | [ +0.000040] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/system.slice/cassandra-a.service,task=java,pid=1535456,uid=115 |
170 | [ +0.002400] Out of memory: Killed process 1535456 (java) total-vm:749472544kB, anon-rss:21396056kB, file-rss:0kB, shmem-rss:0kB, UID:115 pgtables:1030364kB oom_score_adj:0 |
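To get a quick picture of where resident memory was sitting when the OOM killer fired, the task-state rows above can be summed per process name. The following is a minimal sketch, not a vetted tool: the names `TASK_ROW`, `PAGE_KIB`, and `rss_kib_by_name` are made up for illustration, and it assumes 4 KiB pages (the kernel reports these values in pages). The sample rows are copied from the dump above with whitespace normalized. As a sanity check on the page-size assumption, the first java row's rss of 5349014 pages times 4 gives 21396056 kB, which matches the anon-rss on the "Out of memory: Killed process" line.

```python
import re
from collections import defaultdict

PAGE_KIB = 4  # assumption: 4 KiB pages; the dump reports memory values in pages

# One row of the "Tasks state" table:
# [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
TASK_ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<pgtables_bytes>\d+)\s+(?P<swapents>\d+)\s+"
    r"(?P<oom_score_adj>-?\d+)\s+(?P<name>\S+)"
)


def rss_kib_by_name(dmesg_text):
    """Sum resident set size (in KiB) per process name from OOM task-state rows."""
    totals = defaultdict(int)
    for line in dmesg_text.splitlines():
        match = TASK_ROW.search(line)
        if match:
            totals[match["name"]] += int(match["rss"]) * PAGE_KIB
    return dict(totals)


# A few rows taken from the dump above (the three java processes and one node worker).
SAMPLE = """\
[ +0.000003] [1535456] 115 1535456 187368136 5349014 1055092736 112166 0 java
[ +0.000003] [1947254] 115 1947254 165026957 4999859 743841792 32206 0 java
[ +0.000009] [2169963] 115 2169963 149513155 4867441 767066112 15938 0 java
[ +0.000003] [2637376] 498 2637376 722106 481947 20185088 1706 0 node
"""

totals = rss_kib_by_name(SAMPLE)
for name, kib in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:8s} {kib / 1024 / 1024:6.1f} GiB")
```

Running the same function over the full attached dmesg output would also include the many node workers, which the four-row sample deliberately leaves out.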
Also of interest in the dmesg output are errors from the driver for the SAS controller. There are several instances of these, spanning days, in the full output (attached); an example is below.
1 | [Dec12 23:10] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k |
---|---|
2 | [ +0.318874] mpt3sas_cm0: _base_display_fwpkg_version: complete |
3 | [ +0.000008] mpt3sas_cm0: FW Package Ver(24.15.10.00) |
4 | [ +0.000750] mpt3sas_cm0: SAS3816: FWVersion(24.15.03.00), ChipRevision(0x00), BiosVersion(09.47.01.00) |
5 | [ +0.000002] NVMe |
6 | [ +0.000002] mpt3sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Diag Trace Buffer,Task Set Full,NCQ) |
7 | [ +0.000199] mpt3sas_cm0: Enable interrupt coalescing only for first 8 reply queues |
8 | [ +0.000115] mpt3sas_cm0: performance mode: balanced |
9 | [ +0.000016] mpt3sas_cm0: sending port enable !! |
10 | [ +9.422469] mpt3sas_cm0: port enable: SUCCESS |
11 | [ +0.000265] mpt3sas_cm0: search for end-devices: start |
12 | [ +0.000986] scsi target0:0:3: handle(0x0012), sas_addr(0x3f4fe0806260f508) |
13 | [ +0.000004] scsi target0:0:3: enclosure logical id(0x3f4ee08062092108), slot(8) |
14 | [ +0.000061] scsi target0:0:1: handle(0x0017), sas_addr(0x3f4ee0806260f50d) |
15 | [ +0.000003] scsi target0:0:1: enclosure logical id(0x3f4ee08062092108), slot(2) |
16 | [ +0.000061] scsi target0:0:2: handle(0x0018), sas_addr(0x3f4ee0806260f50e) |
17 | [ +0.000002] scsi target0:0:2: enclosure logical id(0x3f4ee08062092108), slot(1) |
18 | [ +0.000003] handle changed from(0x0019)!!! |
19 | [ +0.000061] scsi target0:0:0: handle(0x0019), sas_addr(0x3f4ee0806260f50f) |
20 | [ +0.000002] scsi target0:0:0: enclosure logical id(0x3f4ee08062092108), slot(0) |
21 | [ +0.000002] handle changed from(0x0018)!!! |
22 | [ +0.000062] mpt3sas_cm0: search for end-devices: complete |
23 | [ +0.000002] mpt3sas_cm0: search for end-devices: start |
24 | [ +0.000001] mpt3sas_cm0: search for PCIe end-devices: complete |
25 | [ +0.000002] mpt3sas_cm0: search for expanders: start |
26 | [ +0.000002] mpt3sas_cm0: search for expanders: complete |
27 | [ +0.000012] mpt3sas_cm0: mpt3sas_base_hard_reset_handler: SUCCESS |
28 | [ +0.000002] mpt3sas_cm0: _base_fault_reset_work: hard reset: success |
29 | [ +0.000013] mpt3sas_cm0: removing unresponding devices: start |
30 | [ +0.000005] mpt3sas_cm0: removing unresponding devices: end-devices |
31 | [ +0.000003] mpt3sas_cm0: Removing unresponding devices: pcie end-devices |
32 | [ +0.000003] mpt3sas_cm0: removing unresponding devices: expanders |
33 | [ +0.000002] mpt3sas_cm0: removing unresponding devices: complete |
34 | [ +0.000007] mpt3sas_cm0: scan devices: start |
35 | [ +0.000356] mpt3sas_cm0: scan devices: expanders start |
36 | [ +0.000066] mpt3sas_cm0: break from expander scan: ioc_status(0x0022), loginfo(0x310f0400) |
37 | [ +0.000003] mpt3sas_cm0: scan devices: expanders complete |
38 | [ +0.000003] mpt3sas_cm0: scan devices: end devices start |
39 | [ +0.001232] mpt3sas_cm0: break from end device scan: ioc_status(0x0022), loginfo(0x310f0400) |
40 | [ +0.000003] mpt3sas_cm0: scan devices: end devices complete |
41 | [ +0.000002] mpt3sas_cm0: scan devices: pcie end devices start |
42 | [ +0.000058] mpt3sas_cm0: break from pcie end device scan: ioc_status(0x0022), loginfo(0x310f0400) |
43 | [ +0.000002] mpt3sas_cm0: pcie devices: pcie end devices complete |
44 | [ +0.000002] mpt3sas_cm0: scan devices: complete |
45 | [ +0.122612] sd 0:0:1:0: Power-on or device reset occurred |
46 | [ +0.000040] sd 0:0:2:0: Power-on or device reset occurred |
47 | [ +0.000179] sd 0:0:0:0: Power-on or device reset occurred |
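Since these controller resets recur over several days, it may help to list them from the full dmesg output and compare their times against the OOM events. The sketch below is one hedged way to do that. It assumes the relative-time dmesg format seen above, where an absolute stamp such as `[Dec12 23:10]` appears occasionally and the lines after it carry only deltas, so each reported time is just the most recent absolute stamp (minute granularity at best). The names `find_hard_resets`, `RESET_OK`, and `ABS_TS` are illustrative only.

```python
import re

# Marker line the driver logs when a hard reset completes (taken from the excerpt above).
RESET_OK = "mpt3sas_base_hard_reset_handler: SUCCESS"

# Absolute timestamps in this dmesg format look like "[Dec12 23:10]".
ABS_TS = re.compile(r"\[(?P<stamp>[A-Z][a-z]{2}\s*\d{1,2} \d{2}:\d{2})\]")


def find_hard_resets(dmesg_text):
    """Return (last absolute timestamp seen, line) for each completed hard reset."""
    last_stamp = None
    hits = []
    for line in dmesg_text.splitlines():
        match = ABS_TS.search(line)
        if match:
            last_stamp = match["stamp"]
        if RESET_OK in line:
            hits.append((last_stamp, line.strip()))
    return hits


# Three lines from the excerpt above, enough to exercise the function.
SAMPLE = """\
[Dec12 23:10] mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
[ +0.000012] mpt3sas_cm0: mpt3sas_base_hard_reset_handler: SUCCESS
[ +0.000002] mpt3sas_cm0: _base_fault_reset_work: hard reset: success
"""

resets = find_hard_resets(SAMPLE)
for stamp, line in resets:
    print(stamp, "|", line)
```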
Update:
This has continued to happen, a total of nine times so far. It has only affected the recently added Dell R450s (see: T352468), it has only happened once per instance, and it has only affected the nodes in rack (row) b.
host | ooms | rack |
---|---|---|
restbase2028-a | ✔✔✔ | b |
restbase2028-b | ✔✔ | b |
restbase2028-c | ✔ | b |
restbase2029-a | ✔✔ | b |
restbase2029-b | ✔ | b |
restbase2029-c | ✔ | b |
restbase2030-a | ✔ | b |
restbase2030-b | | b |
restbase2030-c | ✔ | b |
restbase2031-a | | c |
restbase2031-b | | c |
restbase2031-c | | c |
restbase2032-a | | c |
restbase2032-b | | c |
restbase2032-c | | c |
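For keeping an eye on the pattern as more events come in, the table can be tallied per physical host and per rack. This is only a sketch: `ROWS` is transcribed by hand from the check marks in the table as it currently stands, `tally` is a made-up helper, and grouping instances by the part of the name before the final `-` assumes that `restbase2028-a`, `-b`, and `-c` are instances on the same machine.

```python
from collections import Counter

# (instance, number of check marks, rack), transcribed from the table above.
ROWS = [
    ("restbase2028-a", 3, "b"), ("restbase2028-b", 2, "b"), ("restbase2028-c", 1, "b"),
    ("restbase2029-a", 2, "b"), ("restbase2029-b", 1, "b"), ("restbase2029-c", 1, "b"),
    ("restbase2030-a", 1, "b"), ("restbase2030-b", 0, "b"), ("restbase2030-c", 1, "b"),
    ("restbase2031-a", 0, "c"), ("restbase2031-b", 0, "c"), ("restbase2031-c", 0, "c"),
    ("restbase2032-a", 0, "c"), ("restbase2032-b", 0, "c"), ("restbase2032-c", 0, "c"),
]


def tally(rows):
    """Sum OOM marks per physical host and per rack."""
    per_host = Counter()
    per_rack = Counter()
    for instance, ooms, rack in rows:
        host = instance.rsplit("-", 1)[0]  # assumption: "<host>-<instance letter>"
        per_host[host] += ooms
        per_rack[rack] += ooms
    return per_host, per_rack


per_host, per_rack = tally(ROWS)
for host in sorted(per_host):
    print(host, per_host[host])
for rack in sorted(per_rack):
    print("rack", rack, per_rack[rack])
```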