I just depooled tools-sgeexec-0907. It was alerting for grid errors and puppet staleness, and the kernel logs suggest its filesystem is corrupted:
[10980873.951531] EXT4-fs (dm-0): error count since last fsck: 3087
[10980873.951568] EXT4-fs (dm-0): initial error at time 1620058872: ext4_lookup:1623: inode 2
[10980873.951573] EXT4-fs (dm-0): last error at time 1627557661: ext4_lookup:1623: inode 2
[11023343.448022] Process accounting resumed
[11073147.677663] EXT4-fs (dm-0): error count since last fsck: 3087
[11073147.677703] EXT4-fs (dm-0): initial error at time 1620058872: ext4_lookup:1623: inode 2
[11073147.677708] EXT4-fs (dm-0): last error at time 1627557661: ext4_lookup:1623: inode 2
[11109742.533438] Process accounting resumed
[11165421.407818] EXT4-fs (dm-0): error count since last fsck: 3087
[11165421.407822] EXT4-fs (dm-0): initial error at time 1620058872: ext4_lookup:1623: inode 2
[11165421.407824] EXT4-fs (dm-0): last error at time 1627557661: ext4_lookup:1623: inode 2
[11189905.826660] EXT4-fs (dm-0): Delayed block allocation failed for inode 790288 at logical offset 131072 with max blocks 2048 with error 117
[11189905.840635] EXT4-fs (dm-0): This should not happen!! Data will be lost
[11191417.887712] EXT4-fs (dm-0): Delayed block allocation failed for inode 278920 at logical offset 32768 with max blocks 2048 with error 117
[11191417.900359] EXT4-fs (dm-0): This should not happen!! Data will be lost
[11196141.580946] Process accounting resumed
[11250050.060176] EXT4-fs (dm-0): Delayed block allocation failed for inode 280138 at logical offset 32768 with max blocks 2048 with error 117
[11250050.064456] EXT4-fs (dm-0): This should not happen!! Data will be lost
[11250050.067979] Aborting journal on device dm-0-8.
[11250050.115162] EXT4-fs error: 425 callbacks suppressed
[11250050.115182] EXT4-fs error (device dm-0) in ext4_dx_add_entry:2355: Journal has aborted
[11250050.671391] EXT4-fs error (device dm-0): ext4_journal_check_start:61: Detected aborted journal
[11250050.676588] EXT4-fs (dm-0): Remounting filesystem read-only
[11250050.682315] EXT4-fs error (device dm-0) in ext4_evict_inode:273: Journal has aborted
[11257695.133816] EXT4-fs (dm-0): error count since last fsck: 3091
[11257695.133851] EXT4-fs (dm-0): initial error at time 1620058872: ext4_lookup:1623: inode 2
[11257695.133867] EXT4-fs (dm-0): last error at time 1631309016: ext4_evict_inode:273: inode 2
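For anyone reading the log, the "initial error at time" and "last error at time" values are Unix epoch seconds, so they can be turned into dates to see how long the filesystem has been reporting errors. The following is only a rough sketch: the function name `error_times` and the regex are illustrative, not part of any existing tooling, and the sample lines are copied from the log above. The errno lookup assumes a Linux host, where 117 maps to EUCLEAN ("Structure needs cleaning").

```python
import errno
import re
from datetime import datetime, timezone

# Hypothetical helper: matches the "initial/last error at time <epoch>" lines.
EPOCH_RE = re.compile(r"(initial|last) error at time (\d+)")


def error_times(log_text):
    """Return the UTC datetime of the initial and last ext4 error.

    If a kind appears more than once, the value seen latest in the text wins.
    """
    times = {}
    for kind, epoch in EPOCH_RE.findall(log_text):
        times[kind] = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return times


# Sample lines copied from the kernel log in this comment.
sample = """\
[10980873.951568] EXT4-fs (dm-0): initial error at time 1620058872: ext4_lookup:1623: inode 2
[10980873.951573] EXT4-fs (dm-0): last error at time 1627557661: ext4_lookup:1623: inode 2
[11257695.133867] EXT4-fs (dm-0): last error at time 1631309016: ext4_evict_inode:273: inode 2
"""

times = error_times(sample)
for kind, when in times.items():
    print(f"{kind} error: {when.isoformat()}")

# On Linux this prints EUCLEAN; other platforms may not define errno 117.
print("error 117:", errno.errorcode.get(117, "not defined on this platform"))
```

With the values from this log, the first recorded error dates from early May 2021 and the most recent from 10 September 2021, so the corruption was being reported for months before the journal abort and the read-only remount.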