The s4 master paged because of faulty memory (https://phabricator.wikimedia.org/T253808)
Looking at the SEL, we had a window of ~18 hours where we could have noticed this in advance and, e.g., done a failover to a different server:
4 | May-27-2020 | 01:33:44 | Mem ECC Warning | Memory | transition to Critical from less severe
5 | May-27-2020 | 20:20:26 | ECC Uncorr Err | Memory | Uncorrectable memory error
We should look into options to monitor/alert for this (at least for critical systems like DB masters).
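
One possible shape for such a check, as a rough sketch only: parse the SEL text and flag memory events before they become uncorrectable. This assumes the SEL is available as pipe-separated lines in the same six-column layout as the entries quoted above. The names `SelEntry`, `parse_sel` and `memory_check` are hypothetical, and the mapping of events to Nagios-style status codes (0 = OK, 1 = WARNING, 2 = CRITICAL) is an assumption, not an existing check.

```python
from typing import List, NamedTuple, Tuple


class SelEntry(NamedTuple):
    """One SEL record, in the six-column layout quoted in this task."""
    record_id: str
    date: str
    time: str
    name: str
    sensor_type: str
    event: str


def parse_sel(text: str) -> List[SelEntry]:
    """Parse pipe-separated SEL lines; skip headers and malformed lines."""
    entries = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Expect exactly six columns, the first being a numeric record ID.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        entries.append(SelEntry(*fields))
    return entries


def memory_check(text: str) -> Tuple[int, str]:
    """Return a (status, message) pair for memory events in the SEL.

    Assumed severity mapping: any memory event mentioning "critical" or
    "uncorrectable" is CRITICAL (2), any other memory event is WARNING (1),
    and no memory events at all is OK (0).
    """
    memory_events = [e for e in parse_sel(text) if e.sensor_type == "Memory"]
    if not memory_events:
        return 0, "OK: no memory events in SEL"
    severe = [
        e for e in memory_events
        if "critical" in e.event.lower() or "uncorrectable" in e.event.lower()
    ]
    worst = (severe or memory_events)[-1]
    status = 2 if severe else 1
    label = "CRITICAL" if severe else "WARNING"
    return status, f"{label}: {worst.name} ({worst.event}) at {worst.date} {worst.time}"
```

With the two entries above as input, the first entry alone would already yield a CRITICAL status, i.e. at the start of the ~18 hour window rather than at the uncorrectable error.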