While attempting failover during the incident documented at https://wikitech.wikimedia.org/w/index.php?title=Incident_documentation/20170119-Labstore, running nfs-manage up failed because DRBD's split-brain protection refused to promote the failover node to primary. Recovery required a --force run of drbdadm primary all. Should --force be the default for nfs-manage up?
Description
Details
Subject | Repo | Branch | Lines +/- |
---|---|---|---|
labstore: notes in nfs-manage for failover | operations/puppet | production | +4 -0 |
Event Timeline
Since we already have our own cautious check for anything displaying Primary, we could prompt the user whenever we find that string, asking them either to acknowledge the use of --force or to exit.
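A minimal sketch of that prompt idea, in Python for illustration only. The function name decide_promotion, the return values, and the sample status strings are all hypothetical and are not taken from nfs-manage; the only thing carried over from the comment above is the check for the string Primary.

```python
def decide_promotion(status_text, ask=input):
    """Return 'normal', 'force', or 'abort' for a hypothetical promotion step.

    status_text is whatever DRBD status output the script already inspects;
    ask is injectable so the prompt can be exercised without a terminal.
    """
    # Nothing reports Primary: an ordinary promotion can be attempted.
    if "Primary" not in status_text:
        return "normal"
    # Something already claims Primary: make the operator choose explicitly
    # instead of forcing by default.
    answer = ask(
        "A DRBD resource reports Primary. "
        "Type 'force' to proceed with --force, anything else to exit: "
    )
    return "force" if answer.strip().lower() == "force" else "abort"


if __name__ == "__main__":
    # Illustrative status string only; real output depends on the DRBD version.
    print(decide_promotion("ro:Primary/Unknown"))
```

Anything other than an explicit "force" aborts, so an accidental Enter keypress never escalates to a forced promotion.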
Change 442838 had a related patch set uploaded (by Rush; owner: cpettet):
[operations/puppet@production] labstore: notes in nfs-manage for failover
Change 442838 merged by Rush:
[operations/puppet@production] labstore: notes in nfs-manage for failover
I'm going to say "no" on this one. If DRBD is in split brain, we should deal with it at the time, according to whichever side we have deemed the "good side", rather than leaving the script to take extreme actions by default. There is more than one way to resolve such a situation, and sometimes the right fix is invalidating the other side rather than forcing one side.