During a network partition last week, Phabricator's DB proxy failed over to a read-only replica. Although the database remained available in read-only mode, Phabricator was unavailable even for read-only operations such as viewing a task:
Unhandled Exception ("AphrontQueryException") #1290: The MariaDB server is running with the --read-only option so it cannot execute this statement
After the proxy fails over, it has to be restored manually (this is by design, in order to prevent flapping) which means that Phabricator won't even partially self-heal from the network partition: it will be completely unavailable until fixed by human intervention. That in turn can make disaster recovery more difficult.
It would be better if read-only operations still worked while the database is in read-only mode.
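To illustrate the behaviour being asked for, here is a minimal sketch in Python (not Phabricator's actual PHP code). All names in it (`QueryError`, `ReadOnlyModeError`, `execute`, `render_task`, the `task` table and its columns) are hypothetical and exist only for this example. The only fact taken from the error above is the server error number 1290. The idea is that a page view treats its incidental write as optional, so the read still succeeds when the server rejects writes.

```python
# Hypothetical sketch: none of these names exist in Phabricator.
READ_ONLY_ERRNO = 1290  # error number from the exception quoted above


class QueryError(Exception):
    """Stand-in for a database driver error carrying a server error number."""

    def __init__(self, errno, message):
        super().__init__(message)
        self.errno = errno


class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while the database is read-only."""


def execute(run_query, sql):
    """Run sql via run_query, translating error 1290 into ReadOnlyModeError."""
    try:
        return run_query(sql)
    except QueryError as err:
        if err.errno == READ_ONLY_ERRNO:
            raise ReadOnlyModeError("database is read-only") from err
        raise  # any other database error is still fatal


def render_task(run_query, task_id):
    """Render a task title; the bookkeeping write is optional."""
    task_id = int(task_id)
    title = execute(run_query, f"SELECT title FROM task WHERE id = {task_id}")
    note = ""
    try:
        execute(run_query, f"UPDATE task SET views = views + 1 WHERE id = {task_id}")
    except ReadOnlyModeError:
        # Degrade gracefully: show the page and flag the reduced mode.
        note = " (read-only mode)"
    return f"{title}{note}"
```

With a `run_query` that rejects writes with error 1290, `render_task` still returns the title, with a read-only note appended, instead of failing the whole request.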
(Previously: It looks like this was noticed and declined in the context of an eqiad-codfw switchover, T232883.)
After we configured the database host to be m3-slave, set the correct TCP port, and had the DB grants fixed, we can now talk to the DB on a "non-active host", but the Phabricator phd service still refuses to start when the DB is read-only.
We know this from:
sudo -u phd /srv/phab/phabricator/bin/phd start
Freeing active task leases...
[2022-11-17 19:20:29] EXCEPTION: (AphrontQueryException) #1290: The MariaDB server is running with the --read-only option so it cannot execute this statement at [<phabricator>/src/infrastructure/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:396]
arcanist(), ava(), phabricator(), translations(), wmf-ext-misc()
  #0 AphrontBaseMySQLDatabaseConnection::throwQueryCodeException(integer, string) called at [<phabricator>/src/infrastructure/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:321]
  #1 AphrontBaseMySQLDatabaseConnection::throwQueryException(mysqli) called at [<phabricator>/src/infrastructure/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:217]
  #2 AphrontBaseMySQLDatabaseConnection::executeQuery(PhutilQueryString) called at [<phabricator>/src/infrastructure/storage/xsprintf/queryfx.php:8]
  #3 queryfx(AphrontMySQLiDatabaseConnection, string, string) called at [<phabricator>/src/applications/daemon/management/PhabricatorDaemonManagementWorkflow.php:522]
  #4 PhabricatorDaemonManagementWorkflow::freeActiveLeases() called at [<phabricator>/src/applications/daemon/management/PhabricatorDaemonManagementWorkflow.php:306]
  #5 PhabricatorDaemonManagementWorkflow::executeStartCommand(array) called at [<phabricator>/src/applications/daemon/management/PhabricatorDaemonManagementStartWorkflow.php:37]
  #6 PhabricatorDaemonManagementStartWorkflow::execute(PhutilArgumentParser) called at [<arcanist>/src/parser/argument/PhutilArgumentParser.php:492]
  #7 PhutilArgumentParser::parseWorkflowsFull(array) called at [<arcanist>/src/parser/argument/PhutilArgumentParser.php:377]
  #8 PhutilArgumentParser::parseWorkflows(array) called at [<phabricator>/scripts/daemon/manage_daemons.php:24]
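The trace shows the start command failing in freeActiveLeases(), which issues a write before any daemon is launched. One possible workaround, sketched here in Python under stated assumptions, is a pre-flight check that asks the server whether it is read-only and skips the phd start in that case. The function names and the `run_query` callable are hypothetical. The sketch assumes `run_query` returns the scalar result of the query, and it relies only on the standard MariaDB/MySQL `read_only` system variable.

```python
# Hypothetical pre-flight helper; not part of Phabricator or phd.

def database_is_read_only(run_query):
    """Return True if the server reports the read_only variable as enabled.

    Assumes run_query(sql) returns the single scalar value of the query.
    """
    value = run_query("SELECT @@global.read_only")
    return str(value).strip().upper() in ("1", "ON")


def plan_phd_start(run_query):
    """Decide whether starting phd makes sense on this host."""
    if database_is_read_only(run_query):
        # freeActiveLeases() would fail with error 1290, so do not try.
        return "skip"
    return "start"
```

A wrapper script or service unit could call `plan_phd_start` and only run `bin/phd start` when the result is "start". This avoids the exception on the inactive host but does not make the daemons themselves work against a read-only database.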
As pointed out by @brennen, the following upstream tickets relate to making Phabricator work in RO mode:
https://secure.phabricator.com/T4571
https://secure.phabricator.com/T10769
We would like to have this, so that a working RO phab is available in the inactive data center.