Split from T216484
If the master DB goes down, the read-only message says it is waiting for the replicas to catch up.
This is misleading; the error message should say something like "the master DB is down".
Subject | Repo | Branch | Lines +/-
---|---|---|---
rdbms: simplify LoadBalancer::getLaggedReplicaMode() | mediawiki/core | master | +4 -11
The error comes from the following code in rdbms/LoadBalancer.php:

    public function getReadOnlyReason( $domain = false, IDatabase $conn = null ) {
        if ( $this->readOnlyReason !== false ) {
            return $this->readOnlyReason;
        } elseif ( $this->getLaggedReplicaMode( $domain ) ) {
            if ( $this->allReplicasDownMode ) {
                return 'The database has been automatically locked ' .
                    'until the replica database servers become available';
            } else {
                return 'The database has been automatically locked ' .
                    'while the replica database servers catch up to the master.';
            }
This code and the getLaggedReplicaMode() method appear to work as intended.
I suspect that somewhere earlier in the code, it is unable to compare something between the master and the replica. If the bottom value in that comparison is interpreted as newer, the master will appear to be far ahead of the replica, which leads to this error.
For the general case of a master being down for maintenance, the code is known to behave correctly. However, in this case it was temporarily unavailable in an unexpected way. I'm classifying this as low priority for now, as only the error message is wrong. The logical behaviour of the code is as expected, which is that we automatically enable "read-only" mode until the master and its replication are back up.
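To make the requested wording concrete, here is a minimal sketch of a reason selector that reports a master outage separately from replica lag. It is not the LoadBalancer code: `describeReadOnlyReason()`, its three boolean parameters, and the "master database server is unavailable" message are all hypothetical, and how LoadBalancer would actually learn that the master is unreachable is left open.

```php
<?php
// Hypothetical helper, not part of MediaWiki: picks a read-only reason from
// three assumed inputs so that a master outage is reported as such.
function describeReadOnlyReason(
	bool $masterReachable,
	bool $allReplicasDown,
	bool $replicasLagged
) {
	if ( !$masterReachable ) {
		// Check the master first, so its outage is not mistaken for lag.
		// This message text is an assumption, not an existing string.
		return 'The database has been automatically locked ' .
			'because the master database server is unavailable.';
	} elseif ( $allReplicasDown ) {
		return 'The database has been automatically locked ' .
			'until the replica database servers become available';
	} elseif ( $replicasLagged ) {
		return 'The database has been automatically locked ' .
			'while the replica database servers catch up to the master.';
	}
	// Not read-only for any of the reasons modelled here.
	return false;
}
```

The only design point is the ordering: when the master is unreachable, the replicas may also look lagged, so the master check has to come before the lag check for the more specific message to win.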
Re-tagging on the main workboard for @aaron to review when he's back. I recall this area being refactored in the last two weeks, which may have resolved the issue reported here. If not, it should at least be fresh in mind and perhaps easy to fix.
Change 529189 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] rdbms: simplify LoadBalancer::getLaggedReplicaMode()
Change 529189 merged by jenkins-bot:
[mediawiki/core@master] rdbms: simplify LoadBalancer::getLaggedReplicaMode()
Change 583350 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] rdbms: Fix unprocessed "{host}" in LoadMonitor replag message