MediaWiki log spam during master unavailable / read only
Open, Needs Triage, Public

Description

[Action item from https://wikitech.wikimedia.org/wiki/Incident_documentation/20190923-s3_primary_db_master_crashed_-_s3_wikis_read-only ]

While the master was temporarily unavailable / read-only, MediaWiki was spamming logs with both errors and warnings at a rate of ~1.2k/s each (screenshot below). The affected channels were:

  • warnings from objectcache: "Lowered set() TTL for mediawikiwiki:page-restrictions:v1:989195:3414200 due to replication lag." (with changing keys)
  • errors from dbreplication: "Wikimedia\Rdbms\LoadBalancer::pickReaderIndex: all replica DBs lagged. Switch to read-only mode"

I think it'd be good to rate-limit (duplicated?) logs, particularly during exceptional conditions. I'm not sure whether there's a simple way to do that in PHP/MediaWiki, though. IMHO a rate limit applied when producing on a given channel+level would be ideal.
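To make the channel+level idea concrete, here is a minimal sketch of a fixed-window rate limit keyed on (channel, level). It is written in Python against the standard `logging` module purely for illustration; it is not MediaWiki or PHP code, and the class name `ChannelLevelRateLimit` and its `burst` / `period` / `clock` parameters are hypothetical. An equivalent for MediaWiki would have to live wherever log records are produced there.

```python
import logging
import time


class ChannelLevelRateLimit(logging.Filter):
    """Hypothetical filter: pass at most `burst` records per `period`
    seconds for each (channel, level) pair and drop the rest."""

    def __init__(self, burst=10, period=1.0, clock=time.monotonic):
        super().__init__()
        self.burst = burst
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self._windows = {}      # (channel, level) -> (window_start, emitted)
        self.dropped = {}       # (channel, level) -> number of dropped records

    def filter(self, record):
        # The logger name plays the role of the "channel" here.
        key = (record.name, record.levelno)
        now = self.clock()
        start, emitted = self._windows.get(key, (now, 0))
        if now - start >= self.period:
            # The previous window has expired: start a fresh one.
            start, emitted = now, 0
        if emitted < self.burst:
            self._windows[key] = (start, emitted + 1)
            return True
        # Over budget for this window: count the record and drop it.
        self._windows[key] = (start, emitted)
        self.dropped[key] = self.dropped.get(key, 0) + 1
        return False
```

Attached to a handler with `handler.addFilter(ChannelLevelRateLimit(burst=5, period=1.0))`, this caps each channel+level independently, so a flood of dbreplication errors would not consume the budget for objectcache warnings. Keeping a `dropped` count means the volume of suppressed records is still visible and could be reported periodically.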