purgeParserCache.php: Cannot purge this kind of parser cache
Open, Medium, Public

Description

The purgeParserCache.php maintenance script is failing intermittently, with output that looks like this: P10981

The error at the end is

Apr 14 20:43:25 mwmaint1002 mediawiki_job_parser_cache_purging[210524]: Cannot purge this kind of parser cache.

followed by exit status 1. This happens on some runs, but not all; the rest finish successfully.

I've been converting maintenance cronjobs to systemd timers, which means among other things we actually get notified when they fail -- so it's entirely possible this has been happening for a while now.

Source: maintenance/purgeParserCache.php#75

		$pc = MediaWikiServices::getInstance()->getParserCache()->getCacheStorage();
		$success = $pc->deleteObjectsExpiringBefore( $date, [ $this, 'showProgressAndWait' ] );
		if ( !$success ) {
			$this->fatalError( "\nCannot purge this kind of parser cache." );
		}

Event Timeline

The error message is incorrect/deceptive. It was written to account for BagOStuff implementations that lack a deleteObjectsExpiringBefore implementation and therefore return false from the base class stub. This would e.g. be seen by developers locally, or by third parties, if they set ParserCache to something that already has its own expiry mechanism (not Sql-based).

However, the current SqlBagOStuff::deleteObjectsExpiringBefore also returns false if any of the DB connection or DB write/delete operations fail.

SqlBagOStuff.php
	public function deleteObjectsExpiringBefore() {
		

		$ok = true;

		$keysDeletedCount = 0;
		foreach ( $shardIndexes as $numServersDone => $shardIndex ) {
			$db = null; // …
			try {
				$db = $this->getConnection( $shardIndex );
				$this->deleteServerObjectsExpiringBefore(
					
				);
			} catch ( DBError $e ) {
				$this->handleWriteError( $e, $db, $shardIndex );
				$ok = false;
			}
		}

		return $ok;
	}

So next steps:

  • What are the errors encountered by this script? They should be in Logstash: correlate by hostname and rough timestamp, then check cli_argv to confirm the entry is from purgeParserCache.php. Probably one of the exception/error/DBConnection/DBQuery channels.
  • Are they tolerable?
    • If so, update deleteObjectsExpiringBefore to not return false for those cases.
    • Do we need to monitor this by other means instead? (e.g. something less boolean)
  • Are they not tolerable, and e.g. warrant some kind of fix?
    • Yay for having been alerted to it, and let's fix it :)
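
To illustrate what "something less boolean" could look like, here is a minimal standalone sketch. It is not MediaWiki code: PurgeResult, purgeShards, describePurgeResult and the shard names pc1/pc2 are all hypothetical, and RuntimeException stands in for DBError so the snippet runs on its own. The point is only that "this backend cannot purge" and "a purge failed on some shard" are kept apart, so the script can print an accurate message for each.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical value object (not part of MediaWiki): records whether the
 * purge is unsupported, succeeded everywhere, or failed on some shards.
 */
final class PurgeResult {
	public const UNSUPPORTED = 'unsupported';
	public const OK = 'ok';
	public const PARTIAL_FAILURE = 'partial-failure';

	/** @var string One of the constants above */
	public $status;
	/** @var string[] Names of the shards whose purge threw */
	public $failedShards;

	public function __construct( string $status, array $failedShards = [] ) {
		$this->status = $status;
		$this->failedShards = $failedShards;
	}
}

/**
 * Hypothetical helper: run one purge callback per shard, keep going after a
 * failure, and remember which shards failed instead of collapsing to false.
 *
 * @param array<string,callable> $shardPurgers Shard name => purge callback
 */
function purgeShards( array $shardPurgers ): PurgeResult {
	$failed = [];
	foreach ( $shardPurgers as $name => $purge ) {
		try {
			$purge();
		} catch ( RuntimeException $e ) {
			// Stand-in for the DBError handling in the excerpt above.
			$failed[] = (string)$name;
		}
	}
	$status = $failed === [] ? PurgeResult::OK : PurgeResult::PARTIAL_FAILURE;
	return new PurgeResult( $status, $failed );
}

/** Map a result to the message a maintenance script could print. */
function describePurgeResult( PurgeResult $result ): string {
	switch ( $result->status ) {
		case PurgeResult::OK:
			return 'Done';
		case PurgeResult::UNSUPPORTED:
			return 'Cannot purge this kind of parser cache.';
		default:
			return 'Purge failed on shard(s): ' . implode( ', ', $result->failedShards );
	}
}

// Usage: one healthy shard and one that throws.
$result = purgeShards( [
	'pc1' => function () {
	},
	'pc2' => function () {
		throw new RuntimeException( 'connection lost' );
	},
] );
echo describePurgeResult( $result ), "\n";
```

With a result like this, the "Cannot purge this kind of parser cache" message would only appear for backends that really lack the method, and transient DB errors could be reported (or tolerated) separately.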
fgiunchedi triaged this task as Medium priority. Apr 15 2020, 8:44 AM
Aklapper added a subscriber: AMooney.

@AMooney: Assuming that "Set projects" was accidentally used instead of "Add projects", hence restoring some previous project tags.