MessageGroupStats caused database query issues
Closed, Declined | Public | PRODUCTION ERROR


There seems to have been database contention between 16:10 and midnight UTC on the 26th of June:

Several MessageGroupStats-related errors (such as Wikimedia\Rdbms\DatabaseMysqlBase::lock failed to acquire lock 'MessageGroupStats:updates') seem to have happened on meta during that window. On commonly used database rows, lock failures like this would point to an overload; on less commonly used functions, they would normally point to a bug.
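For illustration only, here is a minimal Python sketch of how a single named lock with a short timeout produces this kind of error when several updaters run at once. It is not the Translate extension's code: `NamedLocks` and `run_updates` are hypothetical names, and the in-process `threading.Lock` is an assumed stand-in for the database's named lock. Only the lock name comes from the error message above.

```python
import threading
import time

LOCK_NAME = "MessageGroupStats:updates"  # lock name taken from the error message


class NamedLocks:
    """Hypothetical in-process stand-in for a database's named locks."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def _get(self, name):
        # Create the per-name lock on first use, guarded against races.
        with self._guard:
            return self._locks.setdefault(name, threading.Lock())

    def lock(self, name, timeout):
        # Returns True if acquired within `timeout` seconds, else False.
        return self._get(name).acquire(timeout=timeout)

    def unlock(self, name):
        self._get(name).release()


def run_updates(locks, workers, hold_seconds, timeout):
    """Run `workers` concurrent updaters; return how many failed to lock."""
    failures = []

    def worker():
        if not locks.lock(LOCK_NAME, timeout):
            failures.append("failed to acquire lock")
            return
        try:
            time.sleep(hold_seconds)  # simulated stats update
        finally:
            locks.unlock(LOCK_NAME)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(failures)
```

In this sketch, a burst of updaters that each hold the lock longer than the others are willing to wait yields one success and many failures. That is the pattern one might expect if a large batch of jobs all tried to update the same stats at once.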

Potentially related to T53410

Event Timeline

55 in mediawikiwiki from /rpc/RunJobs.php?wiki=mediawikiwiki&type=TranslationsUpdateJob&maxtime=30&maxmem=300M
~4k in metawiki from /rpc/RunJobs.php?wiki=metawiki&type=MessageGroupStatesUpdaterJob&maxtime=60&maxmem=300M
~14k in metawiki from /w/api.php
~100 in metawiki from /wiki/Learning_and_Evaluation/...

Sounds like these were caused by large translatable page moves, perhaps related to T168591: Complex page move leaves some translation-related pages behind.

Krinkle subscribed.

Zero hits in Logstash in at least 30 days for either of "failed to acquire lock" or "TranslationsUpdateJob".

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:10 PM