MessageGroupStats caused database query issues
Closed, Declined · Public

Description

There seems to have been database contention between 16:10 and midnight UTC on the 26th of June:

https://logstash.wikimedia.org/goto/3bcfe493d8d19739bca928d1b413d4e9

Several MessageGroupStats-related errors (such as Wikimedia\Rdbms\DatabaseMysqlBase::lock failed to acquire lock 'MessageGroupStats:updates') seem to have occurred on meta and mediawiki.org at that time. On commonly used database rows, that would point to an overload; on less commonly used functions, it would normally point to a bug.

Potentially related to T53410

Event Timeline

jcrespo created this task. Jun 27 2017, 7:39 AM
Restricted Application added a subscriber: Aklapper. Jun 27 2017, 7:39 AM

55 in mediawiki.org from /rpc/RunJobs.php?wiki=mediawikiwiki&type=TranslationsUpdateJob&maxtime=30&maxmem=300M
~4k in metawiki from /rpc/RunJobs.php?wiki=metawiki&type=MessageGroupStatesUpdaterJob&maxtime=60&maxmem=300M
~14k in metawiki from /w/api.php
~100 in metawiki from /wiki/Learning_and_Evaluation/...

Sounds like these were caused by large translatable page moves, perhaps related to T168591: Complex page move leaves some translation-related pages behind.

Krinkle closed this task as Declined. Jul 25 2018, 7:39 PM
Krinkle added a subscriber: Krinkle.

Zero hits in Logstash in at least the last 30 days for either "failed to acquire lock" or "TranslationsUpdateJob".

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:10 PM