recentchanges deadlocks for dewiki (db1058)
Closed, Resolved (Public)


A spike of transaction deadlocks appeared on dewiki today, involving dozens of queries of these two types fighting for locks:

DELETE /* RecentChangesUpdateJob::purgeExpiredRows  (jobrunner client IPs)
INSERT /* RecentChange::save  (all with bot user names, various MW client IPs)

Merlissimo noticed it:

[00:24:52] <Merlissimo>	 rc table for dewiki is missing updates for about twenty minutes. is this problem already known?
[00:25:45] <Krenair>	 in labs?
[00:25:53] <Merlissimo>	 in specialpages
[00:26:03] <Merlissimo>	 and in labs, too
[00:26:55] <TimStarling> looks correct to me
[00:28:07] <Merlissimo>	 between 21:47 and 22:05
[00:28:40] <Merlissimo>	 e.g. no new pages on in this time
[00:28:58] <Merlissimo>	 23:47-00:05 in german time
[00:29:47] <Merlissimo>	 e.g. is missing there
[00:50:08] <TimStarling>	 Merlissimo: we're looking at it

Need to:

  • Resolve this particular gap
  • Reduce the potential for such clashes

Event Timeline

Springle set the priority of this task to Needs Triage.
Springle updated the task description.
Springle added subscribers: Springle, tstarling, Merl.

Change 222626 had a related patch set uploaded (by Aaron Schulz):
Made recent changes purge jobs bail more aggressively
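
For readers unfamiliar with the idea, "bailing more aggressively" can be sketched roughly as follows. This is not the code from change 222626 (the patch itself is not shown in this thread); it is a minimal Python illustration that assumes a hypothetical in-process `purge_lock` standing in for a named database lock, a simplified `recentchanges` table with only `rc_id` and `rc_timestamp`, and made-up `batch_size` / `max_batches` parameters. The purge gives up immediately if another purge is running, and deletes in small, separately committed batches so row locks are held only briefly.

```python
import sqlite3
import threading

# Hypothetical stand-in for a named database lock shared by purge jobs.
purge_lock = threading.Lock()


def purge_expired_rows(conn, cutoff, batch_size=100, max_batches=10):
    """Delete rows older than `cutoff` in small batches.

    Returns the number of rows deleted. Returns 0 without touching the
    table if another purge currently holds the lock.
    """
    # Non-blocking acquire: bail instead of queueing behind another purge.
    if not purge_lock.acquire(blocking=False):
        return 0

    deleted = 0
    try:
        for _ in range(max_batches):
            cur = conn.execute(
                "DELETE FROM recentchanges WHERE rc_id IN ("
                "SELECT rc_id FROM recentchanges "
                "WHERE rc_timestamp < ? ORDER BY rc_timestamp LIMIT ?)",
                (cutoff, batch_size),
            )
            # Commit each batch so locks are released before the next one.
            conn.commit()
            deleted += cur.rowcount
            if cur.rowcount < batch_size:
                break  # nothing (or little) left to purge
    finally:
        purge_lock.release()
    return deleted
```

The cap on batches means a single run does a bounded amount of work; any remaining expired rows are left for the next run rather than being deleted in one long transaction that competes with inserts.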

The RC rebuild script in maintenance/ is too dumb and slow (and might lose other rows, given all the extensions we have); it would need modification at the least.

The 18-minute gap is probably best left alone, as RC rotates out anyway and people tend to stop caring well before 30 days. Watchlists might matter for a few weeks, but new pages wouldn't be on those anyway. FlaggedRevs still makes listing the unpatrolled new articles easy in this case.

Change 222626 merged by jenkins-bot:
Made recent changes purge jobs bail more aggressively

What's left to do in this task? Should this still be open?

aaron claimed this task.