recentchanges deadlocks for dewiki (db1058)
Closed, Resolved · Public

Description

Spike of transaction deadlocks appeared on dewiki today, involving dozens of these two query types fighting for locks:

DELETE /* RecentChangesUpdateJob::purgeExpiredRows  (jobrunner client IPs)
INSERT /* RecentChange::save  (all with bot user names, various MW client IPs)

Merlissimo noticed it:

[00:24:52] <Merlissimo>	 rc table for dewiki is missing updates for about twenty minutes. is this problem already known?
[00:25:45] <Krenair>	 in labs?
[00:25:53] <Merlissimo>	 in specialpages
[00:26:03] <Merlissimo>	 and in labs, too
[00:26:55] <TimStarling>	 https://de.wikipedia.org/wiki/Spezial:Letzte_%C3%84nderungen looks correct to me
[00:28:07] <Merlissimo>	 between 21:47 and 22:05
[00:28:40] <Merlissimo>	 e.g. no new pages on https://de.wikipedia.org/wiki/Spezial:Neue_Seiten in this time
[00:28:58] <Merlissimo>	 23:47-00:05 in german time
[00:29:47] <Merlissimo>	 e.g. https://de.wikipedia.org/wiki/Werbowe_%28Polohy%29 is missing there
[00:50:08] <TimStarling>	 Merlissimo: we're looking at it

Need to:

  • Resolve this particular gap
  • Reduce the potential for such clashes

Event Timeline

Springle created this task. Jul 3 2015, 1:15 AM
Springle raised the priority of this task from to Needs Triage.
Springle updated the task description.
Springle added subscribers: Springle, tstarling, Merl.
Restricted Application added a subscriber: Aklapper. Jul 3 2015, 1:15 AM
Springle set Security to None. Jul 3 2015, 1:17 AM
Springle added a subscriber: jcrespo.

Change 222626 had a related patch set uploaded (by Aaron Schulz):
Made recent changes purge jobs bail more aggressively

https://gerrit.wikimedia.org/r/222626
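
For illustration only, here is one way a purge job could "bail" rather than fight for locks. This is a minimal sketch under stated assumptions, not the content of change 222626: it uses Python with an in-memory SQLite table instead of MediaWiki's PHP and MySQL, a process-local `threading.Lock` as a hypothetical stand-in for a per-wiki named lock, and a simplified `recentchanges` schema with only `rc_id` and `rc_timestamp`. The function name and its parameters are invented for the example.

```python
import sqlite3
import threading
import time

# Hypothetical stand-in for a per-wiki named lock (e.g. a database advisory lock).
purge_lock = threading.Lock()


def purge_expired_rows(conn, cutoff, batch_size=100, max_seconds=1.0):
    """Delete rows older than `cutoff` in small batches.

    Returns the number of rows deleted, or None if the job bailed
    because another purge already holds the lock.
    """
    # Bail immediately instead of queueing behind another purge.
    if not purge_lock.acquire(blocking=False):
        return None
    deleted = 0
    deadline = time.monotonic() + max_seconds
    try:
        # Stop once the time budget is spent; a later job can finish the work.
        while time.monotonic() < deadline:
            # Pick a small batch of primary keys first, then delete by key,
            # so each transaction touches few rows and commits quickly.
            ids = [row[0] for row in conn.execute(
                "SELECT rc_id FROM recentchanges "
                "WHERE rc_timestamp < ? LIMIT ?",
                (cutoff, batch_size))]
            if not ids:
                break
            conn.executemany(
                "DELETE FROM recentchanges WHERE rc_id = ?",
                [(rc_id,) for rc_id in ids])
            conn.commit()
            deleted += len(ids)
    finally:
        purge_lock.release()
    return deleted
```

The idea is that short, key-based delete batches hold row locks only briefly, and a job that cannot get the lock gives up at once, leaving concurrent inserts less to contend with.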

aaron added a comment. Jul 3 2015, 6:16 PM

The RC rebuild script in maintenance/ is too dumb and slow (and might lose other rows, given all the extensions we have), so it would need modification at the least.

The 18-minute gap is probably best left alone, as RC rows rotate out anyway and people tend to stop caring well before 30 days. Watchlists might matter for a few weeks, but new pages wouldn't be on those anyway. FlaggedRevs still makes it easy to list the unpatrolled new articles in this case.

Change 222626 merged by jenkins-bot:
Made recent changes purge jobs bail more aggressively

https://gerrit.wikimedia.org/r/222626

What's left to do in this task? Should this still be open?

aaron closed this task as Resolved. Jul 26 2015, 9:02 AM
aaron claimed this task.