MovePage::move contention on cebwiki
Open, Low, Public

Description

From 2019-04-19T00:56:35 until 2019-04-22T03:59:33 (that is 3 full days) there was database contention while running MovePage::move on the query:

UPDATE  `categorylinks` SET cl_sortkey = 'KITCHENER PARK (PARKE SA NUZELAND)',cl_collation = 'uppercase',cl_type = 'page',cl_timestamp=cl_timestamp WHERE cl_from = '6304172' AND cl_to = 'Articles,_Parke'

with variations in sortkey and cl_from, causing more than 11K occurrences of this error:

Deadlock found when trying to get lock; try restarting transaction

https://logstash.wikimedia.org/goto/c8fbb5713ea183d50dec02d5243152a4
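The error text itself suggests restarting the transaction. A minimal sketch of that pattern follows, assuming the caller can safely re-run the whole transaction. `DeadlockError` and `run_with_retry` are hypothetical names used for illustration, not MediaWiki APIs, and the backoff parameters are arbitrary.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying MySQL error code 1213."""


def run_with_retry(transaction, max_attempts=3, base_delay=0.05, sleep=time.sleep):
    """Run transaction() and restart it if it is rolled back as a deadlock victim."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                # Give up and surface the error after the last attempt.
                raise
            # Exponential backoff with jitter, so competing writers
            # are less likely to retry in lockstep.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
```

Retrying only hides the symptom. It does not remove the contention, so it would complement rather than replace any fix to the update pattern.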

This may be unavoidable given the (possibly large) number of API calls executed, but it caused database contention, so maybe it can be avoided, or at least it shows a place for potential optimization. For example, if a page is edited, we can assume that similar updates to pagelinks will happen, so maybe they can be batched. The improvement could also be in the tooling used, or in the assumptions made about the number of links and templates on a page.
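To make the batching idea concrete, here is a rough sketch that updates all category rows of one page in a single statement instead of one UPDATE per row. It uses an in-memory SQLite database purely as a stand-in. `update_sortkeys_batched` is a hypothetical helper, the table is reduced to four columns, and the second category name is invented for the example. Whether this would actually reduce deadlocks on the production database is not established here.

```python
import sqlite3


def update_sortkeys_batched(conn, page_id, sortkey, categories):
    """Update the sortkey of several categorylinks rows of one page at once."""
    if not categories:
        return 0
    placeholders = ",".join("?" for _ in categories)
    cur = conn.execute(
        "UPDATE categorylinks SET cl_sortkey = ?, cl_timestamp = cl_timestamp "
        f"WHERE cl_from = ? AND cl_to IN ({placeholders})",
        [sortkey, page_id, *categories],
    )
    return cur.rowcount


# In-memory SQLite stands in for the real database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categorylinks "
    "(cl_from INTEGER, cl_to TEXT, cl_sortkey TEXT, cl_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO categorylinks VALUES (?, ?, ?, ?)",
    [
        (6304172, "Articles,_Parke", "OLD", "t0"),
        (6304172, "Example_category", "OLD", "t0"),  # hypothetical second category
        (1, "Articles,_Parke", "OTHER", "t0"),       # different page, must stay untouched
    ],
)
changed = update_sortkeys_batched(
    conn,
    6304172,
    "KITCHENER PARK (PARKE SA NUZELAND)",
    ["Articles,_Parke", "Example_category"],
)
```

The `cl_timestamp = cl_timestamp` assignment is copied from the logged query. The sketch leaves out `cl_collation` and `cl_type` for brevity.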

Low priority because it is no longer happening, but it may be interesting to analyze and see if there is room for improvement.

Event Timeline

jcrespo created this task. Apr 23 2019, 8:38 AM
Restricted Application added a subscriber: Aklapper. Apr 23 2019, 8:38 AM
jcrespo updated the task description. Apr 23 2019, 8:44 AM
Anomie added a subscriber: Anomie.

I don't see anything in the API that seems like it would result in deadlocks here, so I'm going to move it to "non-core-API stuff" for now and add MediaWiki-Special-pages to investigate it from that angle. The API module is pretty simple and just calls MovePage to do most of the work.

At first glance I didn't see anything in MovePage that would obviously cause a deadlock either, but I didn't track through it too closely.

I suggest closing this rather than keeping it around if someone has already checked it. There is no reason to keep a backlog item if it is not clearly actionable, and we can reopen it if it recurs. Normally I don't create a ticket for this kind of error, but I was worried that it kept happening for days rather than hours or minutes.

The deadlocks came back, so I don't think we can close this for now. I still do not think it is high priority, but it is an ongoing event: https://logstash.wikimedia.org/goto/e5a2230fb3cd9c90155d5391fa54a484

Not sure if related, but now there seems to be contention on cebwiki for LinksUpdate::updateLinksTimestamp. This one looks more like a structural problem, as it seems a lot of connections are trying to update the same page row at the same time:

Deadlock found when trying to get lock; try restarting transaction (10.64.48.25)	UPDATE  `page` SET page_links_updated = '20190514085248' WHERE page_id = '7934287'

Cebwiki is known for its heavy use of templates, so this could be an edge case that may be interesting for Performance-Team to look at on page move or page update.

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:07 PM
Restricted Application added a project: Core Platform Team. Aug 28 2019, 11:07 PM
WDoranWMF triaged this task as Low priority. Sep 11 2019, 5:53 PM