Page MenuHomePhabricator

Database error "Duplicate entry" for PRIMARY key (from EchoNotificationMapper::insert)
Open, Needs TriagePublic

Description

Error

Request ID: XHQeDQpAMDgAAFUtII0AAADO

message
Error: 1062 Duplicate entry '####-####' for key 'PRIMARY' (10.64.32.136)
Function: EchoNotificationMapper::insert
Query: INSERT  INTO `echo_notification` (notification_event,notification_user,notification_timestamp,notification_read_timestamp,notification_bundle_base,notification_bundle_hash,notification_bundle_display_hash) VALUES (###,####,'2019#',NULL,'#','#############','###############')
trace
#3 /srv/mediawiki/php-1.33.0-wmf.18/extensions/Echo/includes/mapper/NotificationMapper.php(46): Wikimedia\Rdbms\Database->insert(string, array, string)
#4 /srv/mediawiki/php-1.33.0-wmf.18/includes/libs/rdbms/database/Database.php(3806): Closure$EchoNotificationMapper::insert(Wikimedia\Rdbms\DatabaseMysqli, string)
#5 /srv/mediawiki/php-1.33.0-wmf.18/includes/deferred/AtomicSectionUpdate.php(35): Wikimedia\Rdbms\Database->doAtomicSection(string, Closure$EchoNotificationMapper::insert;2098)
#6 /srv/mediawiki/php-1.33.0-wmf.18/includes/deferred/DeferredUpdates.php(270): AtomicSectionUpdate->doUpdate()
#7 /srv/mediawiki/php-1.33.0-wmf.18/includes/deferred/DeferredUpdates.php(216): DeferredUpdates::runUpdate(AtomicSectionUpdate, Wikimedia\Rdbms\LBFactoryMulti, string, integer)
#8 /srv/mediawiki/php-1.33.0-wmf.18/includes/deferred/DeferredUpdates.php(140): DeferredUpdates::execute(array, string, integer)
#9 /srv/mediawiki/php-1.33.0-wmf.18/extensions/EventBus/includes/JobExecutor.php(90): DeferredUpdates::doUpdates()
#10 /srv/mediawiki/rpc/RunSingleJob.php(77): JobExecutor->execute(array)

Impact

The impact is limited to Echo itself, in that the problem is during a deferred update which is has its execution isolated from the job itself and from other deferred updates.

Impact on Echo itself when this particular update does not happen is: (unknown.

Notes

About 11,000 job queue executions encountered this exception in the past 7 days.

Started shortly after promotion of the 1.33-wmf.18 branch.

Event Timeline

Krinkle created this task.Feb 25 2019, 7:21 PM
Restricted Application added a project: Growth-Team. · View Herald TranscriptFeb 25 2019, 7:21 PM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Krinkle updated the task description. (Show Details)Feb 25 2019, 7:21 PM

Still seen. 17,000 exceptions matching this pattern in the last 7 days (job runners).

I noticed that the spikes tend to share the same request ID, e.g. XIGyfgpAAEEAADvI50oAAAAA ran around 2019-03-08T00:17 and produced 7000 exceptions over the course of 14 seconds. Most of them look like this:

EchoNotificationMapper::insert	Duplicate entry '######-#####' for key 'PRIMARY'  .. ()	INSERT  INTO `echo_notification` (notification_event,notification_user,notification_timestamp,notification_read_timestamp,notification_bundle_base,notification_bundle_hash,notification_bundle_display_hash) VALUES ( ... )

While I guess this is technically not an error, as the primary key is doing what it should be doing (preventing duplicate entries), I would like to mention another spike of EchoNotificationMapper::insert on Mediawiki.org of 2000+ log entries at 07:46 today: https://logstash.wikimedia.org/goto/e98df3a0833591ae969d850d8bfee91f Probably worth looking at to see if there is an underlying issue with concurrency.

	-- Key to event_id
	notification_event int unsigned not null,
	-- Key to user_id
	notification_user int unsigned not null,

PRIMARY KEY (notification_user, notification_event)

So when the event is already assigned to the user, than it seems okay to ignore the insert.
The question is, how it is possible to end up with many jobs trying to insert these row.
Maybe there are all inserting the event 0 or user 0 when the echo event is not created correctly or the id is not transfered after insert.

Krinkle updated the task description. (Show Details)Jun 25 2019, 2:56 PM

The spacing on this is kind of weird (query):

Maybe that offers some clues.

We can revisit this in Q2; if someone wants to investigate and propose a patch before then we (Growth-Team) can make some time to review.

Restricted Application added a project: Analytics. · View Herald TranscriptJul 16 2019, 9:30 PM
kostajh moved this task from Inbox to Q2 2019-20 on the Growth-Team board.Jul 16 2019, 9:30 PM

Doesn't appear in the last 30 days in Logstash.

Milimetric moved this task from Incoming to Radar on the Analytics board.Jul 18 2019, 4:37 PM
mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:07 PM
Krinkle removed a project: Analytics.