
Could not enqueue jobs: "Unable to deliver all events: 500: Internal Server Error"
Closed, Duplicate · Public

Description

Error

MediaWiki version: 1.35.0-wmf.1

message
Could not enqueue jobs: Unable to deliver all events: 500: Internal Server Error

Impact

Notes

Details

Request ID
69c779afb828e20acac52fda
Request URL
/rpc/RunSingleJob.php
Stack Trace
exception.trace
#0 /includes/jobqueue/JobQueue.php(350): JobQueueEventBus->doBatchPush(array, integer)
#1 /includes/jobqueue/JobQueue.php(320): JobQueue->batchPush(array, integer)
#2 /includes/jobqueue/JobQueueGroup.php(161): JobQueue->push(array)
#3 /includes/deferred/JobQueueEnqueueUpdate.php(58): JobQueueGroup->push(array)
#4 /includes/deferred/DeferredUpdates.php(383): JobQueueEnqueueUpdate->doUpdate()
#5 /includes/deferred/DeferredUpdates.php(281): DeferredUpdates::attemptUpdate(JobQueueEnqueueUpdate, Wikimedia\Rdbms\LBFactoryMulti)
#6 /includes/deferred/DeferredUpdates.php(226): DeferredUpdates::run(JobQueueEnqueueUpdate, Wikimedia\Rdbms\LBFactoryMulti, Monolog\Logger, BufferingStatsdDataFactory, string)
#7 /includes/deferred/DeferredUpdates.php(149): DeferredUpdates::handleUpdateQueue(array, string, integer)
#8 /extensions/EventBus/includes/JobExecutor.php(96): DeferredUpdates::doUpdates()
#9 /srv/mediawiki/rpc/RunSingleJob.php(76): JobExecutor->execute(array)
#10 {main}

Event Timeline

Krinkle created this task. Oct 13 2019, 12:40 AM
Restricted Application added a subscriber: Aklapper. Oct 13 2019, 12:40 AM

Not frequent, but I've never seen this kind of error before. Where is the HTTP 500 response coming from? This is worth investigating because, short of an overall job queue outage, it should always be possible to at least queue jobs.

Still seen. Caused loss of some jobs from frwiki today. Impact unknown as we don't know which jobs were lost.

daniel triaged this task as High priority. Nov 11 2019, 8:08 PM
Joe added a subscriber: Joe. Mar 30 2020, 10:44 AM

Not frequent, but I've never seen this kind of error before. Where is the HTTP 500 response coming from? This is worth investigating because, short of an overall job queue outage, it should always be possible to at least queue jobs.

The 500 response is coming from eventgate-main. I still see those repeating relatively often even today.

We have multiple classes of errors that are due to issues calling eventgate from MW. This specific one, I would assume, is due to a PayloadTooLarge error. The solution for this particular one is conceptually very simple: don't try to send batches that are too large. There's already a task for that, T232392, so I'm going to close this one as a duplicate.
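
To illustrate the "don't send batches that are too large" idea, here is a minimal sketch in Python. It is not the EventBus or MediaWiki code. The function name `chunk_by_size` and the `max_bytes` parameter are hypothetical, and the actual request size limit that eventgate enforces is not stated in this task, so the value used below is only a placeholder. The sketch assumes a batch is sent as a compact JSON array of events.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def chunk_by_size(
    events: Iterable[Dict[str, Any]], max_bytes: int
) -> Iterator[List[Dict[str, Any]]]:
    """Yield batches whose compact JSON array encoding fits in max_bytes.

    Hypothetical helper: max_bytes stands in for whatever limit the
    receiving service enforces; that limit is an assumption here.
    """
    batch: List[Dict[str, Any]] = []
    batch_size = 2  # the enclosing "[" and "]"

    for event in events:
        encoded = json.dumps(event, separators=(",", ":")).encode("utf-8")
        event_size = len(encoded)

        # A single event that cannot fit even alone can never be delivered.
        if event_size + 2 > max_bytes:
            raise ValueError(f"single event of {event_size} bytes exceeds limit")

        # Account for the "," separator when the batch already has members.
        extra = event_size + (1 if batch else 0)

        if batch and batch_size + extra > max_bytes:
            # Flush the current batch and start a new one with this event.
            yield batch
            batch = []
            batch_size = 2
            extra = event_size

        batch.append(event)
        batch_size += extra

    if batch:
        yield batch


if __name__ == "__main__":
    # Placeholder limit, chosen only to make the splitting visible.
    sample = [{"id": i, "data": "x" * 10} for i in range(5)]
    for b in chunk_by_size(sample, max_bytes=64):
        print(len(b), "events in batch")
```

Each yielded batch would then be posted as its own request, so one oversized push becomes several smaller ones. An event that is too large on its own still needs separate handling, which the sketch surfaces as an exception rather than dropping silently.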