
If JobQueueEventBus fails to send a job, the exception is left uncaught
Closed, Invalid | Public | PRODUCTION ERROR

Description

In https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/497770 the behavior was changed so that if a job fails to send via EventBus, a JobQueueError is thrown and bubbles up uncaught. According to the parent task, this apparently causes other deferred updates to fail.

We need to catch the exception somewhere or revert back to just logging.

@aaron what was the intended place for the JobQueueException to be caught?

Event Timeline

JobQueueException should be thrown from push(), with nothing catching it other than MWExceptionHandler or site-specific callers. Things like RenameUser *depend* on knowing whether something was enqueued or not in order to function correctly. Typically, push() should be used pre-send, before preOutputCommit, so everything would just roll back anyway. Jobs pushed after that are enqueued during DeferrableUpdates (directly or indirectly via lazyPush()); in that case, DeferredUpdates should (already) catch any exceptions (not just job queue ones) and roll back on an update-by-update basis. The exceptions are logged in the DeferredUpdates channel (previously the Exception channel).
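To illustrate the update-by-update handling described above, here is a minimal sketch in Python rather than PHP. It is not MediaWiki's actual code: `run_deferred_updates`, the `rollback` callback, and the stand-in `JobQueueError` class are hypothetical names chosen for the example. It only shows the general pattern of catching any exception per update, rolling back that one update, logging, and continuing with the rest.

```python
import logging

# Hypothetical logger name, mirroring the "DeferredUpdates" channel mentioned above.
logger = logging.getLogger("DeferredUpdates")


class JobQueueError(Exception):
    """Stand-in for the error raised when a job cannot be enqueued."""


def run_deferred_updates(updates, rollback):
    """Run each update in turn; on failure, roll back that update only.

    `updates` is a list of zero-argument callables and `rollback` is a
    callable that receives the failed update. Returns a list of
    (update name, exception) pairs for the updates that failed.
    """
    failures = []
    for update in updates:
        try:
            update()
        except Exception as exc:  # any exception, not just job queue ones
            rollback(update)
            logger.error("Deferred update %s failed: %s", update.__name__, exc)
            failures.append((update.__name__, exc))
    return failures
```

Under this pattern, a failed enqueue in one update is isolated: later updates in the list still run, and the failure is recorded in the log rather than propagating to the caller.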

Are you saying there are exceptions not being caught with DeferredUpdates::run?

@aaron I filed this because of the stack trace on the parent ticket, specifically T225199#5348638. From what you are describing, the current behavior of the EventBusJobQueue is correct.

Tagging Perf-Team for aaron to follow up after he returns.

Per my comment above, this is the expected behavior.

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:05 PM