After the discussion at the developer summit (T149408) it has been decided to create a Kafka-based backend for the MediaWiki JobQueue.
Simple overview of the JobQueue system.
The JobSpecification is the interface for jobs. A job is a unit of work sent by MediaWiki for later background processing. Each job has a type and parameters. Jobs are posted to the queue using the JobQueueGroup class, which maintains a mapping between job types and individual queues; each job type has its own queue. The job type, parameters and other info are then serialised and sent to Redis.
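A minimal sketch of the producer side of the current system, written in Python purely for illustration (the real code is PHP); the class name mirrors JobQueueGroup, while the Redis key scheme shown here is an assumption for the example, not the actual one:

```python
import json

import redis


class JobQueueGroup:
    """Maps a job type to its own per-type queue, as described above."""

    def __init__(self, client: redis.Redis):
        self.client = client

    def push(self, job_type: str, params: dict) -> None:
        # Serialise the job type and parameters, then send them to Redis.
        payload = json.dumps({"type": job_type, "params": params})
        self.client.lpush(f"mediawiki:jobqueue:{job_type}", payload)


queue_group = JobQueueGroup(redis.Redis())
queue_group.push("refreshLinks", {"page_id": 12345})
```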
On the consumer side there are JobRunners: they pop jobs off the queue, deserialise them and execute them.
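The consumer side, sketched under the same illustrative assumptions (Python instead of PHP, assumed key names):

```python
import json

import redis


def run_jobs(client: redis.Redis, job_type: str, handler) -> None:
    """JobRunner-style loop: pop, deserialise, execute."""
    while True:
        # BRPOP blocks until a job is available on the per-type queue.
        _, payload = client.brpop(f"mediawiki:jobqueue:{job_type}")
        job = json.loads(payload)
        handler(job["params"])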
There are plenty of other features built on top of this: individual job de-duplication, de-duplication based on the root job, and delayed execution. The new system should match all of these features.
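For context, individual-job de-duplication can be thought of as hashing a job's type and parameters so that identical jobs collapse into one; the hashing scheme below is an assumption for illustration, not the exact MediaWiki implementation:

```python
import hashlib
import json


def dedup_key(job_type: str, params: dict) -> str:
    """Jobs with the same type and parameters map to the same key."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha1(f"{job_type}:{canonical}".encode()).hexdigest()
```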
Outline of the new solution
In order to utilise Kafka as a medium for job processing, we need to implement the producer-side API of the JobQueue that would serialise a Job into JSON conforming to a schema and send it to EventBus.
Event-Platform Value Stream will post the events to a Kafka topic. There will be a topic per job type, so that different event types do not interfere.
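A sketch of what the producer side could look like, again in Python for illustration only; the EventBus URL, the event structure and the "mediawiki.job.<type>" topic naming convention are assumptions, not confirmed names:

```python
import json
import urllib.request

EVENTBUS_URL = "http://localhost:8085/v1/events"  # hypothetical endpoint


def enqueue_via_eventbus(job_type: str, params: dict) -> None:
    event = {
        "meta": {
            # One Kafka topic per job type, so job types do not interfere.
            "topic": f"mediawiki.job.{job_type}",
            "domain": "en.wikipedia.org",
        },
        "type": job_type,
        "params": params,
    }
    request = urllib.request.Request(
        EVENTBUS_URL,
        data=json.dumps([event]).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(request)
```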
ChangeProp would pick up the events from those topics, do all the de-duplication and management work, and guarantee delivery using Kafka commits. For the sake of backwards compatibility, the final destination would be the existing JobRunner.
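ChangeProp itself is a Node.js service; the Python sketch below only illustrates the intended flow: consume a per-type topic, post each job to the JobRunner, and commit the Kafka offset only after successful execution. The broker address and JobRunner URL are assumptions.

```python
import json
import urllib.error
import urllib.request

from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "mediawiki.job.refreshLinks",
    bootstrap_servers="localhost:9092",
    group_id="change-prop",
    enable_auto_commit=False,  # commit manually, only after the job succeeds
)

for message in consumer:
    job = json.loads(message.value)
    request = urllib.request.Request(
        "http://jobrunner.example/run.php",  # hypothetical JobRunner host
        data=json.dumps(job).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        urllib.request.urlopen(request)
        consumer.commit()  # success: advance the committed offset
    except urllib.error.HTTPError:
        # Non-2xx status: do not commit, so the job is re-delivered after a
        # restart or rebalance. A real implementation would retry explicitly.
        pass
```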
The JobRunner would be converted to expose a run.php endpoint, where the jobs would be posted, JSON-deserialised and executed. HTTP status codes would be used to communicate the success or failure of job execution, so that ChangeProp can manage retries.
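The real endpoint would be PHP; this minimal Flask sketch only illustrates the assumed contract: deserialise the posted JSON job, execute it, and signal the outcome through the HTTP status code so ChangeProp can decide whether to retry.

```python
from flask import Flask, request

app = Flask(__name__)


def execute_job(job_type: str, params: dict) -> None:
    """Stand-in for the real MediaWiki job execution."""
    ...


@app.post("/run.php")
def run_job():
    job = request.get_json()
    try:
        execute_job(job["type"], job["params"])
        return "", 200  # success: ChangeProp commits the offset
    except Exception:
        return "", 500  # failure: ChangeProp keeps the job for retry
```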
The Wikimedia Developer Summit presentation regarding the proposal is available here.
This would be an umbrella task for all the things that need to happen before this is done.