**Current state:** The "dispatch" process for Wikidata / Wikibase is kicked off from a cron job and maintenance script.
**Problem**
Wikibase users (including Wikibase developers) do not want to have to run extra maintenance scripts.
WMF SREs do not want to manage extra cron jobs (they cause complexity during cross data centre work).
**Docs**
[The Repository dispatching (script, db tables, jobs)](https://doc.wikimedia.org/Wikibase/master/php/md_docs_topics_change-propagation.html)
[The Client receiving its notification event / job](https://wmde.github.io/wikidata-wikibase-architecture/systems/WikibaseClient/06-Runtime_View.html#entity-change-notifications)
**Rough Idea:**
- Every edit schedules a (delayed) DispatchTriggerJob that is totally generic.
- The job holds no info at all, so all DispatchTriggerJobs are the same. This means that new jobs get ignored if there is already an older job waiting for execution.
- We may want to consider having a configurable way to schedule fewer of these than 1 per edit; 1 per 100 on Wikidata production, for example, would likely be just fine. [Examples in core](https://codesearch.wmcloud.org/core/?q=mt_rand%7ChasExpiredRow&i=nope&files=changes%7Cjournal%7Cwatcheditem%7CUserGroupMan&excludeFiles=&repos=)
- DispatchTriggerJob looks for wikis that meet our dispatch criteria (max interval etc., [using the wb_changes_dispatch table](https://doc.wikimedia.org/Wikibase/master/php/md_docs_sql_wb_changes_dispatch.html)) and that are not locked, scheduling 1 DispatchClientJob per wiki.
- DispatchClientJob would perform a "pass" for the wiki, as the existing maintenance script does, then unlock the client wiki.
- Everything from this point on would remain the same.
For sites that only have a single client site (such as a local client setup) we could consider directly scheduling DispatchClientJobs, skipping the indirection of the DispatchTriggerJob.
Alternatively, DispatchTriggerJob could do the dispatching itself:
- (option a) DispatchTriggerJob would poll the changes table, as we do now, and dispatch any pending changes to the most lagged wiki(s). This means that passes for long tail wikis will often end up doing nothing. If "doing nothing" is quick enough, we could simply go and look at the next wiki, as in the current maintenance script, until some minimum number of changes has been processed or some maximum time has been exceeded.
- (option b) DispatchTriggerJob would take the next batch of changes and send notifications for all of them to the interested wikis. That means that each pass has to (potentially) push to all wikis, which may take quite long.
- If there are still pending jobs or wikis to service, DispatchTriggerJob schedules another (delayed?) DispatchTriggerJob before it exits. How many new triggers should be scheduled? We need to avoid starvation, but also prevent explosive growth of the number of trigger jobs.
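The trigger / client job split from the rough idea could look roughly like the following Python sketch. It is only an illustration, not Wikibase code: `ClientState` is a hypothetical stand-in for a row of `wb_changes_dispatch`, the selection criteria are reduced to "has pending changes and the dispatch interval has elapsed", change IDs are treated as contiguous, and it assumes the trigger job is the one taking the lock when it schedules a client job (the task does not say who locks).

```python
from dataclasses import dataclass


@dataclass
class ClientState:
    """Hypothetical stand-in for one client wiki's row in wb_changes_dispatch."""
    site_id: str
    last_seen_change_id: int      # newest change already dispatched to this wiki
    seconds_since_touched: float  # time since the last dispatch pass
    locked: bool = False


def select_wikis(states, newest_change_id, dispatch_interval_s):
    """Pick unlocked wikis that have pending changes and are due for a pass."""
    return [
        s for s in states
        if not s.locked
        and s.last_seen_change_id < newest_change_id
        and s.seconds_since_touched >= dispatch_interval_s
    ]


def run_trigger_job(states, newest_change_id, dispatch_interval_s, schedule):
    """One DispatchTriggerJob run: lock each due wiki and schedule one client job for it."""
    scheduled = []
    for state in select_wikis(states, newest_change_id, dispatch_interval_s):
        state.locked = True      # assumption: the lock is taken at scheduling time
        schedule(state.site_id)  # e.g. push a DispatchClientJob for this wiki
        scheduled.append(state.site_id)
    return scheduled


def run_client_job(state, newest_change_id, batch_size):
    """One DispatchClientJob pass: advance the wiki's position, then unlock it."""
    try:
        state.last_seen_change_id = min(
            state.last_seen_change_id + batch_size, newest_change_id
        )
        state.seconds_since_touched = 0.0
    finally:
        state.locked = False     # always release the client wiki
```

In this sketch a wiki that is locked, up to date, or not yet due is simply skipped, so a trigger run for long tail wikis stays cheap.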
**Acceptance criteria🏕️🌟:**
[] No maintenance script / cron job needs to be run for the dispatching process to work
[] De-duplication should be used, where possible and needed, to avoid creating too many unnecessary jobs
[] Documentation should be updated (in Wikibase.git & architecture docs)
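As an illustration of the de-duplication criterion, here is a minimal Python sketch of how parameterless trigger jobs collapse into one pending job, combined with scheduling only about 1 trigger per N edits. `JobQueue`, `on_edit` and `trigger_rate` are hypothetical names for this sketch; the toy queue only stands in for whatever duplicate removal the real job queue provides.

```python
import random


class JobQueue:
    """Toy in-memory queue that ignores a job if an identical one is already waiting."""

    def __init__(self):
        self._pending = {}

    def push(self, job_type, params=None):
        # Jobs with the same type and parameters share a key, so a generic,
        # parameterless job can be pending at most once.
        key = (job_type, tuple(sorted((params or {}).items())))
        if key in self._pending:
            return False
        self._pending[key] = dict(params or {})
        return True

    def pending_count(self):
        return len(self._pending)


def on_edit(queue, trigger_rate=100, rng=random):
    """Schedule a trigger job for roughly 1 in `trigger_rate` edits."""
    if rng.randint(1, trigger_rate) == 1:
        return queue.push("DispatchTrigger")
    return False
```

With `trigger_rate=1` every edit tries to schedule a trigger, yet the queue still holds at most one pending trigger job at a time.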
**Notes:**
When this task is tackled it should be kept in mind that some refactoring will likely make sense, such as {T256208} (but this is also tracked and prioritized separately).
This should be gradually deployed, and this could possibly be done in a couple of different ways:
- Per environment: beta, test, production
- Per client wiki (or group of wikis) within each environment: group1, group2, (everything except enwiki), enwiki, commonswiki
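One way to picture such a gradual rollout is a per-environment allow-list of client wikis that already use job-based dispatching. The Python sketch below is purely hypothetical: the `ROLLOUT` structure, the `"*"` wildcard and the function name are assumptions for illustration and do not correspond to existing Wikibase settings.

```python
# Hypothetical rollout state: environment -> client wikis dispatched via jobs.
# "*" means every client wiki in that environment.
ROLLOUT = {
    "beta": {"*"},
    "test": {"*"},
    "production": {"commonswiki"},  # enwiki not switched over yet
}


def uses_job_dispatch(environment, client_wiki, rollout=ROLLOUT):
    """Return True if changes for this client wiki should be dispatched via jobs."""
    enabled = rollout.get(environment, set())
    return "*" in enabled or client_wiki in enabled
```

Wikis that are not listed would keep being served by the existing cron job and maintenance script until they are switched over.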
Overall performance of these jobs will be dictated by the job queue processing, which is controlled by WMF SREs and service ops?
We have a general performance requirement of "The dispatching process for Wikidata should not be slower than it currently is"
The code to be deployed from this ticket likely won't have a big impact on performance, though the configuration of job processing may, and this would need to be figured out with serviceops.
In Wikidata production these cron jobs can be seen at https://github.com/wikimedia/puppet/blob/e1e13a59de3021afaa43c31745abbe348a93017d/modules/profile/manifests/mediawiki/maintenance/wikidata.pp