Of all the job types that run in production, we need to select candidates to be the first transferred to the new #eventbus infrastructure. Requirements:
- Low volume
- Idempotence - the job would initially be double-processed by old and new infra, so doing it twice shouldn't cause any trouble
- Preferably low importance - if something goes wrong it should be either easily fixable or possible to ignore
- As simple as possible - no delayed execution, no root/leaf job splitting, no recursion, and no reliance on deduplication.
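
As a rough illustration of how these requirements combine, here is a minimal sketch that filters job descriptions. The `JobType` fields and the `is_candidate` helper are hypothetical names made up for this example, and the sample flags only restate notes from the list below; any flag left unset is a conservative default, not a claim about the real job. The "preferably low importance" criterion is a soft preference and is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobType:
    """Hypothetical description of a job type (not a MediaWiki class)."""
    name: str
    low_volume: bool = False        # unset means "not confirmed low volume"
    idempotent: bool = False        # unset means "not confirmed idempotent"
    delayed: bool = False           # uses delayed execution
    recursive: bool = False         # enqueues further jobs of its own type
    splits_root_leaf: bool = False  # uses root/leaf job splitting
    needs_dedup: bool = False       # relies on deduplication


def is_candidate(job: JobType) -> bool:
    """Apply the hard requirements: low volume, idempotent, and simple."""
    is_simple = not (
        job.delayed or job.recursive or job.splits_root_leaf or job.needs_dedup
    )
    return job.low_volume and job.idempotent and is_simple


# Flags taken from the notes in this task; everything else is left at defaults.
jobs = [
    JobType("deleteLinks", low_volume=True, idempotent=True),
    JobType("enotifNotify", idempotent=False),
    JobType("cdnPurge", delayed=True),
    JobType("htmlCacheUpdate", recursive=True),
]

candidates = [job.name for job in jobs if is_candidate(job)]
print(candidates)
```

Defaulting unknown properties to "fails the check" keeps a job off the shortlist until someone has actually confirmed it is safe to double-process.
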
For reference, here's the list of job types currently executed in production, with some notes (the full list is available as P5964).
I've looked through the following jobs (struck-through jobs have been moved):
- [[ https://github.com/wikimedia/mediawiki/blob/3588c0ac81701f617984178bb220eaad792982c3/includes/jobqueue/jobs/AssembleUploadChunksJob.php | AssembleUploadChunks ]] - not idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-BounceHandler/blob/faa9c0e4c5c42a9cce9e892af36e080a6652f082/includes/BounceHandlerJob.php | BounceHandlerJob ]] - not idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-BounceHandler/blob/840d622005fc4855d733b7b26c42d0a8100dc58e/includes/BounceHandlerNotificationJob.php | BounceHandlerNotificationJob ]] - not idempotent
- [[ https://github.com/wikimedia/mediawiki/blob/3f1a52805e3cf801eda0357ee236de6b49a31c85/includes/jobqueue/jobs/CategoryMembershipChangeJob.php | categoryMembershipChange ]] - not idempotent
- ~~[[ https://github.com/wikimedia/mediawiki/blob/3f1a52805e3cf801eda0357ee236de6b49a31c85/includes/jobqueue/jobs/CdnPurgeJob.php | cdnPurge ]] - uses delayed execution~~
- [[ https://github.com/wikimedia/mediawiki-extensions-CentralAuth/blob/master/includes/CentralAuthUtils.php | CentralAuthCreateLocalAccountJob ]] - not idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-Wikidata/blob/1d1cc0ab6953bf44018179f2762d629293efbb34/extensions/Wikibase/client/includes/ChangeNotificationJob.php | ChangeNotification ]] - too high rate
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/CheckerJob.php | cirrusSearchCheckerJob ]] - basically idempotent. It verifies that the data in Elasticsearch matches MySQL and creates new jobs if they don't match. Uses delayed execution, which makes it tricky: a cron script schedules bulk jobs, each with a set of pageIds, and uses delays of 1, 2, 3, 4... to scatter the jobs in time. This is really an abuse of the delayed job functionality; what it needs is a job scheduler that can insert jobs in the future.
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/DeleteArchive.php | cirrusSearchDeleteArchive ]] - idempotent - checks database to verify archive indexing is still appropriate when run.
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/DeletePages.php | cirrusSearchDeletePages ]] - idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/ElasticaWrite.php | cirrusSearchElasticaWrite ]] - idempotent. Issued to retry failed write requests to Elasticsearch. Uses delayed execution.
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/IncomingLinkCount.php | cirrusSearchIncomingLinkCount ]] - idempotent, expensive, high volume of duplicates
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/LinksUpdate.php | cirrusSearchLinksUpdate ]] - idempotent, expensive
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/LinksUpdate.php | cirrusSearchLinksUpdatePrioritized ]] - idempotent, expensive
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/MassIndex.php | cirrusSearchMassIndex ]] - idempotent, expensive, low volume
- [[ https://github.com/wikimedia/mediawiki-extensions-CirrusSearch/blob/master/includes/Job/OtherIndex.php | cirrusSearchOtherIndex ]] - can't use versioning, so out-of-order updates could be problematic
- [[ https://github.com/wikimedia/mediawiki-extensions-Cognate/blob/033830c5445f415dc2d344deafaa4c7a1730eb3e/src/CacheUpdateJob.php | CognateCacheUpdateJob ]] - basically a wrapper over HTMLCacheUpdateJob
- [[ https://github.com/wikimedia/mediawiki-extensions-Cognate/blob/033830c5445f415dc2d344deafaa4c7a1730eb3e/src/LocalJobSubmitJob.php | CognateLocalJobSubmitJob ]] - basically submits a job to a bunch of other sites
- [[ https://github.com/wikimedia/mediawiki-extensions-WikibaseQualityConstraints/blob/331d8ddfffd0de9def5354777ef14ebfa3d0c2ea/includes/UpdateConstraintsTableJob.php | constraintsTableUpdate ]] - some Wikidata job, not clear
- ~~[[ https://github.com/wikimedia/mediawiki/blob/aeedfb8526e9d221553e430437a7572a6da2ba65/includes/jobqueue/jobs/DeleteLinksJob.php | deleteLinks ]] - a very good candidate, low volume (<1/s), idempotent ❤️~~
- [[ https://github.com/wikimedia/mediawiki-extensions-Echo/blob/master/includes/jobs/NotificationDeleteJob.php | EchoNotificationDeleteJob ]] - probably idempotent, as it just reduces the number of notifications to a specified maximum, but it does unfold when it contains more than one userId
- [[ https://github.com/wikimedia/mediawiki-extensions-Echo/blob/95f83de22584a0c401359da9f0e5b320ca937f53/includes/jobs/NotificationJob.php | EchoNotificationJob ]] - not idempotent
- [[ https://github.com/wikimedia/mediawiki/blob/aeedfb8526e9d221553e430437a7572a6da2ba65/includes/jobqueue/jobs/EnotifNotifyJob.php | enotifNotify ]] - sends emails, definitely not idempotent
- ~~[[ https://github.com/wikimedia/mediawiki/blob/aeedfb8526e9d221553e430437a7572a6da2ba65/includes/jobqueue/jobs/EnqueueJob.php | enqueue ]] - enqueues other jobs, pretty important to begin with (removed by {T181216})~~
- ~~[[ https://github.com/wikimedia/mediawiki-extensions-FlaggedRevs/blob/48ddede699c91507ed2701f4e1b4a4c3ac847982/backend/FRExtraCacheUpdateJob.php | flaggedrevs_CacheUpdate ]] - idempotent, low volume, ❤️~~
- [[ https://github.com/wikimedia/mediawiki-extensions-GlobalUsage/blob/347daf2025299b31d30cee1adb42cc22f61422e2/includes/GlobalUsageCachePurgeJob.php | globalUsageCachePurge ]] - inserts HTMLCacheUpdate jobs for local wikis
- [[ https://github.com/wikimedia/mediawiki-extensions-GlobalUserPage/blob/0319dbb1322412a6775028c908f93299da7746e9/GlobalUserPageLocalJobSubmitJob.php | GlobalUserPageLocalJobSubmitJob ]] - just submits other jobs
- [[ https://github.com/wikimedia/mediawiki-extensions-GWToolset/blob/47163eb29fa7f00183a5ac6c11c06a1d573235a3/includes/Jobs/GWTFileBackendCleanupJob.php | gwtoolsetGWTFileBackendCleanupJob ]]
- [[ https://github.com/wikimedia/mediawiki-extensions-GWToolset/blob/5f72080659755914f0939098d450527c5766246e/includes/Jobs/UploadMetadataJob.php | gwtoolsetUploadMediafileJob ]]
- [[ https://github.com/wikimedia/mediawiki-extensions-GWToolset/blob/5f72080659755914f0939098d450527c5766246e/includes/Jobs/UploadMetadataJob.php | gwtoolsetUploadMetadataJob ]]
- ~~[[ https://github.com/wikimedia/mediawiki/blob/78ea2b7a4f40031b98356723c9f37cd6599fe454/includes/jobqueue/jobs/HTMLCacheUpdateJob.php | htmlCacheUpdate ]] - recursive~~
- [[ https://github.com/wikimedia/mediawiki-extensions-GlobalUserPage/blob/master/LocalGlobalUserPageCacheUpdateJob.php | LocalGlobalUserPageCacheUpdateJob ]] - idempotent, but enqueues other jobs and HTMLCacheUpdateJob
- [[ https://github.com/wikimedia/mediawiki-extensions-CentralAuth/blob/6abbbab31dfc8df146338b4ffcbd9b41d5788333/includes/LocalRenameJob/LocalPageMoveJob.php | LocalPageMoveJob ]] - not idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-CentralAuth/blob/dbdd4055de463ca52dddd6994999644cf823a146/includes/LocalRenameJob/LocalRenameUserJob.php | LocalRenameUserJob ]] - not idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-LoginNotify/blob/b29ecb79531e9d01b66cff6f7d61cd741262ae8b/includes/DeferredChecksJob.php | LoginNotifyChecks ]] - didn't quite understand what it does.
- [[ https://github.com/wikimedia/mediawiki-extensions-MassMessage/blob/master/includes/job/MassMessageJob.php | MassMessageJob ]] - sends a message to the user, obviously not idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-MassMessage/blob/master/includes/job/MassMessageSubmitJob.php | MassMessageSubmitJob ]] - enqueues other jobs
- [[ https://github.com/wikimedia/mediawiki-extensions-Translate/blob/7bef0ed3ec963baf391bece71f51c11afe0160a3/utils/MessageGroupStatesUpdaterJob.php | MessageGroupStatesUpdaterJob ]] -
- ~~[[ https://github.com/wikimedia/mediawiki-extensions-Translate/blob/7bef0ed3ec963baf391bece71f51c11afe0160a3/utils/MessageIndexRebuildJob.php | MessageIndexRebuildJob ]] - rebuilds some indexes, so should be idempotent. ❤️~~
- [[ https://github.com/wikimedia/mediawiki-extensions-ORES/blob/0a76c7dd0c1c008e436d91f307a7a98e4e4d6d26/includes/FetchScoreJob.php | ORESFetchScoreJob ]] - should not be duplicated, as it uses ORES
- [[ https://github.com/wikimedia/mediawiki/blob/5300df4838f68437c17d5d697de57a46f0b5e02c/includes/jobqueue/jobs/PublishStashedFileJob.php | PublishStashedFile ]] - uploads files, shouldn't be duplicated
- [[ https://github.com/wikimedia/mediawiki/blob/33eabfd8d6a738ae3ed13e3f52c0bbd7664e581a/includes/jobqueue/jobs/RecentChangesUpdateJob.php | recentChangesUpdate ]] - too much traffic in this one
- ~~[[ https://github.com/wikimedia/mediawiki-extensions-Linter/blob/ee2f0efdcf6b3f1a74a8f1b84c864cb5fbc163e3/includes/RecordLintJob.php | RecordLintJob ]] - stores lint errors in DB, shouldn't be duplicated.~~
- ~~[[ https://github.com/wikimedia/mediawiki/blob/70d1bc00919efb1cbfd00e85bbf65b8e947cbdb6/includes/jobqueue/jobs/RefreshLinksJob.php | refreshLinks ]] - too much traffic~~
- ~~[[ https://github.com/wikimedia/mediawiki/blob/70d1bc00919efb1cbfd00e85bbf65b8e947cbdb6/includes/jobqueue/jobs/RefreshLinksJob.php | refreshLinksPrioritized ]] - same as previous~~
- [[ https://github.com/wikimedia/mediawiki-extensions-Renameuser/blob/master/RenameUserJob.php | renameUser ]] - renames a user, obviously not idempotent
- [[ https://github.com/wikimedia/mediawiki/blob/34f0289491dcbc418bcd910a7dcdb79bbf4a87c5/includes/jobqueue/jobs/ThumbnailRenderJob.php | ThumbnailRender ]] - renders thumbnails, kinda idempotent, but duplicating will severely increase the load
- [[ https://github.com/wikimedia/mediawiki-extensions-Translate/blob/7bef0ed3ec963baf391bece71f51c11afe0160a3/tag/TranslatablePageMoveJob.php | TranslatablePageMoveJob ]] - moves pages, obviously not idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-Translate/blob/7bef0ed3ec963baf391bece71f51c11afe0160a3/tag/TranslateDeleteJob.php | TranslateDeleteJob ]] - deletes stuff. Kinda idempotent
- [[ https://github.com/wikimedia/mediawiki-extensions-Translate/blob/7bef0ed3ec963baf391bece71f51c11afe0160a3/tag/TranslateRenderJob.php | TranslateRenderJob ]] - Job for updating translation pages when translation or template changes.
- [[ https://github.com/wikimedia/mediawiki-extensions-Translate/blob/7bef0ed3ec963baf391bece71f51c11afe0160a3/tag/TranslationsUpdateJob.php | TranslationsUpdateJob ]] - Job for updating translation units and translation pages when a translatable page is marked for translation.
- [[ https://github.com/wikimedia/mediawiki-extensions-Translate/blob/master/ttmserver/TTMServerMessageUpdateJob.php | TTMServerMessageUpdateJob ]] - This one retries itself and disables the JobQueue retry service. **TODO** Need to add support for this possibility.
- ~~[[ https://github.com/wikimedia/mediawiki-extensions-BetaFeatures/blob/6e7955f98a3d0515956152b42a7818df95cc7241/includes/UpdateBetaFeatureUserCountsJob.php | updateBetaFeaturesUserCounts ]] - just updates the user count, idempotent, low volume, ❤️~~
- [[ https://github.com/wikimedia/mediawiki-extensions-Wikidata/blob/master/extensions/Wikibase/client/includes/UpdateRepo/UpdateRepoOnDelete.php | UpdateRepoOnDelete ]] - Provides logic to update the repo after page deletes in the client.
- [[ https://github.com/wikimedia/mediawiki-extensions-Wikidata/blob/master/extensions/Wikibase/client/includes/UpdateRepo/UpdateRepoOnMove.php | UpdateRepoOnMove ]] - Provides logic to update the repo after page moves in the client.
- [[ https://github.com/wikimedia/mediawiki-extensions-TimedMediaHandler/blob/master/WebVideoTranscode/WebVideoTranscodeJob.php | webVideoTranscode ]] -
- [[ https://github.com/wikimedia/mediawiki-extensions-TimedMediaHandler/blob/master/WebVideoTranscode/WebVideoTranscodeJob.php | webVideoTranscodePrioritized ]] - shouldn't be duplicated, as it generates a lot of load
- ~~[[ https://github.com/wikimedia/mediawiki-extensions-Wikibase/blob/22c655666a651864674826be72c4f697d9281e6c/client/includes/Store/AddUsagesForPageJob.php | wikibase-addUsagesForPage ]] - not sure what it does~~
- [[ https://github.com/wikimedia/mediawiki-extensions-Wikibase/blob/15b01de658532686c03473b8d4dfd8b32aa1318a/client/includes/Changes/InjectRCRecordsJob.php | wikibase-InjectRCRecords ]] - not idempotent
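
Many of the notes above hinge on idempotence, so here is a minimal sketch of the distinction. It uses hypothetical in-memory stand-ins, not real MediaWiki code: trimming a notification list to a maximum (the behaviour noted for EchoNotificationDeleteJob) gives the same result when processed twice, while sending a message (as with MassMessageJob or enotifNotify) does not. Which entries survive the trim is an assumption made only for the example.

```python
def trim_notifications(notifications, maximum):
    """Keep at most `maximum` entries (assumed here: the most recent ones)."""
    if maximum <= 0:
        return []
    return notifications[-maximum:]


def send_message(outbox, message):
    """Append a message to the outbox; every call delivers one more copy."""
    return outbox + [message]


# Idempotent: double-processing by old and new infra leaves the same state.
inbox = ["n1", "n2", "n3", "n4", "n5"]
trimmed_once = trim_notifications(inbox, 3)
trimmed_twice = trim_notifications(trimmed_once, 3)
print(trimmed_once == trimmed_twice)

# Not idempotent: double-processing delivers the message twice.
sent_once = send_message([], "hello")
sent_twice = send_message(sent_once, "hello")
print(len(sent_once), len(sent_twice))
```

A job shaped like the first function is safe to run on both the old and the new infrastructure during the transition; one shaped like the second is not.
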