
api-testing: avoid processing the entire job queue when changing user rights
Open, Needs Triage · Public

Description

In T389863#10671903, I investigated a test that timed out because api-testing runs the entire job queue when changing user rights. Processing the queue takes a while because Flow enqueues a bunch of jobs (T389894) and CirrusSearch keeps re-enqueuing them because they fail (T389895). Quoting what I wrote over there:

Reading the linked task, T230211, it actually does explain why the call was added. However, that task is mostly focused on replicated environments, whereas CI (like most developer setups) does not use replication. On top of that, it's not fully clear to me what the relationship is between waiting for replication and emptying the job queue. Is it "process the job queue just to pass some time, and eventually replication will have caught up"? I assume it must be something like that, because the account creation + user rights change sequence does not depend on or fire any jobs, as far as I can tell. If that is the case, it would be nice if api-testing had a config flag to specify whether it should wait for replication, and we could set it to false in CI.

Given that:

  • There has been no progress on T230211 since 2019
  • Most developers probably have environments without replication
  • CI itself does not use replication
  • It's unclear what the relationship is between emptying the job queue and changing user rights
  • Emptying the job queue does more harm than good because it can lead to timeouts

Could we either drop the runAllJobs call entirely, or gate it behind a config setting for api-testing (e.g., replicated_environment: true)?
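To make the second option concrete, here is a rough sketch of what such a gate could look like. It is not taken from the api-testing code: the ApiTestingConfig type and the shouldRunAllJobs and afterUserRightsChange helpers are hypothetical names, and runAllJobs is passed in as a callback so the sketch makes no claim about where the library defines it. It also assumes that an unset flag means "no replication"; defaulting to true instead would preserve the current behavior.

```typescript
// Hypothetical config shape. Only the `replicated_environment` key comes
// from the proposal in this task; the type itself is an assumption.
interface ApiTestingConfig {
  replicated_environment?: boolean;
}

// Decide whether the job queue should be drained after a rights change.
// Assumption: an unset flag is treated as "no replication" (opt-in).
function shouldRunAllJobs(config: ApiTestingConfig): boolean {
  return config.replicated_environment === true;
}

// Hypothetical wrapper around the existing call. `runAllJobs` is injected
// rather than imported, so this does not depend on the library's layout.
// Resolves to true if the queue was processed, false if it was skipped.
async function afterUserRightsChange(
  config: ApiTestingConfig,
  runAllJobs: () => Promise<void>
): Promise<boolean> {
  if (!shouldRunAllJobs(config)) {
    return false; // non-replicated setup (e.g. CI): skip the job queue
  }
  await runAllJobs();
  return true;
}
```

With this shape, CI would leave the flag unset (or set it to false) and never touch the job queue during a rights change, while replicated setups could opt in.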

(In passing: noting that the api-testing library has no active phab project. The page on mw.org links to Core Platform Team Initiatives (API Integration Tests), which still has associated tasks despite having been archived in 2023. Tagging MediaWiki-Engineering in the hope of reaching the right people.)