
Cognate does some updates synchronously, and others via JobQueue. That may lead to inconsistencies in the DB
Open, Medium, Public

Event Timeline

Krinkle renamed this task from Cognate does some updates synchronously, and others via the JQ. That may lead to inconsistencies in the DB to Cognate does some updates synchronously, and others via JobQueue. That may lead to inconsistencies in the DB. Jul 6 2017, 8:40 PM

The order of execution in the job queue is inherently undefined - so even if all updates were done via jobs, we may run into inconsistencies.

In general, MediaWiki follows the approach of eventual consistency. Because of that, I don't see the merit of this ticket: the fact that the database may be inconsistent while update jobs are in limbo is not a bug, it's a "feature" (or rather, a deliberate trade-off). I also don't think this can be fixed within the framework that MediaWiki provides.

If we have concrete problems with specific inconsistencies, we can try to mitigate the effect in that specific case. I don't think there is a general solution for this.

Firstly, looking at the hooks it seems that only actions taken on page deletion have the chance of ending up in the job queue.

As for the inconsistency, if a page does get into an inconsistent state it can be quite hard to rectify without poking a maint script or the DB.

An example situation could be...

  1. Page A is created, and synchronously added to the db tables
  2. Page A is deleted and a job is added to remove it from the db tables
  3. Page A is un-deleted and synchronously added to the db tables (ignored, as the entry is already there)
  4. Job from step 2 executes and the entry is removed from the db tables.

This would result in a page missing language links and not being linked to.
As entries are only ever added to or removed from the db on move, delete or create, Page A in the above case would never have a chance to be re-added to the db, and the db would never reach consistency again.
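
The four steps above can be replayed in a small toy model. This is only a sketch: `ToyCognateStore`, its method names and the in-memory set are made up for illustration and are not Cognate's or MediaWiki's actual code. It assumes creation and un-deletion write synchronously while deletion is deferred to a queue.

```python
from collections import deque


class ToyCognateStore:
    """Hypothetical stand-in for the Cognate tables plus a job queue."""

    def __init__(self):
        self.pages = set()   # stands in for the db tables
        self.jobs = deque()  # stands in for the job queue

    def on_create(self, title):
        # Synchronous insert; a no-op if the entry is already there.
        self.pages.add(title)

    def on_delete(self, title):
        # Deferred: the removal only happens when the job runs.
        self.jobs.append(("remove", title))

    def run_jobs(self):
        while self.jobs:
            action, title = self.jobs.popleft()
            if action == "remove":
                self.pages.discard(title)


def replay_race():
    store = ToyCognateStore()
    store.on_create("Page A")  # step 1: created, added synchronously
    store.on_delete("Page A")  # step 2: deleted, removal job queued
    store.on_create("Page A")  # step 3: un-deleted, insert is ignored
    store.run_jobs()           # step 4: stale removal job executes
    return store.pages
```

In this model `replay_race()` returns an empty set: the page exists again, but its entry is gone.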

Possible options

1 ---- Use jobs for everything and deduplicate

I believe the job queue lets you de-duplicate jobs based on a set of criteria.
By doing this we could always drop previous pending jobs for a given page and only execute the final / latest db query.
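
A minimal sketch of that idea, using a toy queue rather than MediaWiki's JobQueue API. `LatestWinsQueue` and the `"add"` / `"remove"` action names are hypothetical. It also assumes we know which job for a page was enqueued last, even though execution order across pages is not guaranteed.

```python
class LatestWinsQueue:
    """Toy queue that keeps at most one pending job per page title."""

    def __init__(self):
        self._pending = {}  # title -> latest pending action

    def push(self, title, action):
        # Drop any older pending job for this page, then record the new one.
        self._pending.pop(title, None)
        self._pending[title] = action

    def run(self, table):
        # Apply the single surviving job for each page to the table.
        for title, action in self._pending.items():
            if action == "add":
                table.add(title)
            elif action == "remove":
                table.discard(title)
        self._pending.clear()
        return table


def replay_with_dedup():
    queue = LatestWinsQueue()
    queue.push("Page A", "add")     # create
    queue.push("Page A", "remove")  # delete, replaces the pending add
    queue.push("Page A", "add")     # un-delete, replaces the pending remove
    return queue.run(set())
```

Here only the final "add" survives, so the table ends up containing Page A regardless of when the queue is drained.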

2 ---- Synchronously do everything

This would probably not add any noticeable execution time for users.

3 ---- Also update the db on page edit

This would mean more calls to the db, but in cases as described above the db would automatically get back to a consistent state.
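
As a rough sketch of why this self-heals: if the edit handler re-syncs the entry idempotently, any later edit repairs a missing row. The function names and the `page_exists` flag below are assumptions for illustration, not Cognate hooks.

```python
def on_page_edit(table, title, page_exists=True):
    """Hypothetical edit handler: make the table match the page's state."""
    if page_exists:
        table.add(title)      # no-op if the entry is already present
    else:
        table.discard(title)  # no-op if the entry is already absent
    return table


def heal_after_race():
    table = set()  # state left behind by the race: Page A is missing
    on_page_edit(table, "Page A")  # the next edit re-adds the entry
    return table
```

The cost is one extra write attempt per edit, which is the trade-off mentioned above.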

Addshore triaged this task as Medium priority. Aug 31 2017, 10:45 AM