
Updater misses updates when two updates happen very close to each other
Closed, Resolved · Public

Description

The updater still misses some updates when two updates are made very close to each other, especially by a script. This is probably because some servers see only one of the updates in the RC stream, and when they load the data it may still be the old version. Caching may be the cause: the page is cached on the first update, and the second request gets the same cached version. We may want to add a check that we do not fetch a revision older than the one the update indicates.

Example:

This update:
https://www.wikidata.org/w/index.php?title=Q23753838&diff=499932595&oldid=499932539

is missing on wdq23, whose last revision is 499932539.
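
A minimal sketch of the check suggested in the description, in Python for illustration only (this is not the updater's actual code). It assumes a hypothetical `fetch` callable that returns a `(revision, data)` pair for an entity; the names `fetch_at_least`, `StaleRevisionError`, `retries` and `delay` are made up for this example. The idea is to refetch whenever the returned revision is older than the one named in the change event, and to fail loudly rather than store stale data.

```python
import time
from typing import Callable, Tuple


class StaleRevisionError(Exception):
    """Raised when every fetch attempt returned a revision older than expected."""


def fetch_at_least(
    fetch: Callable[[str], Tuple[int, dict]],  # hypothetical fetcher: id -> (revision, data)
    entity_id: str,
    min_revision: int,
    retries: int = 3,
    delay: float = 0.0,
) -> Tuple[int, dict]:
    """Fetch an entity, refusing any revision older than min_revision."""
    for attempt in range(retries + 1):
        revision, data = fetch(entity_id)
        if revision >= min_revision:
            return revision, data
        # Stale (possibly cached) response: wait and try again.
        if attempt < retries:
            time.sleep(delay)
    raise StaleRevisionError(
        f"{entity_id}: wanted revision >= {min_revision}, last seen {revision}"
    )


# Demo with a fake fetcher that serves the old revision once, then the new one.
responses = iter([(499932539, {"stale": True}), (499932595, {"stale": False})])
revision, data = fetch_at_least(lambda _id: next(responses), "Q23753838", 499932595)
print(revision, data)
```

With the revisions from the example above, the first response (499932539) is rejected and the second (499932595) is accepted. Whether retrying is enough in practice depends on how long the stale copy stays cached, which this sketch does not address.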

Event Timeline

Restricted Application added a subscriber: Aklapper.

For some reason, Tail Poller does not pick up the missing revision. Strange.

Would it make sense to push updates to the query service instead?
Writing to a JMS queue from PHP seems pretty easy in this tutorial.
I see a lot of advantages in using a dedicated Wikibase hook for propagating changes to a subscribing service.

@Jonas see T161731 where it is discussed. TLDR is: yes, it makes sense to have push updates, but it's not trivial and requires some pieces of infrastructure to be there which are not there yet (but will be soon).

Smalyshev changed the task status from Open to Stalled. Nov 1 2017, 12:06 AM
Smalyshev lowered the priority of this task from High to Medium.

Should be solved now with the Kafka poller.