
Wikidata change propagation: introduce --batch-grace parameter to replace --dispatch-interval
Open, LowestPublic

Description

The dispatchChanges script currently has a --dispatch-interval setting which dictates how often a client wiki is allowed to get a batch of changes. The maximum throughput per client wiki is therefore $batch-size per $dispatch-interval, regardless of the number of dispatcher processes running.
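To make that ceiling concrete, here is a rough sketch of the arithmetic. The function names and the example figures (1000 changes per batch, a 60 second interval, a backlog of 100000 changes) are assumptions chosen for illustration; they are not the script's defaults or actual code.

```python
import math


def max_throughput_per_second(batch_size: int, dispatch_interval: float) -> float:
    """Upper bound on changes per second one client wiki can receive."""
    if dispatch_interval <= 0:
        raise ValueError("dispatch_interval must be positive")
    return batch_size / dispatch_interval


def approx_catch_up_seconds(backlog: int, batch_size: int, dispatch_interval: float) -> float:
    """Approximate time to drain a backlog, counting one interval per batch."""
    batches = math.ceil(backlog / batch_size)
    return batches * dispatch_interval


# Hypothetical values: 1000 changes per batch, one batch per 60 seconds.
print(max_throughput_per_second(1000, 60))
print(approx_catch_up_seconds(100_000, 1000, 60))
```

Under these assumed values, draining the backlog takes about 100 intervals no matter how many dispatcher processes are started.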

The intent behind --dispatch-interval is to give batches some time to "fill up": dispatching to the same wiki too often would produce lots of small batches, which increases overhead and makes client-side optimizations (like change coalescing) less effective.

The solution could be a --batch-grace parameter with the following semantics: if fewer changes than a full batch are ready to be dispatched, only send them if no changes have been sent to that client for at least $batchGrace seconds. If the batch is full, always send it.
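The proposed rule could be sketched roughly as follows. This is an illustration of the semantics only, not code from dispatchChanges; the function and parameter names are hypothetical, and treating an empty queue as "do not dispatch" is an added assumption.

```python
def should_dispatch(
    pending_changes: int,
    batch_size: int,
    seconds_since_last_dispatch: float,
    batch_grace: float,
) -> bool:
    """Decide whether to send a batch to a client wiki now."""
    # Assumption: with nothing pending there is nothing to send.
    if pending_changes <= 0:
        return False
    # A full batch is always sent, so a backlog is never throttled.
    if pending_changes >= batch_size:
        return True
    # A partial batch is only sent once the grace period has elapsed.
    return seconds_since_last_dispatch >= batch_grace


# Hypothetical values: batch size 1000, grace period 60 seconds.
print(should_dispatch(1000, 1000, 1, 60))   # full batch
print(should_dispatch(200, 1000, 10, 60))   # partial batch, inside grace period
print(should_dispatch(200, 1000, 90, 60))   # partial batch, grace period elapsed
```

With this rule the grace period only delays partial batches, so throughput during catch-up is bounded by how fast full batches can be processed rather than by a fixed interval.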

This logic would still cover the intent of --dispatch-interval, but would avoid throttling the throughput while trying to catch up on a massive backlog of changes.

Event Timeline

thiemowmde added a subscriber: thiemowmde.

This is not assigned to a parent ticket, and neither is the vaguely linked T179006. Can you please fix this?

hoo removed hoo as the assignee of this task. Oct 3 2018, 10:40 PM
hoo added a subscriber: hoo.

@hoo are you working on this one?

No, I don't think this is important at the moment.

Addshore lowered the priority of this task from Low to Lowest. Oct 4 2018, 7:23 AM
Addshore moved this task from incoming to hold on the Wikidata board.
Addshore removed a project: User-Addshore.