Batches of entities to be consumed by the queryservice updater are marked as done as soon as they are sent to the updater. If the updater fails, times out, falls off the face of the earth, etc., they will not be retried.
We suggest that, after replying to a request from the updater for a batch, the batch is not marked done but is instead marked "pending".
We may then mark pending batches older than 300s as not done, so that they are ready to be sent again.
Finally, we add logic to the updater that marks a batch as "really done" once it has received a positive HTTP response code from the queryservice.
We must also track a "failed count" and mark batches that fail 3 times as permanently failed and stop shipping them, so that they don't block future batches. When a batch is marked permanently failed, we would like an error to be reported using the error reporting system.
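
The proposed lifecycle can be sketched as a small state machine. This is only an illustration, not the actual implementation: the names `BatchStore`, `next_batch`, `mark_done`, `expire_pending` and `report_error` are hypothetical, storage is an in-memory dict rather than a database table, and it assumes that a pending batch which times out counts as one failure.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    NEW = "new"          # not done, ready to be sent
    PENDING = "pending"  # sent to the updater, not yet confirmed
    DONE = "done"        # updater confirmed a positive HTTP response
    FAILED = "failed"    # permanently failed, no longer shipped


PENDING_TIMEOUT_S = 300  # pending batches older than this are retried
MAX_FAILURES = 3         # batches that fail this often are given up on


@dataclass
class Batch:
    id: int
    state: State = State.NEW
    pending_since: Optional[float] = None
    failed_count: int = 0


class BatchStore:
    """Hypothetical in-memory stand-in for wherever batches are stored."""

    def __init__(self, report_error=print):
        self.batches = {}
        # Placeholder for the error reporting system (assumption).
        self.report_error = report_error

    def add(self, batch_id):
        self.batches[batch_id] = Batch(batch_id)

    def next_batch(self, now=None):
        """Reply to the updater: hand out the oldest NEW batch as PENDING."""
        now = time.time() if now is None else now
        self.expire_pending(now)
        for batch in self.batches.values():
            if batch.state is State.NEW:
                batch.state = State.PENDING
                batch.pending_since = now
                return batch
        return None

    def mark_done(self, batch_id):
        """Called by the updater after a positive HTTP response."""
        batch = self.batches[batch_id]
        if batch.state is State.PENDING:
            batch.state = State.DONE

    def expire_pending(self, now):
        """Requeue stale PENDING batches, or fail them permanently."""
        for batch in self.batches.values():
            if batch.state is not State.PENDING:
                continue
            if now - batch.pending_since <= PENDING_TIMEOUT_S:
                continue
            # Assumption: a timed-out pending batch counts as one failure.
            batch.failed_count += 1
            batch.pending_since = None
            if batch.failed_count >= MAX_FAILURES:
                batch.state = State.FAILED
                self.report_error(
                    f"batch {batch.id} permanently failed "
                    f"after {batch.failed_count} attempts"
                )
            else:
                batch.state = State.NEW
```

In this sketch a permanently failed batch is skipped by `next_batch`, so later batches keep flowing. Whether an explicit failure response from the queryservice should also increment the failed count is left open here.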