In order to avoid issues like the ones we have had with the Updater getting stale data (T210901: Stale reads for WDQS Updater), we may want to enable ChronologyProtector functionality for the RDF exports consumed by the Updater.
According to advice from @aaron, this is what we can do:
<AaronSchulz> so, in preOutputCommit(), the main DB commit happens, deferred updates run, CP positions are saved, then post-send deferred updates. I suppose if the code that enqueues to kafka put the ChronologyProtector::getClientId() value in the message, and made sure to enqueue post-send, then the updater could relay that client ID as a header for the RDF HTTP request.
<AaronSchulz> so the updater would want to grab values from kafka (themselves from MW) to use for the ChronologyClientId HTTP header to Special:EntityData
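For illustration, here is a minimal sketch (in Java, the Updater's language) of how the Updater might deserialize a change event once it carries the client ID. The `chronology_id` field name and the overall event shape are assumptions for the sketch, not the actual event schema:

```java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

/** Hypothetical change event as the Updater might deserialize it from Kafka. */
@JsonIgnoreProperties(ignoreUnknown = true)
class ChangeEvent {
    @JsonProperty("entity_id")
    public String entityId;

    // Assumed new field carrying ChronologyProtector::getClientId() from MediaWiki;
    // the real field name and schema would be decided in the event-producing patch.
    @JsonProperty("chronology_id")
    public String chronologyId;
}

public class ChangeEventExample {
    public static void main(String[] args) throws Exception {
        String json = "{\"entity_id\": \"Q42\", \"chronology_id\": \"abc123clientid\"}";
        ChangeEvent event = new ObjectMapper().readValue(json, ChangeEvent.class);
        System.out.println(event.entityId + " -> " + event.chronologyId);
    }
}
```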
This would require patches to:
- Code that generates Kafka change events for Wikidata, to include the ChronologyProtector client ID in the event data
- Code in the Updater that sends RDF requests to Special:EntityData, to add the ChronologyClientId header (see the sketch below)
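Below is a minimal sketch of the Updater-side change, assuming the client ID has already been extracted from the change event as in the earlier sketch. The URL format and the choice to skip the header when no ID is present are illustrative, not the actual Updater implementation:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EntityDataFetcher {
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    /**
     * Fetches RDF for an entity from Special:EntityData, relaying the chronology
     * client ID (taken from the Kafka change event) so that MediaWiki's
     * ChronologyProtector can avoid serving a stale replica read.
     */
    static String fetchEntityRdf(String entityId, String chronologyClientId) throws Exception {
        HttpRequest.Builder builder = HttpRequest.newBuilder()
            .uri(URI.create("https://www.wikidata.org/wiki/Special:EntityData/" + entityId + ".ttl"))
            .GET();
        if (chronologyClientId != null && !chronologyClientId.isEmpty()) {
            // Header name taken from the advice above; the exact semantics on the
            // MediaWiki side would need to be confirmed when implementing.
            builder.header("ChronologyClientId", chronologyClientId);
        }
        HttpResponse<String> response =
            CLIENT.send(builder.build(), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String rdf = fetchEntityRdf("Q42", "abc123clientid");
        System.out.println("Fetched " + rdf.length() + " characters of RDF");
    }
}
```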