Page MenuHomePhabricator

Implement support for ChronologyProtection in events sent when editing Mediawiki/Wikidata
Closed, ResolvedPublic

Description

In order to avoid issues like we've had with Updater getting stale data (T210901: Stale reads for WDQS Updater) we may want to enable using ChronologyProtector functionality for RDF exports consumed by the Updater.

According to advice by @aaron this is what we can do:

<AaronSchulz>	so, in preOutputCommit(), the main DB commit happens, deferred updates run, CP positions are saved, then post-send deferred updates. I suppose if the code that enqueues to kafka put the ChronologyProtector::getClientId() value in the message, and made sure to enqueue post-send, then the updater could relay that client ID as a header for the RDF HTTP request.
<AaronSchulz>	so the updater would want to grab values from kafka (themselves from MW) to use for the ChronologyClientId HTTP header to Special:EntityData

This would require a patch to:

  1. Code that generates Kafka change events for Wikidata, to add Chronology ID to Kafka data
  2. Code in Updater that sends requests, to add ChronologyClientId header

Details

Related Gerrit Patches:
wikidata/query/rdf : masterAdd Chronology Id support for change events
mediawiki/extensions/EventBus : masterAdd chronology_id field to events.
mediawiki/event-schemas : masterAdd chronology_id field
mediawiki/core : masterAccept new "MediaWiki-ChronologyClientId" HTTP header
mediawiki/core : masterrdbms: add getChronologyProtectorClientId() to ILBFactory

Event Timeline

Restricted Application added a project: Wikidata. · View Herald TranscriptDec 22 2018, 1:29 AM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Smalyshev moved this task from Backlog to Next on the User-Smalyshev board.Feb 27 2019, 7:21 PM

@aaron I have started looking into it, and I am not sure how one can get an instance of ChronologyProtector in the code that needs it. The only place I can see that uses ChronologyProtector is LBFactory class, but getChronologyProtector() is protected there, so it's impossible to get it from outside, and there are no other code anywhere that work with it. What would you advise to do in this case? How can one get ChronologyProtector::getClientId()?

aaron added a comment.Apr 16 2019, 4:56 AM

I think you can just add a method, similar to LBFactory::getChronologyProtectorTouched, that exposes the client ID, maybe call it LBFactory::getChronologyProtectorClientId();

Smalyshev moved this task from In review to Doing on the User-Smalyshev board.

Change 504815 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/core@master] Add getChronologyProtectorClientId() to LBFactory

https://gerrit.wikimedia.org/r/504815

Change 504819 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/EventBus@master] Add chronology_id field to event

https://gerrit.wikimedia.org/r/504819

Change 504826 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/event-schemas@master] Add chronology_id field

https://gerrit.wikimedia.org/r/504826

Hm, can we add some more documentation, especially to the field in the event schema? I'm not familiar with ChronologyProtector, and googling around for docs isn't yielding much.

This is probably the request for @aaron, he knows the most about it.

Krinkle added a subscriber: Krinkle.

Another question for @aaron as well - If we start storing this chronologyprotector field in kafka etc. that means it can survive longer and no longer has an expiry like we do for cookies. I vaguely recall an issue in the past where clients that kept the value too long caused subsequent requests to stall for many seconds as MW mistook the unknown value for a value that hasn't appeared yet.

It may've been mitigated already, but if not, we'll need to sort that out before this goes live.

@aaron Is there any docs how to use the client ID on the client side? I see there's support for cpPosIndex cookie which is $index@$time#$clientId but if I have only client ID that is not going to help me. So I am not sure how would one use it - I can't find any trace of ChronologyClientId HTTP header.

aaron added a comment.Apr 19 2019, 9:38 PM

Another question for @aaron as well - If we start storing this chronologyprotector field in kafka etc. that means it can survive longer and no longer has an expiry like we do for cookies. I vaguely recall an issue in the past where clients that kept the value too long caused subsequent requests to stall for many seconds as MW mistook the unknown value for a value that hasn't appeared yet.
It may've been mitigated already, but if not, we'll need to sort that out before this goes live.

That was cpPosIndex for cookies, which was changed to include timestamp/clientId so that old values are ignored server-side. Use of client IDs alone will never cause wait delay on the position store itself, which is fine for this sort of EventBus use case; the DB changes and kafka enqueue happen in the master DC and the jobs run in the master DC as well, so there is no need to wait on cross-DC position store replication. By not doing so, it also avoids any risk of weird cases causing wait timeout delay.

aaron added a comment.Apr 19 2019, 9:40 PM

@aaron Is there any docs how to use the client ID on the client side? I see there's support for cpPosIndex cookie which is $index@$time#$clientId but if I have only client ID that is not going to help me. So I am not sure how would one use it - I can't find any trace of ChronologyClientId HTTP header.

Right, I'll patch Setup.php to use the header if cpPosIndex does not already contain it via the cookie. I thought it did this already, but apparently not (probably since nothing uses that pattern yet).

Change 505328 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] Use "ChronologyClientId" HTTP header when the ID is not already in "cpPosIndex"

https://gerrit.wikimedia.org/r/505328

Change 504815 merged by jenkins-bot:
[mediawiki/core@master] rdbms: add getChronologyProtectorClientId() to ILBFactory

https://gerrit.wikimedia.org/r/504815

Change 505328 merged by jenkins-bot:
[mediawiki/core@master] Accept new "MediaWiki-ChronologyClientId" HTTP header

https://gerrit.wikimedia.org/r/505328

Gilles moved this task from Inbox to Radar on the Performance-Team board.Apr 23 2019, 7:45 PM
Gilles edited projects, added Performance-Team (Radar); removed Performance-Team.
Smalyshev renamed this task from Implement support for ChronologyProtection in RDF export to Implement support for ChronologyProtection in events sent when editing Mediawiki/Wikidata.Apr 24 2019, 6:56 AM
elukey added a subscriber: elukey.EditedMay 17 2019, 9:49 AM

Hello everybody, I am investigating https://phabricator.wikimedia.org/T223310, namely the one of the Redis MainStash being constantly hammered by GET to global:Wikimedia\Rdbms\ChronologyProtector since after the deployment of 1.34.0-wmf.3. Is it related to anything done in this task? I thought it was https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/EventBus/+/504819/ but it is still not merged..

Change 504826 merged by Ottomata:
[mediawiki/event-schemas@master] Add chronology_id field

https://gerrit.wikimedia.org/r/504826

Change 504819 merged by Ppchelko:
[mediawiki/extensions/EventBus@master] Add chronology_id field to events.

https://gerrit.wikimedia.org/r/504819

Krinkle removed a subscriber: Krinkle.Jul 18 2019, 12:18 AM

Change 524349 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[wikidata/query/rdf@master] Add Chronology Id support for change events

https://gerrit.wikimedia.org/r/524349

Change 524349 merged by jenkins-bot:
[wikidata/query/rdf@master] Add Chronology Id support for change events

https://gerrit.wikimedia.org/r/524349

Smalyshev moved this task from Next to Done on the User-Smalyshev board.Jul 29 2019, 7:51 PM
Smalyshev closed this task as Resolved.Jul 30 2019, 11:31 PM