Include Wikibase dispatch lag in API "maxlag" enforcing
Closed, Resolved (Public)

Description

Per Wikimedia-Hackathon-2018 discussion:

There should be a setting which maps the Wikidata dispatch lag to the API's max lag. This should be a set factor so that the max lag is determined by:

maxLag = max( maxLag, WikibaseDispatchLag / DispatchLagToMaxLagFactor )

where DispatchLagToMaxLagFactor is probably 50 (which, with clients using maxlag=5, would mean bots stop at around 250s of dispatch lag) or 60 (which would make bots stop at 300s… if we were to do this, we should also make sure the respective alert is only fired at maybe 330s of dispatch lag).
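
The proposed formula can be sketched roughly as follows. This is a minimal Python illustration, not the Wikibase implementation: the function and parameter names are hypothetical, treating a non-positive factor as "disabled" is an assumption, and the factor of 60 in the example is the value that was later configured in this task.

```python
def effective_max_lag(db_lag: float, dispatch_lag: float, factor: float) -> float:
    """Return the lag value used for maxlag checks (all values in seconds).

    Hypothetical helper: the dispatch lag is scaled down by `factor` and only
    replaces the existing lag value when it is the larger of the two.
    """
    if factor <= 0:
        # Assumption: a non-positive factor means the feature is switched off.
        return db_lag
    return max(db_lag, dispatch_lag / factor)


# With a factor of 60, 300 s of dispatch lag is reported as 5 s of lag.
print(effective_max_lag(db_lag=1.0, dispatch_lag=300.0, factor=60.0))  # 5.0
```

Because the result is a maximum, ordinary database lag still wins whenever it is higher than the scaled dispatch lag.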

This is similar to $wgJobQueueIncludeInMaxLagFactor.

Note: This will need a hook in ApiMain::getMaxLag first. Alternatively Wikibase could override ApiMain::checkMaxLag in its API subclasses… but that's probably the worse solution.
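
To illustrate why a hook is the cleaner option, here is a rough Python analogy of the idea (MediaWiki itself is PHP, and its hook system works differently). Everything in this sketch is hypothetical: the `lag_info` dictionary, its keys, the `"dispatch"` label, and both function names are made up for illustration. The point is only that core computes the database lag and an extension can raise the reported value without subclassing anything.

```python
from typing import Any, Callable, Dict, Iterable

LagInfo = Dict[str, Any]
LagHook = Callable[[LagInfo], None]


def dispatch_lag_hook(dispatch_lag: float, factor: float) -> LagHook:
    """Build a hook that reports the scaled dispatch lag if it exceeds the current lag."""
    def hook(lag_info: LagInfo) -> None:
        scaled = dispatch_lag / factor
        if scaled > lag_info["lag"]:
            lag_info["lag"] = scaled
            lag_info["type"] = "dispatch"  # hypothetical label
    return hook


def get_max_lag(db_lag: float, hooks: Iterable[LagHook]) -> LagInfo:
    """Start from the database lag, then let each hook adjust the info in place."""
    lag_info: LagInfo = {"lag": db_lag, "type": "db"}
    for hook in hooks:
        hook(lag_info)
    return lag_info


info = get_max_lag(db_lag=0.5, hooks=[dispatch_lag_hook(dispatch_lag=360.0, factor=60.0)])
print(info)  # {'lag': 6.0, 'type': 'dispatch'}
```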

Event Timeline

Alternatively Wikibase could override ApiMain::checkMaxLag in its API subclasses… but that's probably the worse solution.

To actually make that work you'd need to override the ApiMain object itself. That's probably a very bad idea.

Change 436543 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Introduce MaxLagInfo hook

https://gerrit.wikimedia.org/r/436543

Change 436561 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/extensions/Wikibase@master] Include Wikibase dispatch lag in API "maxlag" enforcing

https://gerrit.wikimedia.org/r/436561

Addshore subscribed.

@Ladsgroup will you continue this as part of your 20% campsite time, or would you rather someone else take this over?

I'll get this done in my 20% campsite time 💃

Change 436543 merged by jenkins-bot:
[mediawiki/core@master] Introduce ApiMaxLagInfo hook

https://gerrit.wikimedia.org/r/436543

I stand by what I said before: exposing the dispatch lag is not as clear-cut as exposing max DB lag.
It requires clients to decide at what point the dispatch lag is too much and when they should stop.

The general advice is that clients should back off at maxlag=5. So Wikibase should determine the threshold of dispatch lag at which it wants clients to back off, and then come up with a factor to scale it down to 5 "seconds". I don't think clients should make decisions here, aside from picking how many "seconds" (not really) to back off at.

Oh good, that's already been done in the patch :) I had this comment sitting here for a while but hadn't hit submit yet!

Change 436561 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Include Wikibase dispatch lag in API "maxlag" enforcing

https://gerrit.wikimedia.org/r/436561

How can this maxlag be retrieved by the API?

https://wikidata.beta.wmflabs.org/w/api.php?action=query&titles=MediaWiki&format=json&maxlag=-1 seems to just show the old max lag still
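
For anyone wanting to script this check, a small sketch of the maxlag=-1 probe follows. The URL parameters are the ones from the request above; the shape of the error payload (an `error` object with `code` and `lag` fields) is an assumption about the API reply, and both function names are hypothetical. No network call is made here, so the caller is expected to fetch the URL with whatever HTTP client they already use.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def build_lag_probe_url(api_url: str) -> str:
    """Build a query URL with maxlag=-1, which asks the API to report its current lag."""
    params = {"action": "query", "titles": "MediaWiki", "format": "json", "maxlag": "-1"}
    return f"{api_url}?{urlencode(params)}"


def extract_lag(response_body: str) -> Optional[float]:
    """Return the reported lag in seconds, or None if the reply is not a maxlag error.

    Assumes a reply shaped like {"error": {"code": "maxlag", "lag": ...}}.
    """
    payload = json.loads(response_body)
    error = payload.get("error", {})
    if error.get("code") != "maxlag":
        return None
    return float(error["lag"])


print(build_lag_probe_url("https://wikidata.beta.wmflabs.org/w/api.php"))
```

Note that the reply carries a single lag value, so this probe cannot tell which source of lag produced it.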

We need to enable it. I'll do it right now (on beta).

Change 441857 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[operations/mediawiki-config@master] labs: set dispatchLagToMaxLagFactor to 60

https://gerrit.wikimedia.org/r/441857

Change 441857 merged by jenkins-bot:
[operations/mediawiki-config@master] labs: set dispatchLagToMaxLagFactor to 60

https://gerrit.wikimedia.org/r/441857

^ This enables it on the beta cluster. We need to do a similar thing for prod once the patch gets deployed (Wednesday afternoon).

@Ladsgroup with the current implementation will a query with maxlag=-1 show both maxlag values, or just one?

The highest one only

In which case without any sort of dedicated api endpoint it will be impossible for T196868 to track only the new maxlag that we have added.
Instead the task will just have to track the maxlag of wikidata in general.

Getting the dispatch-related maxlag of Wikidata is already possible: you just take the median dispatch lag and divide it by 60.
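
As a rough sketch of that calculation (assuming the per-client dispatch lags are already available as a list of seconds; the function name and the sample values are made up for illustration):

```python
from statistics import median
from typing import Sequence


def dispatch_max_lag(dispatch_lags_seconds: Sequence[float], factor: float = 60.0) -> float:
    """Scale the median dispatch lag down by the configured factor.

    Raises statistics.StatisticsError if the sequence is empty.
    """
    return median(dispatch_lags_seconds) / factor


# Illustrative values only: the median is 300 s, which scales to 5 s.
print(dispatch_max_lag([120.0, 300.0, 900.0]))  # 5.0
```

The default of 60 matches the dispatchLagToMaxLagFactor value set in the config patches on this task.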

Excellent, that means my existing code should Just Work (tm).

Is the "lag" value in the API reply in seconds? maxlag does not appear to be documented for the query action...

Yes it is in seconds. See https://www.mediawiki.org/wiki/Manual:Maxlag_parameter for more info

maxlag does not appear to be documented for the query action...

That's because it's a parameter to the "main" module, not the query action.

Excellent, that means my existing code should Just Work (tm).

I just checked some of the API logs and it looks like QuickStatementsBot isn't currently passing a maxlag?
It's currently the biggest api caller for wikidata that isn't passing it.

Small change: the feature will be activated on Thursday July 5th instead of today.

Excellent, that means my existing code should Just Work (tm).

I just checked some of the API logs and it looks like QuickStatementsBot isn't currently passing a maxlag?
It's currently the biggest api caller for wikidata that isn't passing it.

I should be:
https://bitbucket.org/magnusmanske/magnustools/src/c76cc1a06cbd015c12046755a58608cb2aa496d1/public_html/php/oauth.php#lines-416

Unless I'm using it wrong... It does look for the error anyway:
https://bitbucket.org/magnusmanske/magnustools/src/c76cc1a06cbd015c12046755a58608cb2aa496d1/public_html/php/oauth.php#lines-516
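
For other tool authors, the general pattern (pass maxlag, then wait and retry when the maxlag error comes back) might look like the sketch below. This is not the QuickStatements code linked above. The `send` callable is a hypothetical stand-in for whatever function performs the HTTP request and returns the decoded JSON reply, the error shape `{"error": {"code": "maxlag"}}` is an assumption, and the linear delay schedule is just one reasonable choice.

```python
import time
from typing import Any, Callable, Dict


def call_with_maxlag_retry(
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    params: Dict[str, Any],
    maxlag: int = 5,
    max_attempts: int = 5,
    base_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Send a request with maxlag set, waiting and retrying on maxlag errors."""
    request = dict(params, maxlag=maxlag)  # copy so the caller's dict is untouched
    for attempt in range(max_attempts):
        response = send(request)
        if response.get("error", {}).get("code") != "maxlag":
            return response  # success, or an unrelated error for the caller to handle
        # Server is lagged: wait longer after each failed attempt (5 s, 10 s, ...).
        sleep(base_delay * (attempt + 1))
    raise RuntimeError(f"server still lagged after {max_attempts} attempts")
```

Injecting `sleep` keeps the function easy to test without real waiting.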

Change 443939 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[operations/mediawiki-config@master] Set dispatchLagToMaxLagFactor to 60 for wikidata

https://gerrit.wikimedia.org/r/443939

Change 443939 merged by jenkins-bot:
[operations/mediawiki-config@master] Set dispatchLagToMaxLagFactor to 60 for wikidata

https://gerrit.wikimedia.org/r/443939

Mentioned in SAL (#wikimedia-operations) [2018-07-05T18:21:19Z] <thcipriani@deploy1001> Synchronized wmf-config/Wikibase-production.php: SWAT: [[gerrit:443939|Set dispatchLagToMaxLagFactor to 60 for wikidata]] T194950 (duration: 00m 51s)

Looks like this is all done. Yesterday the dispatch lag was high for a couple of periods, and the maxlag was higher as a result.

image.png (297×1 px, 54 KB)

It looks like roughly 6,000 of the ~200,000 requests made yesterday were stopped due to maxlag.

There are still some bots and tools that are not setting appropriate maxlag values for their requests but we can follow up elsewhere with that.

Please list them on https://www.wikidata.org/wiki/Wikidata:Administrators%27_noticeboard and the accounts will be blocked.