Wikidata constraint check is getting throttled from wdqs-internal more than usual
Closed, Declined · Public

Assigned To

None

Authored By

	dcausse
	Dec 21 2022, 11:14 AM

Description

Checking the telemetry metrics for jobrunner -> wdqs-internal, we found unusual patterns in the error rates.

Looking more closely, it appears that wdqs-internal is serving more requests (type fallback ones) and is therefore throttling more of them:

(cf. https://grafana-rw.wikimedia.org/d/000000344/wikidata-quality?orgId=1&refresh=30s)

The system is behaving as configured, but should we adapt the service to this new behavior if it persists?
Are there ways to measure the actual user-impact of these errors?

AC:

determine if some actions need to be taken
configure the system to support this load if yes, decline the task otherwise

Event Timeline

dcausse created this task. Dec 21 2022, 11:14 AM

Restricted Application added a subscriber: Aklapper. Dec 21 2022, 11:14 AM

dcausse renamed this task from Wikidata constraint check is getting throttled from wdsq-internal more than usual to Wikidata constraint check is getting throttled from wdqs-internal more than usual. Dec 21 2022, 11:15 AM

Lucas_Werkmeister_WMDE added a project: Wikidata. Dec 21 2022, 11:27 AM

I think the user impact is that constraint checks using SPARQL (“subject type”, “value type”, “distinct values”) won’t always work, so users will sometimes not be shown violations of those constraints when they should be. If I’m reading DelegatingConstraintChecker::getCheckResultsFor() correctly, though, these errors won’t abort the whole constraint check, and other constraint violations should still be shown.
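To illustrate the behavior described above, here is a minimal Python sketch of per-constraint error isolation. It is not the actual WikibaseQualityConstraints code (which is PHP); `ThrottledError`, `CheckResult`, and `run_checks` are hypothetical names, and the only assumption taken from the comment is that a failed SPARQL-backed check yields an error result for that one constraint while the remaining checks still run.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


class ThrottledError(Exception):
    """Hypothetical error for a SPARQL request rejected by throttling."""


@dataclass
class CheckResult:
    constraint_type: str
    status: str  # "violation", "compliance", or "error"
    message: str = ""


def run_checks(checkers: Dict[str, Callable[[], str]]) -> List[CheckResult]:
    """Run every checker; a throttled one is recorded as an error, not re-raised."""
    results = []
    for constraint_type, checker in checkers.items():
        try:
            status = checker()
        except ThrottledError as exc:
            # Only this constraint is affected; keep going with the others.
            results.append(CheckResult(constraint_type, "error", str(exc)))
            continue
        results.append(CheckResult(constraint_type, status))
    return results


def throttled_value_type() -> str:
    # Stand-in for a SPARQL-backed check whose request was throttled.
    raise ThrottledError("request throttled by query service")


results = run_checks({
    "value type": throttled_value_type,  # SPARQL-backed, fails
    "format": lambda: "violation",       # no SPARQL needed, still reported
})
for result in results:
    print(result.constraint_type, result.status)
```

In this sketch the throttled "value type" check shows up as an error, while the "format" violation is still reported, which matches the expectation that users lose some violations but not the whole check.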

I'm wondering if more requests are coming because one or more impactful constraints have been added. Anyone got a hunch? Worth doing the detective work?

@Lydia_Pintscher we're waiting on you to tell us how important / urgent this is.

Gehel moved this task from Incoming to Watching / Waiting on the Wikidata-Query-Service board. Jan 9 2023, 4:17 PM

bking subscribed. Jan 9 2023, 4:17 PM

To me it looks like SPARQL type checks generally went back to normal around January 4th (Grafana permalink):

In T325730#8509750, @Gehel wrote:

@Lydia_Pintscher we're waiting on you to tell us how important / urgent this is.

Generally, having part of the constraint checks not working is bad for Wikidata, because editors are not shown notifications about issues in the Item they are looking at.
But as Lucas said, things seem to be looking OK again.

@dcausse I think we can close this if the metrics look good from your side too? (I don’t know what I’m looking for in the Envoy Telemetry Grafana dashboard.)

Everything looks fine from my end! closing :)

	F36077062: image.png
	Jan 9 2023, 4:22 PM
