Don’t send SPARQL prefixes with WikibaseQualityConstraints queries
Closed, Resolved (Public)

Description

As @Smalyshev points out in T204267#4581546, we’re sending a bunch of prefixes with each SPARQL query, even though they are probably already defined by default. We should try to avoid this, to reduce traffic to WDQS.

One complication is that we need to filter out statements with deprecated rank, and the IRIs of the “rank” predicate and the “Deprecated” node are different, depending on whether the query service is serving “munged” data (wikiba.se/ontology#) or not (wikiba.se/ontology-beta#). We currently do this by defining both wikibase and wikibase-beta prefixes explicitly, and removing deprecated statements using both prefixes, but that doesn’t work if we rely on the query service’s built-in notion of the wikibase prefix (which is always the non-beta version, regardless of whether the data went through the munger or not).
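To make the dual-prefix workaround concrete, here is a minimal Python sketch (the extension itself is PHP, so this is only an illustration). The function names and the ?statement variable are hypothetical, and the rank/DeprecatedRank local names are taken from the Wikibase RDF ontology rather than from this task:

```python
# Namespace IRIs for the munged and non-munged ontology, as named above.
PREFIXES = {
    "wikibase": "http://wikiba.se/ontology#",
    "wikibase-beta": "http://wikiba.se/ontology-beta#",
}


def prefix_declarations(prefixes):
    """Render one PREFIX line per entry (hypothetical helper)."""
    return "\n".join(f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items())


def minus_deprecated(statement_var="?statement"):
    """Exclude deprecated statements once per ontology variant."""
    return "\n".join(
        f"MINUS {{ {statement_var} {name}:rank {name}:DeprecatedRank. }}"
        for name in PREFIXES
    )


print(prefix_declarations(PREFIXES))
print(minus_deprecated())
```

If the explicit declarations are dropped, wikibase-beta: is no longer defined at all and wikibase: resolves to whatever the server has built in, which is the problem described above.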

A simple solution, I suppose, would be to completely skip prefixes for REGEX queries, which are (I believe) the most common queries we send out and never need any prefixes.
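As a rough illustration of why REGEX queries need no prefixes, the following Python sketch builds such a query from string literals only. The query shape, the anchoring of the pattern, and both function names are assumptions for illustration, not necessarily what the extension sends:

```python
def sparql_string_literal(text):
    """Quote text as a SPARQL string literal (hypothetical helper)."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def build_regex_query(text, pattern):
    """Build a self-contained REGEX query; no prefixed names appear in it."""
    anchored = "^(?:" + pattern + ")$"  # assumption: match the whole value
    text_literal = sparql_string_literal(text)
    pattern_literal = sparql_string_literal(anchored)
    return f"SELECT (REGEX({text_literal}, {pattern_literal}) AS ?matches) {{}}"


print(build_regex_query("abc", "a.c"))
```

Because the query contains only literals and the built-in REGEX function, prepending PREFIX declarations adds bytes without changing the result.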

Event Timeline

A simple solution, I suppose, would be to completely skip prefixes for REGEX queries, which are (I believe) the most common queries we send out and never need any prefixes.

This looks like a good start.

Change 471725 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Don’t send SPARQL prefixes for REGEX queries

https://gerrit.wikimedia.org/r/471725

Change 471725 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Don’t send SPARQL prefixes for REGEX queries

https://gerrit.wikimedia.org/r/471725

Lucas_Werkmeister_WMDE changed the task status from Open to Stalled. Nov 5 2018, 4:27 PM

Let’s wait for a few weeks with the rest of this – the wikibase/wikibase-beta ambiguity should be resolved soon (T112127).

Lucas_Werkmeister_WMDE changed the task status from Stalled to Open. Nov 12 2018, 5:53 PM

Indeed, so I think we can now go ahead with this, assuming any users who update WikibaseQualityConstraints also update Wikibase itself.

Change 473057 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Remove wikibase-beta prefix from SparqlHelper

https://gerrit.wikimedia.org/r/473057

Okay, I5c98bb3dd9 gets rid of wikibase-beta, which simplifies things a bit. After that, I think we have two options:

  • Introduce a flag whether the query service has the default prefixes built-in, and skip them if that flag is set to true. To stay on the safe side, it should probably default to false, so we’d also need a production config change to set it to true there.
  • Just assume the query service has the default prefixes built-in, and no user will ever use a different SPARQL server.

I would prefer the first option, but just today I realized that we’re currently acting more like the second option: for “has type” queries, we include query hints without even declaring the hint: prefix, which is completely broken on non-Blazegraph servers. So if we want to go for the first option, there should probably also be a “supports query hints” flag (or it could be merged with “supports default prefixes” into “is a Wikibase RDF Query server”).

Let’s go with the first option, but only with a single config flag. I’ll implement this now.
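A minimal Python sketch of the single-flag idea, assuming one boolean controls both the prefixes and the query hints. The names has_wikibase_support, assemble_query and gearing_hint are hypothetical, the prefix list is abbreviated, and the hint shown is just an example Blazegraph hint, not necessarily the one the extension uses:

```python
# Abbreviated, illustrative set of prefixes a Wikibase query service predefines.
DEFAULT_PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "wikibase": "http://wikiba.se/ontology#",
}


def assemble_query(body, has_wikibase_support=False):
    """Prepend PREFIX declarations unless the server already defines them.

    The flag defaults to False so that generic SPARQL servers keep working.
    """
    if has_wikibase_support:
        return body
    declarations = "\n".join(
        f"PREFIX {name}: <{iri}>" for name, iri in DEFAULT_PREFIXES.items()
    )
    return declarations + "\n" + body


def gearing_hint(has_wikibase_support=False):
    """Return a Blazegraph-specific hint only when the server understands it."""
    return ' hint:Prior hint:gearing "forward".' if has_wikibase_support else ""


body = "SELECT ?item WHERE { ?item wdt:P31 wd:Q5." + gearing_hint(True) + " }"
print(assemble_query(body, has_wikibase_support=True))
```

Defaulting the flag to false keeps the safe behaviour for third-party installations; production would opt in through configuration.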

Change 473529 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Add option to control SPARQL server compatibility

https://gerrit.wikimedia.org/r/473529

Change 473057 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Remove wikibase-beta prefix from SparqlHelper

https://gerrit.wikimedia.org/r/473057

Change 473529 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Add option to control SPARQL server compatibility

https://gerrit.wikimedia.org/r/473529

Left to do: config change to set the new WBQualityConstraintsSparqlHasWikibaseSupport setting to true. But that’ll have to wait until the above changes are deployed, which will take two weeks (unless we backport them).

One note: There is no train this week.

Change 476267 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[operations/mediawiki-config@master] Don’t send SPARQL prefixes in WikibaseQualityConstraints

https://gerrit.wikimedia.org/r/476267

Change 476267 merged by jenkins-bot:
[operations/mediawiki-config@master] Don’t send SPARQL prefixes in WikibaseQualityConstraints

https://gerrit.wikimedia.org/r/476267

Mentioned in SAL (#wikimedia-operations) [2018-12-03T12:14:25Z] <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:476267| Don’t send SPARQL prefixes in WikibaseQualityConstraints (T204317)]] (duration: 00m 49s)

Lucas_Werkmeister_WMDE changed the task status from Stalled to Open. Dec 3 2018, 1:07 PM

Forgot to unstall this – it should be deployed now. I’ll check Grafana in a few hours to see if I can see any trace of this change at all, but otherwise it should hopefully Just Work™.

I don’t think we’ll be able to see this in Grafana (Network traffic panel): query traffic from WBQC is probably not significant next to other query traffic and updater traffic. Closing.