WDQS embedding gives strange results.
Closed, Resolved · Public

Description

The embedding of graphs from WDQS in Scholia has since around 14 May 2018 produced spurious results. The errors are not necessarily reproducible.

We now often get "Server Error" with "Rate limit exceeded" or "Unable to display result", where previously we rarely saw these problems.

At other times a spurious result with id, label and description is returned, even though the SPARQL query should return something entirely different. We have seen this on, e.g., https://tools.wmflabs.org/scholia/author/Q8219 in the panel "Number of pages per year". Sometimes a reload fixes the problem. It is as if we get a different result from the wrong cache.

Event Timeline

Restricted Application added a subscriber: Aklapper. · May 15 2018, 10:17 AM

Probably a rate limit, @Gehel.
You should only run the queries when they are visible to the user, to avoid running multiple queries at the same time.

Restricted Application added projects: Wikidata, Discovery. · May 15 2018, 10:25 AM

@Fnielsen, we could add JavaScript so that each query only runs when its <div> is visible, e.g. with something like this: https://github.com/shaunbowe/jquery.visibilityChanged

The "Rate limit exceeded" is definitely a rate limit issue. Making sure the requests are sent serially would certainly help. Just running them when the related <div/> is visible is probably good enough, and would most probably reduce the load on the WDQS servers.
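The "only run a query when its <div/> is visible" idea can be sketched roughly as follows. This is not Scholia's code: the `Panel` shape, the `panelsToLoad` helper, and the panel ids and pixel values are all hypothetical. The sketch only covers the selection step, i.e. deciding which unloaded panels overlap the viewport; in a browser that decision would typically be triggered by a scroll handler or an `IntersectionObserver`.

```typescript
// Hypothetical description of one embedded WDQS panel on a page.
interface Panel {
  id: string;
  top: number; // distance from the top of the page, in pixels
  height: number; // panel height, in pixels
  loaded: boolean; // true once its query has been sent
}

// Return the panels that overlap the visible window
// [scrollTop, scrollTop + viewportHeight) and have not been loaded yet.
function panelsToLoad(
  panels: Panel[],
  scrollTop: number,
  viewportHeight: number
): Panel[] {
  const viewBottom = scrollTop + viewportHeight;
  return panels.filter(
    (p) => !p.loaded && p.top < viewBottom && p.top + p.height > scrollTop
  );
}

// Made-up layout: two panels near the top, one far down the page.
const panels: Panel[] = [
  { id: "pages-per-year", top: 0, height: 400, loaded: false },
  { id: "co-authors", top: 500, height: 400, loaded: false },
  { id: "topics", top: 2000, height: 400, loaded: false },
];

// With an 800px viewport at the top of the page, only the first two
// panels qualify, so only two queries would be sent initially.
const firstBatch = panelsToLoad(panels, 0, 800).map((p) => p.id);
console.log(firstBatch);
```

Marking a panel as `loaded` after its query is sent keeps it from being requested again on later scroll events.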

Looking at the requests made from https://tools.wmflabs.org/scholia/author/Q8219, I see that at least some of the requests generating an HTTP 500 seem to have a syntax issue, leading to a java.lang.IllegalArgumentException: Non-aggregate variable in select expression: work. Those requests should result in an HTTP 400, but that's probably non-trivial to fix on our side. @Smalyshev might have an idea for fixing the HTTP status, but the root issue seems to be the query.

"HTTP 500 seem to have a syntax issue". So I suppose it might be an issue with my SPARQL? (I see that the work variable is not returned in one SELECT, but it seems strange to me that it usually works...)

Gehel added a comment. May 17 2018, 1:37 PM

Emphasis on the "seem": I'm entirely unsure of my analysis here :) But there is something wrong, either on the query side or in the way Blazegraph processes that query.

So there are several issues at work here:

  • To get good suggestions for the query helper, we try to modify the query, and add the first variable bound in the query body to the SELECT clause. That doesn’t work for grouped queries, of course.
  • On embed.html, we try to get these suggestions to populate the tag cloud, even though there’s no tag cloud on embed.html.

I’ll upload a patch to solve the second issue. The first one is much older.
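To illustrate the first issue, here is a simplified TypeScript sketch of a rewrite that prepends the first variable bound in the query body to the SELECT clause, with a guard that leaves grouped queries alone. It is not the actual WDQS GUI code: `addFirstBodyVariable`, the regular-expression matching, and both sample queries are assumptions made for illustration. Without the guard, the grouped query would get `?work` added to its SELECT clause, which matches the "Non-aggregate variable in select expression: work" error reported above.

```typescript
// Hypothetical, regex-based stand-in for the query helper's rewrite.
function addFirstBodyVariable(query: string): string {
  const braceIndex = query.indexOf("{");
  if (braceIndex === -1) return query;

  // A grouped query may only select grouped or aggregated variables,
  // so adding a plain variable would make it invalid. Leave it untouched.
  if (/\bGROUP\s+BY\b/i.test(query)) return query;

  const head = query.slice(0, braceIndex); // SELECT clause and WHERE keyword
  const body = query.slice(braceIndex); // everything from the first "{"

  // First variable mentioned in the query body.
  const match = body.match(/\?(\w+)/);
  if (match === null) return query;
  const variable = `?${match[1]}`;

  // Nothing to do if the variable is already selected.
  if (new RegExp(`\\?${match[1]}\\b`).test(head)) return query;

  return head.replace(/\bSELECT\b/i, (keyword) => `${keyword} ${variable}`) + body;
}

// Made-up sample queries in the spirit of a "pages per year" panel.
const plain =
  "SELECT ?year WHERE { ?work wdt:P577 ?date . BIND(YEAR(?date) AS ?year) }";
const grouped =
  "SELECT ?year (COUNT(?work) AS ?n) WHERE { ?work wdt:P577 ?date . " +
  "BIND(YEAR(?date) AS ?year) } GROUP BY ?year";

console.log(addFirstBodyVariable(plain)); // ?work is added to SELECT
console.log(addFirstBodyVariable(grouped) === grouped); // true: left unchanged
```

A regular expression is only a rough approximation here; a real implementation would more likely work on a parsed query.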

Smalyshev triaged this task as High priority. May 18 2018, 7:46 PM

Yeah, the first one deserves its own bug, but the second one definitely needs an urgent fix.

Change 433828 had a related patch set uploaded (by Lucas Werkmeister; owner: Lucas Werkmeister):
[wikidata/query/gui@master] Don’t populate tag cloud if it doesn’t exist

https://gerrit.wikimedia.org/r/433828

Change 433828 merged by jenkins-bot:
[wikidata/query/gui@master] Don’t populate tag cloud if it doesn’t exist

https://gerrit.wikimedia.org/r/433828

Change 433832 had a related patch set uploaded (by WDQSGuiBuilder; owner: WDQSGuiBuilder):
[wikidata/query/gui-deploy@production] Merging from 1c88baa8c147546958492fa14eb7b5ad34479d4b:

https://gerrit.wikimedia.org/r/433832

Change 433832 merged by Smalyshev:
[wikidata/query/gui-deploy@production] Merging from 1c88baa8c147546958492fa14eb7b5ad34479d4b:

https://gerrit.wikimedia.org/r/433832

Fnielsen closed this task as Resolved. May 28 2018, 8:05 AM
Fnielsen claimed this task.

The patch for the tag cloud bug also seems to have fixed the "Rate limit exceeded" error. Since the tag cloud bug may have caused WDQS to be called twice as often, I suppose this could explain why we no longer see the problem.

Thanks for fixing the bug.
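The suspected doubling can be made concrete with a small TypeScript model. Everything in it is an assumption for illustration: the `EmbedPage` shape, the `queriesIssued` helper, the query labels, and the figure of 10 panels per page. It models an embed page that has no tag cloud and counts the requests sent with and without the "does the tag cloud exist?" check.

```typescript
// Hypothetical model of one embed.html instance.
interface EmbedPage {
  hasTagCloud: boolean;
}

// List the WDQS requests one embed would send.
// checkForTagCloud = false models the behaviour before the patch,
// where suggestions were fetched even without a tag cloud.
function queriesIssued(page: EmbedPage, checkForTagCloud: boolean): string[] {
  const queries: string[] = ["main result query"];
  const populateTagCloud = checkForTagCloud ? page.hasTagCloud : true;
  if (populateTagCloud) {
    queries.push("tag cloud suggestion query");
  }
  return queries;
}

const embedPage: EmbedPage = { hasTagCloud: false };
const panelCount = 10; // assumed number of embedded panels on one page

const requestsBefore = panelCount * queriesIssued(embedPage, false).length;
const requestsAfter = panelCount * queriesIssued(embedPage, true).length;

console.log(requestsBefore, requestsAfter); // 20 10
```

Under these assumptions the check halves the number of requests per page load, which would be consistent with the rate limit no longer being hit.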

Vvjjkkii renamed this task from WDQS embedding gives strange results. to 0xcaaaaaaa. Jul 1 2018, 1:10 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Fnielsen as the assignee of this task.
Vvjjkkii updated the task description.
Vvjjkkii removed subscribers: gerritbot, Aklapper.
CommunityTechBot renamed this task from 0xcaaaaaaa to WDQS embedding gives strange results. Jul 2 2018, 6:53 AM
CommunityTechBot closed this task as Resolved.
CommunityTechBot assigned this task to Fnielsen.
CommunityTechBot updated the task description.
CommunityTechBot added subscribers: gerritbot, Aklapper.