Investigate how to track the number of queries that return a result vs an error over time
Closed, Resolved, Public

Description

We want to know how many of the queries executed by the query builder:

  • return a result
  • return an error
  • time out

The challenge is that Query Builder cannot access the results inside the iframe. We therefore need to investigate whether we can filter the Query Builder requests by their Referer header, maybe in Hadoop.

Event Timeline

You can get some data out of Hadoop. A query like this works:

SELECT
  *
FROM
  wmf.webrequest
WHERE
  year = 2020
  AND month = 11
  AND day = 27
  AND referer LIKE 'https://query-builder-test.toolforge.org%'
  AND uri_host = 'query.wikidata.org'
LIMIT 50;

However, embed.html returns HTTP 200 for timeouts (though it at least returns 400 for bad queries). This means Hadoop can give us the first and second numbers, but not the third.
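
To illustrate that limitation, here is a minimal TypeScript sketch of how status codes pulled from Hadoop could be bucketed. The function and type names are hypothetical, and the only mapping taken from this thread is 200 for results and timeouts and 400 for bad queries; every other status is lumped into "other".

```typescript
// Hypothetical buckets: a 200 alone cannot tell a result from a timeout,
// because embed.html returns 200 in both cases (per the comment above).
type StatusBucket = "result-or-timeout" | "error" | "other";

function classifyStatus(httpStatus: number): StatusBucket {
  if (httpStatus === 200) {
    return "result-or-timeout";
  }
  if (httpStatus === 400) {
    return "error"; // bad query
  }
  return "other"; // anything not discussed in this task
}

// Tally a list of http_status values, e.g. one per webrequest row.
function countBuckets(statuses: number[]): Record<StatusBucket, number> {
  const counts: Record<StatusBucket, number> = {
    "result-or-timeout": 0,
    "error": 0,
    "other": 0,
  };
  for (const status of statuses) {
    counts[classifyStatus(status)] += 1;
  }
  return counts;
}
```

For example, countBuckets([200, 200, 400]) puts two requests in "result-or-timeout" and one in "error"; the timeouts stay hidden inside the first bucket.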

Another option, which seems easier, is to make the WDQS GUI emit extra stats when the referrer is Query Builder, which it can read from document.referrer. That is much easier to implement and less resource-intensive to extract, so I think that's the way we should go.
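
A rough TypeScript sketch of that idea follows. It is not the WDQS GUI's actual code: the function names and the metric name format are made up for illustration, and the origin is simply the test instance from the Hadoop query above. The helpers are kept free of browser APIs so they can be tested on their own; in the GUI the referrer argument would come from document.referrer.

```typescript
// Assumption: the Query Builder origin, taken from the Hadoop query in this task.
const QUERY_BUILDER_ORIGIN = "https://query-builder-test.toolforge.org";

type QueryOutcome = "result" | "error" | "timeout";

// True if the referrer is the Query Builder origin itself or a page under it.
// Requiring "/" after the origin avoids matching look-alike hosts such as
// "https://query-builder-test.toolforge.org.example.com".
function isFromQueryBuilder(
  referrer: string,
  origin: string = QUERY_BUILDER_ORIGIN
): boolean {
  return referrer === origin || referrer.indexOf(origin + "/") === 0;
}

// Returns a hypothetical metric name to emit, or null when the request did
// not come from Query Builder and no extra stat should be sent.
function queryBuilderMetric(
  referrer: string,
  outcome: QueryOutcome
): string | null {
  if (!isFromQueryBuilder(referrer)) {
    return null;
  }
  return "query_builder." + outcome;
}
```

In the embedded GUI, something like queryBuilderMetric(document.referrer, "timeout") could then be handed to whatever tracking call the GUI already uses. Depending on the referrer policy in effect, document.referrer may contain only the origin rather than the full page URL, which the check above also accepts.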

Yes, I agree. Also, it already sends out tracking data, so we should be able to just add more tracking there:

[Screenshot: image.png]