Ref: T172086 T188538
When people run queries like:
USE plwiki_p; SELECT rev_page, page_title, rev_timestamp FROM revision JOIN page ON rev_page = page_id ORDER by rev_timestamp; SELECT *;
(https://quarry.wmflabs.org/query/25152)
Things like
[Attached screenshots of graphite-labs.wikimedia.org graphs: F14125899, F27319061, F27319063]
https://www.mediawiki.org/wiki/Topic:U1le4hrq6eunlafz says 500k works. NO, that's waaay tooo large, you should be using dumps if you want so much unfiltered data. Gonna limit to 2^16 = 65536 rows per resultset.
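For illustration only, a minimal sketch of what such a guard could look like in a worker's save path (hypothetical names; not the actual change below):

MAX_ROWS = 2 ** 16  # 65536 rows per result set, as proposed above

def fetch_limited(cursor):
    # Hypothetical guard: bail out instead of writing an enormous result
    # file to NFS; point people at the dumps for bulk data instead.
    rows = cursor.fetchmany(MAX_ROWS + 1)
    if len(rows) > MAX_ROWS:
        raise RuntimeError("Result set exceeds %d rows; use the dumps instead" % MAX_ROWS)
    return rows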
Change 415489 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[analytics/quarry/web@master] worker: refuse to save if rowcount is > 65536
Mentioned in SAL (#wikimedia-cloud) [2018-03-01T00:37:23Z] <zhuyifei1999_> UPDATE query join query_revision on query.latest_rev_id = query_revision.id join query_run on latest_run_id = query_run.id SET status=1 where (status = 2 or status = 1) and query_run.timestamp <= DATE_ADD(NOW(), INTERVAL -1 HOUR); 251 rows affected (1.81 sec) T139162 T172086 T188564
Change 415489 merged by jenkins-bot:
[analytics/quarry/web@master] worker: refuse to save if rowcount is > 65536
I regularly run queries for ORES that return results in the range of 200k rows. These results are usually just a single numerical column and thus (probably) do not take up much space. These queries generally finish in less than 10 seconds.
(Alternative: time limit the save to 60 seconds, kill it with a signal handler for SIGALRM)
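A minimal sketch of that SIGALRM approach, assuming a hypothetical write_resultset() callable (not necessarily how the patch below implements it):

import signal

SAVE_TIME_LIMIT = 60  # seconds

class SaveTimeout(Exception):
    pass

def _on_alarm(signum, frame):
    raise SaveTimeout("saving the result set took too long")

def save_with_time_limit(write_resultset):
    # Arm the alarm before the (potentially slow) NFS write; a write that is
    # still running when the alarm fires is aborted by the exception.
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(SAVE_TIME_LIMIT)
    try:
        write_resultset()
    finally:
        signal.alarm(0)
        signal.signal(signal.SIGALRM, old_handler)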
Change 419317 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[analytics/quarry/web@master] worker: change to SIGALRM-based limit instead of row
Change 419317 merged by jenkins-bot:
[analytics/quarry/web@master] worker: change to SIGALRM-based limit instead of row
Mentioned in SAL (#wikimedia-cloud) [2018-03-15T15:56:45Z] <zhuyifei1999_> deploying d653400 to quarry-runner-0{1,2} T188564
60 seconds is definitely not enough for the query. Switched to the unlimited version, then ran: https://quarry.wmflabs.org/query/25564
Browser reports a timestamp of Thu, 15 Mar 2018 16:31:50 GMT in date header for the POST.
$ while curl -s https://quarry.wmflabs.org/run/244898/status | grep -i running; do date; sleep 5; done
{"status": "running", "extra": {"connection_id": 12687695}}
Thu Mar 15 11:32:14 CDT 2018
{"status": "running", "extra": {"connection_id": 12687695}}
Thu Mar 15 11:32:20 CDT 2018
{"status": "running", "extra": {"connection_id": 12687695}}
Thu Mar 15 11:32:25 CDT 2018
{"status": "running", "extra": {"connection_id": 12687695}}
Thu Mar 15 11:32:31 CDT 2018
{"status": "running", "extra": {"connection_id": 12687695}}
Thu Mar 15 11:32:36 CDT 2018
[...]
{"status": "running", "extra": {"connection_id": 12687695}}
Thu Mar 15 11:36:15 CDT 2018
{"status": "running", "extra": {"connection_id": 12687695}}
Thu Mar 15 11:36:20 CDT 2018
{"status": "running", "extra": {"connection_id": 12687695}}
Thu Mar 15 11:36:26 CDT 2018
Took 5 mins. I'll double it to 10 mins.
Change 419786 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[analytics/quarry/web@master] worker: raise the save time limit to 10 mins
For the record, the above is crazy. It's writing less than a megabyte and it takes 5 minutes! :| This works for now, but getting off of NFS (T178520) seems to be a better long-term solution.
Change 419786 merged by jenkins-bot:
[analytics/quarry/web@master] worker: raise the save time limit to 10 mins
Mentioned in SAL (#wikimedia-cloud) [2018-03-16T00:29:16Z] <zhuyifei1999_> deploying fc109c2 to both runners T188564
Mentioned in SAL (#wikimedia-cloud) [2018-03-24T03:21:27Z] <zhuyifei1999_> revert back to d9cc1c8 again on quarry-runner0{1,2} T188564 T190608
Change 427036 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[analytics/quarry/web@master] Revert recent changes regarding maximum time / size of query results
Change 427036 merged by jenkins-bot:
[analytics/quarry/web@master] Revert recent changes regarding maximum time / size of query results
@Framawiki We really should fix this... https://quarry.wmflabs.org/query/30742: 3 million rows, executed in 9499.75 seconds. Any idea how?
This is getting ridiculously bad with queries like
https://quarry.wmflabs.org/query/31573
https://quarry.wmflabs.org/query/31564
These queries must die
Mentioned in SAL (#wikimedia-cloud) [2018-11-27T18:54:20Z] <zhuyifei1999_> triggered OOM killer on quarry-worker-02 9 times T188564
Yikes! I wonder if we could use the query optimization output to decide to not even start some queries. Does quarry use celery timeouts to kill queries? I've found those to be a bit unreliable in ORES so we use a signal based timeout instead. We could try something like that.
It currently only kills queries that MariaDB knows have been executing on the database for longer than 30 mins; how long it takes to store the query results is completely disregarded, until the OOM killer kicks in.
The problem is that it is completely fine for the Quarry infrastructure to run a query on the database for, say, 15 mins, but it is not fine to take 15 mins to store the results, which causes a huge NFS IO lag for the duration and stalls any other queries that are in the process of being saved.
Yes, I guess using the optimization plan is indeed a good idea. If the join type is ALL, without 'Using where', and it takes longer than 10 secs to execute, it is probably bad... except, how long does 'SELECT page_id FROM page LIMIT 1000000' take to execute?
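A rough sketch of that plan-based heuristic, assuming a PyMySQL-style connection to the replica (the function and how it would be wired in are illustrative, not existing Quarry code):

import pymysql.cursors

def looks_like_full_scan(conn, sql):
    # Heuristic from above: a join type of ALL with no 'Using where' in
    # Extra usually means an unfiltered full scan.
    with conn.cursor(pymysql.cursors.DictCursor) as cur:
        cur.execute("EXPLAIN " + sql)
        for row in cur.fetchall():
            extra = row.get("Extra") or ""
            if row.get("type") == "ALL" and "Using where" not in extra:
                return True
    return False

As the question above suggests, the plan alone says nothing about wall-clock time, so this could only ever be a hint to combine with an execution-time check.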
Unless we can find a system similar to NFS but lighter, we should change the workers' architecture (highly related to T178520: Find somewhere else (not NFS) to store Quarry's resultsets). Here are a few ideas, probably stupid or already discarded, but that I want to raise:
If they were not stored as SQLite files, would the problem be partially resolved? I have trouble seeing the benefit of SQLite if it's not for write/access time, since data extraction in the app is done on whole tables without any selection.
Is saving to SQLite slower than simply writing a CSV file? Would it make a difference if we compressed this CSV data in memory before saving it, so there is less data to write to NFS?
The problem is that the write at the end of the worker's task takes too long and is too resource-hungry. What if we sent the raw results to a Redis server, made them immediately available to the web server, and had a dedicated host take care of saving them whenever it can, as fast as it can?
One could also imagine, along the same lines, workers sending the results directly to the end client as soon as they have them, via WebSocket or some similar technology (or even an HTTP result-set page available only once), and then storing them quietly afterwards, or handing them to another server dedicated to saving them via Redis.
This could work. We might need to invent the wheel a way of storing the data type, compression, metadata, column names, etc. A benefit would be that it could easily fix T170464.
Is saving to SQLite slower than simply writing a CSV file?
Dunno, haven't done the statistics.
Would it make a difference if we compressed this CSV data in memory before saving it, so there is less data to write to NFS?
You mean, save to a non-NFS mount (instance local storage or tmpfs) and then copy the file over to the NFS mount? That could work. I remember that, on Toolforge a while back, doing a git clone onto non-NFS storage and then copying it was faster than a git clone directly onto NFS.
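A minimal sketch of that write-locally-then-copy idea, with a made-up destination path and a hypothetical write_resultset() callable:

import os
import shutil
import tempfile

NFS_RESULT_DIR = "/data/project/quarry/results"  # made-up NFS destination

def write_then_copy(write_resultset, filename):
    # Write the result file on instance-local disk first (no NFS round-trips
    # per row), then do a single sequential copy onto the NFS mount.
    with tempfile.TemporaryDirectory() as tmpdir:
        local_path = os.path.join(tmpdir, filename)
        write_resultset(local_path)
        shutil.copy(local_path, NFS_RESULT_DIR)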
The problem is that the write at the end of the worker's task takes too long and is too resource-hungry. What if we sent the raw results to a Redis server, made them immediately available to the web server, and had a dedicated host take care of saving them whenever it can, as fast as it can?
You mean store the query results on redis? Redis is an in-memory store, and given that the results are kept forever, this won't scale.
One could also imagine, along the same lines, workers sending the results directly to the end client as soon as they have them, via WebSocket or some similar technology (or even an HTTP result-set page available only once), and then storing them quietly afterwards, or handing them to another server dedicated to saving them via Redis.
Sounds like something. If the result set is large enough (above a certain threshold) the result will be streamed once to a web frontend. If the client is unavailable the result is immediately discarded. However:
- What is this 'certain threshold'? ORES wants to keep the results forever (T188564#4047735), but some people just do SELECT * FROM page for 'experimentation'.
- How to allow downloading-data-in-a-different-format if the results are not saved? Send the data back from client to server? Move all the file generation code to JS and run on client? Or don't even show the data, just tell the client that the data is ready and give the download options?
What I had in mind:
The idea would be to have the results sent to two places asynchronously: first to the end client, then, whenever possible, to permanent storage.
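For illustration, a rough sketch of that two-destination fan-out in a worker, with hypothetical stream_to_client() and persist_batch() callables standing in for whatever transport and storage would actually be used:

import queue
import threading

def run_and_fan_out(cursor, stream_to_client, persist_batch, batch_size=10000):
    # Rows go to the client immediately; persistence happens on its own
    # thread so a slow NFS write no longer blocks the response to the user.
    pending = queue.Queue()

    def persister():
        while True:
            batch = pending.get()
            if batch is None:
                break
            persist_batch(batch)

    writer = threading.Thread(target=persister)
    writer.start()
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        stream_to_client(batch)
        pending.put(batch)
    pending.put(None)
    writer.join()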
Sounds like something. If the result set is large enough (above a certain threshold) the result will be streamed once to a web frontend. If the client is unavailable the result is immediately discarded. However:
What is this 'certain threshold'? ORES wants to keep the results forever (T188564#4047735), but some people just do SELECT * FROM page for 'experimentation'.
The same procedure for all requests, with no differences, so no "certain threshold"?
How to allow downloading-data-in-a-different-format if the results are not saved? Send the data back from client to server? Move all the file generation code to JS and run on client? Or don't even show the data, just tell the client that the data is ready and give the download options?
Good catch. I see these possible implementations:
- as you said move all the file generation code to JS and run on client (is it possible to handle big exports?)
- ask the user the requested format before starting the querying task
- only offer a preview first (json), and ask the user to wait a few more minutes to be able to export it in other formats
The idea would be to have the results sent to two places asynchronously: first to the end client, then, whenever possible, to permanent storage.
LGTM. We just need to make sure redis does not run out of memory. So, certain threshold needed.
The same procedure for all requests, with no differences, so no "certain threshold"?
We don't want to keep SELECT * FROM page forever, do we?
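For illustration, what a size-threshold guard in front of Redis might look like (the cap, TTL, and key scheme are all made up):

import json
import redis

REDIS_MAX_BYTES = 16 * 1024 * 1024  # made-up per-result-set cap
REDIS_TTL = 3600                    # made-up expiry, so nothing lives in memory forever

r = redis.Redis(host="localhost")   # hypothetical Redis instance

def cache_resultset(run_id, rows):
    # Only park small result sets in Redis; anything larger must go to the
    # permanent store directly.
    payload = json.dumps(rows).encode("utf-8")
    if len(payload) > REDIS_MAX_BYTES:
        return False
    r.setex("quarry:resultset:%d" % run_id, REDIS_TTL, payload)
    return True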
- as you said move all the file generation code to JS and run on client (is it possible to handle big exports?)
Possible, but for big exports it can crash the browser. You can't make a 'generator' and write the data from the 'generator' into a file; you have to construct all the data into a blob and then let the browser download the blob.
- ask the user the requested format before starting the querying task
With the current model, we don't know how large the results will be until the query is executed, and I don't imagine people would like an extra step of specifying the format before executing it if they don't need to download the results, only display them.
Though, I guess we could separate execute into two buttons, something like 'run and display' and 'run and download', but idk how to make the download-after-display menu make sense.
- only offer a preview first (json), and ask the user to wait a few more minutes to be able to export it in other formats
Hmm, so the results are saved anyhow?