Page MenuHomePhabricator

enable on Async Queries via Celery on https://superset.wmcloud.org/
Closed, ResolvedPublic

Assigned To
None
Authored By
Stuartyeates
Jun 28 2023, 8:57 AM
Referenced Files
Restricted File
Aug 11 2023, 8:44 PM
F37121905: image.png
Jun 28 2023, 8:57 AM

Description

Currently long-running queries that run on Quarry such as https://quarry.wmcloud.org/query/74483 fail on superset with the message "DB engine Error 'query_id' This may be triggered by: Issue 1002 - The database returned an unexpected error. " (see attached screenshot)

I believe this can be fixed by enabling Async Queries via Celery as per https://superset.apache.org/docs/installation/async-queries-celery/ "On large analytic databases, it’s common to run queries that execute for minutes or hours. To enable support for long running queries that execute beyond the typical web request’s timeout (30-60 seconds), it is necessary to configure an asynchronous backend for Superset ..."

This also enables a "results backend that defines where the worker will persist the query results" which is a feature that has been asked for in T169452.

image.png (199×468 px, 13 KB)

Event Timeline

Some notes of varying degrees of being on topic. So far as caching goes, it isn't what many people are thinking when they think caching. One cannot really share the results of a previously run query. The cache is largely tied to the sqllab tab that ran the query. One could share the download link and anyone with access to the data could download the same to csv until the cache expires, but that is all. The intention, as I understand it, is that caching is just for managing the tab that is running a query.

As for async queries, I get the feeling (through lots of tinkering with caching) that it is enabled, so perhaps we need to up the cache timeout.

This seems to be resolved by using the postgres db in k8s rather than the trove db. I say seems to, because I can run a select sleep(700) when using the internal db, and get the error you describe when trying to use the external db. In the logs "lost connection to mysql" is mentioned when trying to use the external.

Looks like it runs your query in the testing environment:
{F37398759}

There are plans to move back to the db in k8s T342699, T343527 and T343526 which appears to resolve this issue as well.