Page MenuHomePhabricator

Request for SQL Templating to be enabled in Superset
Closed, ResolvedPublic

Description

It would be very useful to have SQL Templating enabled in Superset, as it would allow for faster chart generation in Presto-based dashboards and allow us to share personalized links to dashboards set to specific values of filters.

Event Timeline

Suppose we have a dashboard with a chart powered by Presto. For simplicity, imagine transformations have been taken care of. The underlying query is then:

SELECT * FROM tbl

Because Presto needs to query all of the data and then any subsetting (filtering, specific date range, etc.) has to happen after the fact on the retrieved results. This can lead to timeouts and slow performance in general.

From https://preset.io/blog/intro-jinja-templating-apache-superset/:

For example, to add a time range to a virtual dataset, you can write the following:

SELECT * 
FROM tbl 
WHERE dttm_col > '{{ from_dttm }}' AND dttm_col < '{{ to_dttm }}'

If I understand correctly, the subsetting happens at query time rather than post hoc on the entirety of the result set. That should speed things up!

Change 811766 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[operations/puppet@production] superset: Turn template processing back on

https://gerrit.wikimedia.org/r/811766

I would also love to see this come back, I have old dashboards that no longer work because templating was turned off. Looking things over, that was turned off for the 1.0 release here (without justification): https://github.com/apache/superset/pull/11172

Patch turns it back on. I would like to have seen what supersets justification was for turning it off, maybe there is something dangerous here? I suppose jinja is allowing users to achieve a minimal form of remote code execution on the superset instance, but in our systems that is plausibly acceptable. Jinja appears to make it difficult to run arbitrary python code, as far as i can tell any functions to be called have to be specially registered.

Thank you @EBernhardson!! And good find!


Also, https://apache.github.io/superset/sqllab.html has this as an example:

SELECT *
FROM some_table
WHERE partition_key = '{{ presto.latest_partition('some_table') }}'

and I don't know if presto.latest_partition() is something that would be available to us or if we'd have to implement it ourselves but either way enabling templating would allow us to have a really useful function (especially when working with mediawiki history!)

I suspect the presto.latest_partition function will come for free, here is a query that i'm hoping will start working again with templating enabled. Here the table name is parameterized (probably not necessary, but DRY's things up a little) and it sources filter variables from another chart in the dashboard. This used to work pre-superset 1.0.

SELECT norm_query, num_sessions, queries
FROM {{ queries_table }}
WHERE date='{{ presto.first_latest_partition(queries_table) }}'
  AND wiki IN ('{{ "','".join(filter_values('wiki', 'mediawikiwiki')) }}')
  AND cardinality(filter(queries, q -> q.num_sessions > 2)) > 1
  AND norm_query <> ''
ORDER BY num_sessions DESC
LIMIT {{ filter_values('limit', 500)[0] }}
BTullis subscribed.

I'm happy with this and since you've already done the work in writing the patch, I'll merge and test it today.

Change 811766 merged by Btullis:

[operations/puppet@production] superset: Turn template processing back on

https://gerrit.wikimedia.org/r/811766

OK, @EBernhardson, @mpopov - The patch has been applied and superset has been restarted with the feature flag enabled. It looks OK to me, but could you test the functionality please when it's convenient? Thanks.

@BTullis thanks! Loading my old dashboard it looks to be working same as it did before. Looks complete to me.

Can confirm templating works! Thank you @EBernhardson @BTullis !

BTullis triaged this task as Medium priority.
BTullis moved this task from Incoming (new tickets) to Visualize on the Data-Engineering board.
BTullis removed a project: Patch-For-Review.

Great! Glad it works as expected.