Page MenuHomePhabricator

SQL query failed on superset SQL lab
Closed, ResolvedPublic

Description

Hi Expert,

I have some issue while I am using presto database at SQL lab on superset. My sql failed very often. It usually goes offline in about 20 seconds and eventually return below pink message. Here is a SQL example I tried. This query can run pass at presto terminal, but failed at SQL lab.

return message:

<!DOCTYPE html> <html lang="en"> <meta charset="utf-8"> <title>Wikimedia Error</title> <style> * { margin: 0; padding: 0; } body { background: #fff; font: 15px/1.6 sans-serif; color: #333; } .content { margin: 7% auto 0; padding: 2em 1em 1em; max-width: 640px; } .footer { clear: both; margin-top: 14%; border-top: 1px solid #e5e5e5; background: #f9f9f9; padding: 2em 0; font-size: 0.8em; text-align: center; } img { float: left; margin: 0 2em 2em 0; } a img { border: 0; } h1 { margin-top: 1em; font-size: 1.2em; } .content-text { overflow: hidden; overflow-wrap: break-word; word-wrap: break-word; -webkit-hyphens: auto; -moz-hyphens: auto; -ms-hyphens: auto; hyphens: auto; } p { margin: 0.7em 0 1em 0; } a { color: #0645ad; text-decoration: none; } a:hover { text-decoration: underline; } code { font-family: sans-serif; } .text-muted { color: #777; } </style> <div class="content" role="main"> <a href="https://www.wikimedia.org"><img src="https://www.wikimedia.org/static/images/wmf-logo.png" srcset="https://www.wikimedia.org/static/images/wmf-logo-2x.png 2x" alt="Wikimedia" width="135" height="101"> </a> <h1>Error</h1> <div class="content-text"> <p>Our servers are currently under maintenance or experiencing a technical problem. Please <a href="" title="Reload this page" onclick="window.location.reload(false); return false">try again</a> in a few&nbsp;minutes.</p> <p>See the error message at the bottom of this page for more&nbsp;information.</p> </div> </div> <div class="footer"><p>If you report this error to the Wikimedia System Administrators, please include the details below.</p><p class="text-muted"><code>Request from 71.204.128.245 via cp4031 frontend, Varnish XID 687742109<br>Error: 503, Backend fetch failed at Thu, 07 May 2020 23:33:57 GMT</code></p> </div> </html>

my SQL

SELECT substr(log_timestamp,1,6) AS month_time, COUNT(1) AS blocks 
FROM wmf_raw.mediawiki_logging WHERE snapshot ='2020-03' AND log_type IN ('block’) 
AND substr(log_timestamp,1,6) >= '202001' AND substr(log_timestamp,1,6) <= '202004’ 
GROUP BY substr(log_timestamp,1,6)

Event Timeline

Hi @jwang, this doesn't answer your SQL lab question, but could the event.mediawiki_user_blocks_change help you here?

presto> use analytics_hive.event;
USE
presto:event> show create table event.mediawiki_user_blocks_change;
                                                                        Create Table
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 CREATE TABLE analytics_hive.event.mediawiki_user_blocks_change (
    blocks row(account_create boolean, email boolean, expiry_dt varchar, name boolean, user_talk boolean),
    comment varchar,
    database varchar,
    meta row(domain varchar, dt varchar, id varchar, request_id varchar, schema_uri varchar, topic varchar, uri varchar, stream varchar),
    performer row(user_edit_count bigint, user_groups array(varchar), user_id bigint, user_is_bot boolean, user_registration_dt varchar, user_text varchar),
    user_groups array(varchar),
    user_text varchar,
    prior_state row(blocks row(account_create boolean, email boolean, expiry_dt varchar, name boolean, user_talk boolean)),
    user_id bigint,
    _schema varchar,
    datacenter varchar,
    year bigint,
    month bigint,
    day bigint,
    hour bigint
 )

It is unlikely that Presto will be able to query the wmf_raw.mediawiki_logging table given its size.

This comment was removed by Ottomata.

Some more info on querying mediawiki_* tables snaphsots in wmf_raw database from Presto:

  • They are stored in avro file format. This format is not columnar, therefore a lot less performant when querying in analytics fashion (gruop-by/aggregate). We could store the data in parquet to faciliate direct querying.
  • The data is partitioned by snapshot (every snapshot contains the full history) and also by wiki_db. If you're interested in a single wiki, using wiki_db = 'X' will help a lot in reducing the volume of data to be read.
  • Finally Presto is currently running on 5 nodes, meaning it has local-access to only a small amount of the data stored in HDFS. Presto therefore spends a lot of time downloading hdfs-blocks living on different hosts to do its computation. We plan on collocating presto on all hadoop-nodes, but it's not yet done.

could the event.mediawiki_user_blocks_change help you here?

@jwang and I've talked about using that table, but one of the issues is that it's not logging block parameters for partial blocks. I filed T252455 about that.

I tried this query in Presto and (with modifications) works but for any presto standard is quite slow (due to the factors that @JAllemandou has mentioned) so while it finally works it takes about 1-2 minutes and that is too long for a query that might power a UI dashboard. Limiting by wiki takes about 20 secs to run and that should be fast enough to be able to power a UI.

Jennifer, here's another little tidbit that shows looking over more data:

-- this will count all periods of time with active blocks, even if blocks didn't change
-- so it's definitely not the same thing you're looking for in the logging table, but should
-- be more data so is a good performance test.  If it looks interesting, maybe we can talk more
-- about the history tables.

 select wiki_db,
        substr(start_timestamp, 1, 7) as month_time,
        count(*) as periods_of_users_histories_with_active_blocks
   from wmf.mediawiki_user_history
  where snapshot='2020-04'
    and cardinality(user_blocks_historical) > 0
  group by wiki_db,
        substr(start_timestamp, 1, 7)
;

closed, please reopen if help is still needed