
Upgrade Superset to 1.3.1 or higher
Closed, Resolved · Public

Description

We're currently on 1.0.

An incentive to upgrade, from @nettrom_WMF (beyond general enhancements and security fixes):

It looks like a newer version of Superset allows moving/deleting/configuring rows and charts. It's currently not intuitive how to delete an empty row in a dashboard (workaround: add a markdown cell, then delete it).

Event Timeline

razzi renamed this task from Upgrade Superset to 1.2 to Upgrade Superset to 1.3. Sep 9 2021, 4:08 PM
razzi moved this task from Next Up to In Progress on the Analytics-Kanban board.

Superset 1.3 is deployed to staging. Test it out via ssh -NL 8080:an-tool1005.eqiad.wmnet:80 an-tool1005.eqiad.wmnet and then access http://localhost:8080/superset/welcome/ (or just localhost:8080, which will redirect).
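
For convenience, the same steps as a copy-pasteable snippet (nothing new here, just the command above with the flags spelled out):

# Forward local port 8080 to port 80 on the staging host
# (-N: do not run a remote command, -L: set up a local port forward).
ssh -NL 8080:an-tool1005.eqiad.wmnet:80 an-tool1005.eqiad.wmnet

# Then open http://localhost:8080/ in a browser (it redirects to /superset/welcome/).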

Comment here with any issues you find! Once we're confident everything is stable we'll release to production!

I tried a quick check and noticed that Presto dashboards were not working (due to the Kerberos settings still mentioning an-tool1010). I tried to update the settings, but when saving (under other options) I got: An error occurred while fetching databases: "Connection failed, please check your connection settings".

I checked https://github.com/apache/superset/blob/master/UPDATING.md but didn't find anything relevant.

https://stackoverflow.com/questions/69043717/apache-superset-1-3-error-connecting-to-presto-over-sasl looks related, but I haven't checked it in depth yet!

I have tried to create a presto_elukey database entry for Presto, adding the following (note the an-tool1005 value instead of an-tool1010):

{"connect_args":{"KerberosConfigPath":"/etc/krb5.conf","KerberosKeytabPath":"/etc/security/keytabs/superset/superset.keytab","KerberosPrincipal":"superset/an-tool1005.eqiad.wmnet@WIKIMEDIA","KerberosRemoteServiceName":"presto","requests_kwargs":{"verify":"/etc/ssl/certs/Puppet_Internal_CA.pem"}}}

With this I was able to get it working (namely, the connection test seems to work fine).

I suspect that there is an issue in Superset when trying to update settings that are already broken, though I'm not sure why.
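
As a side note, a quick sanity check for the Kerberos side would be to confirm that the keytab and principal from the settings above can actually authenticate. A hedged sketch, assuming kinit/klist are installed on an-tool1005 and that you can sudo as the superset user:

# Obtain a ticket as the Superset service principal from its keytab, then list it.
sudo -u superset kinit -kt /etc/security/keytabs/superset/superset.keytab superset/an-tool1005.eqiad.wmnet@WIKIMEDIA
sudo -u superset klist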

Interesting, @elukey; perhaps it's a regression in 1.3. I never ran into that error in any prior version. Thanks for troubleshooting and implementing that workaround!

I tried to do more tests on this: in my test Presto database settings I didn't tick Security -> Impersonate users (as is currently set for presto_analytics_hive). If I do tick it, I get the pop-up error and this in the logs:

Sep 20 07:51:23 an-tool1005 superset[16983]: Unable to load dialect <class 'sqlalchemy.dialects.mssql.adodbapi.MSDialect_adodbapi'>: type object 'MSDialect_adodbapi' has no attribute 'dbapi'
Sep 20 07:51:23 an-tool1005 superset[16983]: 2021-09-20 07:51:23,338:WARNING:superset.db_engine_specs:Unable to load dialect <class 'sqlalchemy.dialects.mssql.adodbapi.MSDialect_adodbapi'>: type object 'MSDialect_adodbapi' has no attribute 'dbapi'
Sep 20 07:51:23 an-tool1005 superset[16983]: Unable to load SQLAlchemy dialect: <module 'sqlalchemy.dialects.sybase' from '/srv/deployment/analytics/superset/venv/lib/python3.7/site-packages/sqlalchemy/dialects/sybase/__init__.py'>
Sep 20 07:51:23 an-tool1005 superset[16983]: 2021-09-20 07:51:23,353:WARNING:superset.db_engine_specs:Unable to load SQLAlchemy dialect: <module 'sqlalchemy.dialects.sybase' from '/srv/deployment/analytics/superset/venv/lib/python3.7/site-packages/sqlalchemy/dialects/sybase/__init__.py'>

Definitely some more work to be done here. For any "end users" seeing this ticket: we're aware most things aren't working and we'll be troubleshooting, so there's no point in checking your own charts / dashboards for the time being.

I have upgraded the staging Superset to 1.3.1.

razzi renamed this task from Upgrade Superset to 1.3 to Upgrade Superset to 1.3.1 or higher. Sep 29 2021, 5:17 PM

I've been looking into this too and I've come across something that might help, or it might be that I'm simply confused by it...

Firstly, we can see from the installation docs that the database connectors for Presto, Druid, etc. aren't shipped with Superset itself; they are additional Python modules that must be installed.

I see the CR that you have for the analytics/superset/deploy repo, upgrading it to version 1.3.1.
It includes lots of binary wheels in the /artifacts subdirectory, which I would expect.

It also includes an update to the frozen-requirements.txt file, which specifically mentions upgrading these required database dependencies:

apache-superset[presto,mysql,druid]==1.3.1

The build_wheels.sh script seems to say that the wheels should be built from this file:

# Install all dependencies into artifacts/$dist.  This should only be run
# on a build server.
{...snip...}
requirements_file=${1:-frozen-requirements.txt}
{...snip...}
$build_venv/bin/pip wheel --trusted-host pypi.org --trusted-host files.pythonhosted.org -w $wheels_dir -r $deploy_dir/$requirements_file
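
So on a build server the invocation would presumably be just the following (my reading of the requirements_file default above; not a documented command):

./build_wheels.sh                           # defaults to frozen-requirements.txt
./build_wheels.sh frozen-requirements.txt   # equivalent, passing the file explicitly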

However, I can't see the binary packages for these database drivers being created in the artifacts directory.
In fact, I can't find these packages anywhere on tool1005 (staging superset), but confusingly I can't find them on tool1010 (production superset) either. Maybe I'm looking in the wrong place.

To make matters more confusing, I can see some possibly outdated information in create_virtualenv.sh:

Until Wikimedia started using a fork, we used -r frozen-requirements.txt
to install superset and other requirements. However, now that
frozen-requirements.txt is pointing at the wikimedia github fork url,
pip install with it will not be able to find superset in the wheels dir.
Instead, just install all wheels in the wheels dir.

Are we still using our fork of Superset, or have we reverted to upstream?

Apologies if I've misunderstood anything about the deployment process or our history with Superset.

The wheels are stored under:

elukey@an-tool1005:~$ ls /srv/deployment/analytics/superset/deploy/artifacts/buster/
aiohttp-3.7.4.post0-cp37-cp37m-manylinux2014_x86_64.whl
alembic-1.7.3-py3-none-any.whl
amqp-2.6.1-py2.py3-none-any.whl
apache_superset-1.3.1-py3-none-any.whl
apispec-3.3.2-py2.py3-none-any.whl
async_timeout-3.0.1-py3-none-any.whl
attrs-21.2.0-py2.py3-none-any.whl
Babel-2.9.1-py2.py3-none-any.whl
backoff-1.11.1-py2.py3-none-any.whl
[..]

Are those the ones that you were looking for? If so, that dir is the one that scap targets, and the venv gets created in /srv/deployment/analytics/superset/venv/.

Edit: after re-reading I see that you mentioned "database drivers"; sorry, I missed that. Can you give more info about those?

The comment is indeed outdated. We used to have a fork simply because we wanted to backport patches on top of the latest version to test, report upstream, etc. Nowadays I don't think we do that anymore, so the comment surely needs to be updated.

Thanks @elukey - Yes, that's where I'm looking too.
What I'm searching for are the specific database drivers mentioned in the frozen-requirements.txt file. Namely:

apache-superset[presto]==1.3.1
apache-superset[mysql]==1.3.1
apache-superset[druid]==1.3.1

I'm just wondering if our deployment process has somehow managed to skip updating these drivers along with the core of Superset.

The presto,mysql,druid extras should only install the related dependencies, namely PyHive (for Hive/Presto), PyDruid and a MySQL driver. IIUC there shouldn't be any wheel corresponding to apache-superset[presto] etc.
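
So, assuming the wheel names simply follow the PyPI package names (my assumption, not something verified on the host), the drivers should show up in the artifacts dir under their own names rather than as apache-superset-<extra> wheels:

# The extras only pull in the driver packages themselves, e.g.:
pip install 'apache-superset[presto,mysql,druid]==1.3.1'   # resolves to PyHive, pydruid, a MySQL client, ...

# ...so the corresponding wheels would be named after those packages:
ls /srv/deployment/analytics/superset/deploy/artifacts/buster/ | grep -iE 'pyhive|pydruid|mysql'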

Oh I see. I was mis-reading this table then: https://github.com/apache/superset/blob/master/docs/installation.rst#database-dependencies

I thought it was suggesting that there was a PyPI module per database connector, including one for presto.

Screenshot from 2021-09-30 13-49-11.png (1×923 px, 185 KB)

Oh well, I still can't explain it then.

razzi changed the task status from Open to In Progress. Oct 18 2021, 10:37 PM
razzi added a project: User-razzi.
razzi moved this task from In Code Review to In Progress on the Analytics-Kanban board.

Hey @razzi, what if we enable user impersonation with a SQL statement, rather than by ticking the box?

As in...

UPDATE dbs SET impersonate_user='1' WHERE database_name="Presto-Razzi";

MariaDB [superset_staging]> select * from dbs where database_name="Presto-Razzi"\G;
*************************** 1. row ***************************
                       created_on: 2021-10-19 15:31:42
                       changed_on: 2021-10-19 15:31:42
                               id: 18
                    database_name: Presto-Razzi
                   sqlalchemy_uri: presto://an-coord1001.eqiad.wmnet:8281/analytics_hive?protocol=https
                    created_by_fk: 368
                    changed_by_fk: 368
                         password: NULL
                    cache_timeout: NULL
                            extra: {"metadata_params":{},"engine_params":{"connect_args":{"KerberosConfigPath":"/etc/krb5.conf","KerberosKeytabPath":"/etc/security/keytabs/superset/superset.keytab","KerberosPrincipal":"superset/an-tool1005.eqiad.wmnet@WIKIMEDIA","KerberosRemoteServiceName":"presto","requests_kwargs":{"verify":"/etc/ssl/certs/Puppet_Internal_CA.pem"}}},"schemas_allowed_for_csv_upload":[]}
        select_as_create_table_as: 0
                       allow_ctas: 0
                 expose_in_sqllab: 1
                force_ctas_schema: NULL
                  allow_run_async: 0
                        allow_dml: 0
                     verbose_name: NULL
                 impersonate_user: 0
allow_multi_schema_metadata_fetch: 0
                 allow_csv_upload: 0
                  encrypted_extra: NULL
                      server_cert: NULL
                       allow_cvas: 0
                             uuid: W��|0�G��DX6�Ab�
             configuration_method: sqlalchemy_form
1 row in set (0.001 sec)

ERROR: No query specified

I got it working; it's a simple move in the configuration panel for the Presto database. I moved the "Other" additional settings (listed below) to the "Security -> Secure extra" settings, and I was then able to save the settings (with impersonation) and to query Presto data from Superset.

{"connect_args":
  {"KerberosConfigPath":"/etc/krb5.conf",
   "KerberosKeytabPath":"/etc/security/keytabs/superset/superset.keytab",
   "KerberosPrincipal":"superset/an-tool1005.eqiad.wmnet@WIKIMEDIA",
   "KerberosRemoteServiceName":"presto",
   "requests_kwargs":
     {"verify":"/etc/ssl/certs/Puppet_Internal_CA.pem"}}}

Upstream must have changed the way security features are specified in the UI, which requires a manual action after the upgrade. Please test Superset staging and let me know if it works; in my opinion the blocker should be resolved.

As a follow-up I'd check random dashboards and play with SQL Lab before deciding to upgrade; there may be more surprises waiting to show up :D

Sounds good @elukey. We on Data Engineering will test for the next week, then announce a "release candidate" when we sync with Product Analytics on Oct 27; if a week passes and they don't find any blockers, we'll release!

I've checked Superset again a few more times and not run into any issues.

I haven't encountered any issues and nobody has reported any, so after one last pass through the staging instance, I'll roll the update out to production today!

The upgrade finished successfully; there was about 30 minutes of downtime because an unexpected database error blocked the migration. I put the ssh session in my notes here: https://wikitech.wikimedia.org/wiki/User:Razzi/Superset_1.3.1_upgrade_recap and I intend to incorporate the learnings into the administration notes.

razzi moved this task from Q2 2021/2022 to Done on the Analytics-Clusters board.
razzi moved this task from Ready to Deploy to Done on the Data-Engineering-Kanban board.

@razzi in the Wikitech link there seems to be no final db upgrade after restoring the Superset db dump; did it work in the end?

From some quick research (see https://mariadb.com/kb/en/troubleshooting-row-size-too-large-errors-with-innodb/) it seems we got a similar error during the 1.0 migration; we solved it using the ROW_FORMAT=DYNAMIC trick on the affected table (T272390#6842923).
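
For reference, that trick boils down to switching the affected table to the DYNAMIC row format, roughly like the following on the database host (the table name here is only a placeholder, and this assumes root socket auth for mysql):

# 'logs' stands in for whichever table hits the InnoDB row-size limit.
sudo mysql superset_production -e "ALTER TABLE logs ROW_FORMAT=DYNAMIC;"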

@elukey everything is working! I'm not sure what you're referring to with "final db upgrade"; there is a superset db migrate followed by a superset init, which as I understand it is sufficient. Am I missing something?

We did run into an issue where Alpha role users couldn't use SQL Lab, which we fixed by copying over the one permission in the sql_lab role that wasn't in Alpha: "can sql json on superset". As far as I'm aware there are no outstanding issues, so I'm going to go ahead and close this; any follow-ups can be their own tickets.

I was double checking https://wikitech.wikimedia.org/wiki/User:Razzi/Superset_1.3.1_upgrade_recap and at the end (right after the superset_production db restore) there is no db migrate + init (the only occurrences of those commands were mentioned as failed attempts).

It was not clear what the resolution step was; did the db restore circumvent the SQL error?

Anyway, all good :)

Ah yes, @elukey, that was an incomplete paste. I made the embarrassing mistake of running a command as superset when I meant to run it as my own user, so I truncated the paste prematurely :) After I realized my error I finished running the commands.

And yes, after the db restore the db migrate just worked.