
Disable Hive querying in Superset
Closed, Resolved | Public | 3 Estimated Story Points

Description

Because most Hive queries would time out given the way Superset works, Superset should be limited to querying fast stores like Druid or Presto.

Event Timeline

Milimetric added a subscriber: Ottomata.

Change 514487 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Remove Analytics Hadoop proxy config for superset

https://gerrit.wikimedia.org/r/514487

Change 514487 merged by Elukey:
[operations/puppet@production] Remove Analytics Hadoop proxy config for superset

https://gerrit.wikimedia.org/r/514487

The Hadoop masters have been rebooted, so the change is now in effect. Since proper auth is not in place yet, Superset needs to be cleaned up manually. I tried to delete the Hive database, but I got:

Cannot delete a database that has tables attached. Here's the list of associated tables: event.mediawiki_revision_create, wmf.Joseph Allemandou-Hive-Untitled Query-HkfPQgfmP7, event.Andrew Otto-Hive-Untitled Query-HyZCFZmNO7, wmf.edit_hourly

The only ones that I was able to remove were event.mediawiki_revision_create and wmf.edit_hourly, but the other ones are not listed anywhere.

Found a way: removed the tables and finally the Hive database from Superset.

elukey set the point value for this task to 3. Jul 1 2019, 10:25 AM
elukey moved this task from Next Up to Done on the Analytics-Kanban board.

Mentioned in SAL (#wikimedia-analytics) [2019-07-01T10:26:11Z] <elukey> removed Hive tables and Database from Superset - T223919