
NEW BUG REPORT remove mysql databases from SQLLab
Closed, Resolved | Public | BUG REPORT

Assigned To
Authored By
Mayakp.wiki
May 19 2023, 5:08 PM
Referenced Files
F41649561: image.png
Jan 3 2024, 11:56 AM
F37142595: image.png
Jul 18 2023, 10:20 AM
F37028668: image.png
May 22 2023, 9:38 AM
F37021427: Screenshot 2023-05-19 at 10.03.21 AM.png
May 19 2023, 5:08 PM

Description

Data Engineering Bug Report or Data Problem Form.

Please fill out the following

What kind of problem are you reporting?

  • Access related problem
  • Service related problem
  • Data related problem
For a data related problem:
  • Is this a data quality issue? NO
  • What datasets and/or dashboards are affected? SQLLab in Superset via Presto
  • What are the observed vs expected results? Please include information such as location of data, any initial assessments, sql statements, screenshots.

As discussed in the PA-DE sync on May 16, 2023, we were wondering whether it would be worth removing access to the following databases from the SQL Lab dropdown if they aren't being used anymore.

  • mysql_staging
  • mysql examples
  • mysql wikishared

Screenshot 2023-05-19 at 10.03.21 AM.png (300×411 px, 27 KB)

Note: The staging table is used a lot by folks on the Research Team, so it’s worth checking in with them. It might not be used in SQLLab, though.

For the DE Team to fill out
Which systems does this affect?
  • Hive
  • Druid
  • Superset
  • Turnilo
  • WikiDumps
  • Wikistats
  • Airflow
  • HDFS
  • Gobblin
  • Sqoop
  • Dashiki
  • DataHub
  • Spark
  • Jupyter
  • Modern Event Platform
  • Event Logging
  • Other
Impact Assessment:

Does this problem qualify as an incident?

  • Yes
  • No

Does this violate an SLO?

  • Yes
  • No
Value Calculator | Rank
Will this improve the efficiency of a team's workflow? | 1-3
Does this have an effect on our Core Metrics? | 1-3
Does this align with our strategic goals? | 1-3
Is this a blocker for another team? | 1-3

Event Timeline

Thanks for the suggestion @Mayakp.wiki

It looks like it's a simple job to remove access to these databases from SQLLab, whilst leaving them available for use in existing dashboards.
e.g. for mysql_staging

image.png (590×1 px, 101 KB)

"Note: The staging table is used a lot by folks on the Research Team, so it’s worth checking in with them. It might not be used in SQLLab, though."

This is the main unknown, as far as I am concerned. Does anyone use this functionality, on the research team or any other team?

I can reach out to some teams to find out if anyone would miss this functionality in SQL Lab, if it were to disappear.

Glad to know this is easy to do! :)
I guess a Slack chat (#working-with-data) or tagging the team on this task would be helpful to get buy-in.

image.png (406×1 px, 39 KB)

I've now updated the Superset configuration to exclude the wikishared and mysql_staging databases from SQL Lab, as requested.

If it turns out that people were using them and complain that they have disappeared, then I will re-enable them.
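
For anyone who wants to script this kind of toggle rather than use the database edit form, here is a minimal sketch. It is not what was done for this task (the screenshots suggest the change was made in the Superset UI). It assumes the Superset REST API is reachable and accepts a bearer token, and that the per-database setting is the `expose_in_sqllab` field. The base URL, database id, and token below are hypothetical placeholders, and some deployments may also require a CSRF token or a different authentication flow.

```python
import json
import urllib.request


def build_sqllab_toggle_request(base_url, database_id, access_token, expose):
    """Build (but do not send) a PUT request that sets expose_in_sqllab
    on a single Superset database connection."""
    payload = json.dumps({"expose_in_sqllab": expose}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/database/{database_id}",
        data=payload,
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values: replace with the real host, database id and token.
request = build_sqllab_toggle_request(
    "https://superset.example.org", 42, "TOKEN", expose=False
)

# Sending is left commented out so the sketch has no side effects:
# urllib.request.urlopen(request)
```

Passing `expose=True` builds the request that would re-enable SQL Lab access. In both cases only the SQL Lab exposure flag is changed, so existing dashboards that use the database should be unaffected.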

ty so much @BTullis for the quick turnaround on this! :)

BTullis claimed this task.

I have requested that the wikishared database be re-enabled so that I can set up a monitoring dashboard for The Wikipedia Library eligibility echo notifications, which are sent daily. More context is in T347060.

As per the request from @KCVelaga, I have re-enabled SQL Lab access for the wikishared database.

image.png (537×1 px, 62 KB)