Remove goransm.wdcm_maintable from the Data Lake
Closed, Resolved, Public

Description

  • The new WDCM ETL procedures in PySpark work directly with the goransm.wdcm_clients_wb_entity_usage table, so
  • the wdcm_maintable table in the goransm database (HDFS, WMF Data Lake) is no longer needed for WDCM or any related system.

Remove the table completely.
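
A minimal sketch of how the removal could be scripted from PySpark, assuming the table is dropped through Spark SQL. The helper names `drop_table_statement` and `drop_table` are hypothetical and not part of the WDCM code base; `spark` is assumed to be an existing SparkSession with Hive support. Note that for an external Hive table, dropping it removes only the metadata, so the underlying HDFS files may need to be deleted separately.

```python
import re

# Accept plain identifiers only, so that nothing unexpected ends up in the SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def drop_table_statement(database: str, table: str, purge: bool = False) -> str:
    """Build a DROP TABLE statement for database.table (hypothetical helper)."""
    for name in (database, table):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    stmt = f"DROP TABLE IF EXISTS {database}.{table}"
    # PURGE asks Hive/Spark to skip the trash for managed tables.
    return stmt + " PURGE" if purge else stmt


def drop_table(spark, database: str, table: str, purge: bool = False) -> None:
    """Run the statement on an existing SparkSession (assumed to have Hive support)."""
    spark.sql(drop_table_statement(database, table, purge))


if __name__ == "__main__":
    # Print the statement only; running it requires a live SparkSession.
    print(drop_table_statement("goransm", "wdcm_maintable"))
```

Building the statement separately from running it makes it possible to review the exact SQL before it touches the Data Lake.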

Event Timeline

  • WDCM Geo Dashboard does not depend upon this table anymore.
  • The WDCM Biases Dashboard is the only remaining dashboard whose back-end relies on this Hive table.
  • As soon as the changes are implemented there, the table will be removed from HDFS.
  • wdcm_maintable has been removed from HDFS;
  • all WDCM dashboards now run on Apache Spark-supported update engines;
  • resolved.