
Update the metric descriptions of hive tables
Closed, Resolved (Public)

Description

We have written clearer descriptions for each of the data columns and partition keys in the "Labels for hive tables" tab of this workbook.

https://docs.google.com/spreadsheets/d/1F8KTv6qwLmE7OhvoXRnFqgJ4A4QiBQtPttI5TzJrgGI/edit#gid=1789320655

These descriptions should be applied as updates to the versions of the tables currently in Hive.
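One way to apply the descriptions is to render a Hive `ALTER TABLE ... CHANGE COLUMN ... COMMENT` statement per column. The following is a minimal sketch only, not the method actually used for this task: the helper `column_comment_ddl`, the table `example_db.example_table`, the column `example_metric`, its `bigint` type, and the comment text are all hypothetical placeholders. Hive's `CHANGE COLUMN` form requires the column type to be restated, so the real type of each column would have to be looked up first.

```python
def column_comment_ddl(table: str, column: str, col_type: str, comment: str) -> str:
    """Render a Hive ALTER TABLE statement that sets one column's comment.

    The column keeps its name and type; only the comment changes.
    """
    # Escape backslashes first, then single quotes, for a Hive string literal.
    escaped = comment.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"ALTER TABLE {table} CHANGE COLUMN `{column}` `{column}` "
        f"{col_type} COMMENT '{escaped}';"
    )


# Hypothetical row; real rows would come from the workbook tab.
rows = [
    ("example_db.example_table", "example_metric", "bigint",
     "Editors' monthly total"),
]

statements = [column_comment_ddl(*row) for row in rows]
for statement in statements:
    print(statement)
```

This form only applies to regular data columns; partition columns such as year are rejected, as the error messages further down show.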

Event Timeline

JAnstee_WMF changed the task status from Open to In Progress. Mar 21 2023, 4:59 PM
JAnstee_WMF triaged this task as High priority.
JAnstee_WMF moved this task from Backlog to In Development on the Equity-Landscape board.

The comment on the year partition column needs to be 'Partition year in YYYY format', or something similar, in all the tables for consistency, as it has nothing to do with the other columns in the table.

I’ll skip country_meta_data when it comes to country_name and country_code as these don’t exist and their existing equivalents have adequate descriptions.

geography_level doesn’t exist either.

Updating the comment on a partition column requires us to drop and recreate the table, as the alternative methods are a bit cumbersome.
Most of the columns have been updated. The exceptions are the rank_* tables, as those are not tables we will be using in production, and the columns listed below.

Column geography_level does not exist in table ntsako.country_meta_data
Column un_subcontinent_description does not exist in table ntsako.country_meta_data
Column un_continent_description does not exist in table ntsako.country_meta_data
Column monthly bin does not exist in table ntsako.geoeditor_input_metrics_pivot
Cannot alter partition column year in table ntsako.geoeditor_input_metrics_pivot
Column YoY change does not exist in table ntsako.geoeditor_input_metrics_pivot
Column percent_active_editors does not exist in table ntsako.geoeditor_input_metrics_pivot
Column unique_devices_annual_signal does not exist in table ntsako.georeadership_input_metrics
Column pageviews_annual_signal does not exist in table ntsako.georeadership_input_metrics
Column pageviews_annual_change does not exist in table ntsako.georeadership_input_metrics
Column unique_devices_annual_change does not exist in table ntsako.georeadership_input_metrics
Column sum_historical_grants does not exist in table ntsako.grants_leadership_input_metrics
Column affiliate_size_annual_change does not exist in table ntsako.affiliate_leadership_input_metrics
Column affiliate_percent_annual_grants does not exist in table ntsako.grants_leadership_input_metrics
Column affiliate_percent_historical_grants does not exist in table ntsako.grants_leadership_input_metrics
Cannot alter partition column year in table ntsako.population_leadership_input_metrics
Column primary_country_code does not exist in table ntsako.affiliate_data_input_metrics
Cannot alter partition column year in table ntsako.affiliate_data_input_metrics
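To follow up on these failures, it can help to group them by table and by kind (missing column versus partition column). The following is a rough sketch that assumes the messages always have exactly the two shapes shown above; the function name `group_failures` and the regular expressions are illustrative, and the sample lines are copied from the list.

```python
import re
from collections import defaultdict

# Patterns for the two message shapes seen in the list above.
MISSING = re.compile(
    r"^Column (?P<column>.+) does not exist in table (?P<table>\S+)$"
)
PARTITION = re.compile(
    r"^Cannot alter partition column (?P<column>\S+) in table (?P<table>\S+)$"
)


def group_failures(lines):
    """Return (missing, partition) dicts mapping table name to column names."""
    missing = defaultdict(list)
    partition = defaultdict(list)
    for line in lines:
        line = line.strip()
        match = MISSING.match(line)
        if match:
            missing[match["table"]].append(match["column"])
            continue
        match = PARTITION.match(line)
        if match:
            partition[match["table"]].append(match["column"])
    return dict(missing), dict(partition)


sample = [
    "Column monthly bin does not exist in table ntsako.geoeditor_input_metrics_pivot",
    "Cannot alter partition column year in table ntsako.geoeditor_input_metrics_pivot",
    "Column YoY change does not exist in table ntsako.geoeditor_input_metrics_pivot",
]

missing, partition = group_failures(sample)
print(missing)
print(partition)
```

Names such as "monthly bin" and "YoY change" contain spaces, which suggests they are spreadsheet labels rather than Hive column names, so the missing-column group is probably best reviewed against the workbook before retrying.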