Add column descriptions to existing tables in wmf_product
Closed, Resolved (Public)

Description

Although the CREATE queries for tables in wmf_product include column descriptions, some tables were not created with those queries. Instead, various team members created them in their personal databases and later moved them into wmf_product. As a result, those tables have no column documentation in the Hive Metastore.

An unfortunate side effect is that those fields have no descriptions in our data catalog MVP (e.g. editor_month; see its CREATE query).

We would need to alter those tables to add the missing column comments (see https://www.tutorialspoint.com/hive/hive_alter_table.htm).
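
For illustration, here is a minimal Python sketch that generates the kind of statement this would involve, assuming Hive's ALTER TABLE ... CHANGE COLUMN form with a COMMENT clause. The helper names, the column names, types, and descriptions are hypothetical placeholders, not the actual editor_month schema, and the backslash escaping of quotes is an assumption that should be checked against the Hive version in use.

```python
def escape_comment(text: str) -> str:
    """Escape backslashes and single quotes for a single-quoted string literal.

    Assumption: the target Hive version accepts backslash escapes here.
    """
    return text.replace("\\", "\\\\").replace("'", "\\'")


def build_comment_statements(database: str, table: str, columns: dict) -> list:
    """Return one ALTER TABLE ... CHANGE COLUMN statement per column.

    `columns` maps a column name to (existing column type, description).
    The column keeps its name and type; only the comment is added.
    """
    statements = []
    for name, (col_type, description) in columns.items():
        statements.append(
            f"ALTER TABLE {database}.{table} "
            f"CHANGE COLUMN `{name}` `{name}` {col_type} "
            f"COMMENT '{escape_comment(description)}';"
        )
    return statements


# Hypothetical columns, for illustration only.
example_columns = {
    "wiki": ("string", "Wiki database name"),
    "content_edits": ("bigint", "Number of edits to content namespaces"),
}

for statement in build_comment_statements("wmf_product", "editor_month", example_columns):
    print(statement)
```

The generated statements restate each column's current type, so the types passed in would need to match the existing table definition. Reviewing the output before running it against the Metastore seems prudent.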

Event Timeline

mpopov triaged this task as Low priority.
mpopov moved this task from Triage to Current Quarter on the Product-Analytics board.
Iflorez updated the task description.

wmf_product tables in Datahub with schema and documentation information filled in:

  • active_editors
  • commons_search_counts
  • content_interactions
  • editor_month
  • global_markets_pageviews
  • google_translate_pageviews
  • mediasearch_filter_change_aggregates
  • mediasearch_filters_per_session_aggregates
  • mediasearch_success_aggregates
  • mh_pageviews_corrected
  • new_editors
  • pageviews_corrected
  • ve_media_funnel_aggregates
  • wikipediapreview_stats
  • wikipedia_preview_stats_new

@Mayakp.wiki which documentation do you recommend using for mh_pageviews_corrected?

@Iflorez we could mention that this table and its metric are now retired, and that the table was formerly used to calculate pageviews from mobile-heavy wiki projects, listed here: https://github.com/wikimedia-research/Readers-movement-metrics/blob/main/queries/update_mobileheavy_pageviews_table.sql

Updated remaining documentation on Datahub.
Closing this task.