Set up Hive aggregation table and populate with sample data
Closed, Resolved (Public)

Description

To develop the Superset dashboards, we need some scraper data, ideally several months of it. To unblock that work, we will do the following:

Outcome

I ran the aggregation script three times to create "mock" data with three datapoints. The data can now be accessed from Superset using the presto_analytics_iceberg database and the wmde_fisch schema.

e.g.

SELECT *
FROM "wmde_fisch"."wiki_page_cite_references_monthly"

Event Timeline

awight updated the task description.
WMDE-Fisch subscribed.

Trying to use the first outputs in my personal DB

This works quite well: we can play with the data from the local runs in our local DBs. You just need to make sure to pick presto_analytics_iceberg as the root DB source. That's where the data lands at the moment.
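
Since each of us writes to our own schema, a small helper for generating the preview query per schema might save some copy-pasting. This is only a sketch: build_preview_query and quote_ident are hypothetical names, the LIMIT is my addition, and the table name is simply taken from the example query in the description. The resulting string would still need to be pasted into Superset with presto_analytics_iceberg selected.

```python
# Hypothetical helper: builds a preview query for a personal schema.
# The table name comes from the example in the task description.
TABLE = "wiki_page_cite_references_monthly"


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_preview_query(schema: str, table: str = TABLE, limit: int = 10) -> str:
    """Return a SELECT over schema.table, capped at `limit` rows."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT *\n"
        f"FROM {quote_ident(schema)}.{quote_ident(table)}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Assumes the data from the local runs landed in the wmde_fisch schema.
    print(build_preview_query("wmde_fisch"))
```

Swapping "wmde_fisch" for another personal schema name should be enough to look at someone else's local runs, assuming the table name stays the same.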

WMDE-Fisch updated the task description.
awight claimed this task.