
Enable notifications for completion of Hive table snapshots
Open, Medium, Public

Description


Background/Goal

We use data from certain Hive tables for regular metrics reporting. It would be useful to be notified when new snapshots are ready, so that we know (1) whether our internal jobs that rely on the latest snapshots ran successfully, and (2) whether we can calculate the metrics manually or analyze the data in Jupyter notebooks.

KR/Hypothesis (Initiative)

This could fall under the hypothesis SDS2.6.1 DATA QUALITY:
"If we provide a consistent way to collect specific data quality metrics on a dataset in the Data Lake, and we implement it for key datasets like web requests and Mediawiki history, data producers can view the output in dashboards after each run of the pipeline and receive alerts."

I'll leave it to the team to change this if required.

Success metrics

  • How we will measure success

Example areas:

  • Deadlines
  • User satisfaction
  • Performance
  • Accessibility
  • Maintenance
  • Movement impact
  • Scalability
  • Data Quality
  • Integration
  • Compliance

In scope

Out of Scope

  • known boundaries

Artifacts & Resources

Link to diagrams
Link to specifications, architecture and design docs
Link to product one pagers

Event Timeline

Mayakp.wiki renamed this task from Enable notifications to Enable notifications for completion of Hive table snapshots. Feb 13 2024, 7:50 PM
Mayakp.wiki triaged this task as Medium priority.
Mayakp.wiki added a project: Movement-Insights.

Created a subtask for the short-term work. We will keep the parent ticket for the longer-term work.