NEW/CHANGE FEATURE REQUEST: Make Event Registration Tool's data available in Data Lake
Closed, Declined | Public | Feature

Description

Data Platform Request Form

Is this a request for a:

  • Dataset
  • Data Pipeline
  • Data Feature

Is this a change to something existing:

  • Yes - please provide details of existing datasets/data pipelines (wiki links, Git URL, names of jobs, etc)
  • No

If a new dataset, has this been through the essential metric review? (need link):

  • Yes
  • No

Please provide the description of your request:
Please make the Event Registration Tool's data, which is stored in x1, available in the Data Lake.

This request is similar to this ticket.

Use Case: (Please briefly explain what this feature will be used for):
Data scientists/analysts need to be able to join the Event Registration Tool's data onto the data in Data Lake in order to conduct analyses. The need for this has been growing and affects multiple data scientists/analysts.

Here's a ticket showing a specific use case, but this request applies more broadly to many use cases like this one.

For more context, here's the slack thread about the need for this.

Ideal Delivery Date:
No specific date, but the sooner the better.

Event Timeline

Since both Event Registration and Content Translation store their application data in x1, I wonder if one solution could solve both this and T382706: Data pipeline to load cx_translators to Data Lake, at wmf_product.

Also wanted to voice support for this because it would unblock some metric pipeline work that @Iflorez has started for T374491: Create ETL pipelines for campaigns-product baseline metrics

P.S. Yes, ideally Event Registration would be instrumented and would treat its data as a product, but that is unlikely to happen any time soon (if ever), and in the meantime there is evaluation/measurement work that is blocked or hindered.

Quick update: with the plans to make Event Registration available on more wikis, analytics for that is going to be really hard to do if it's not instrumented.

@kzimmerman is going to talk to Ilana about Campaigns-Product instrumenting Event Registration as a hypothesis under proposed WE 1.2 KR.

Therefore, the recommendation is to decline this request.

Going forward, we are going to be insistent on instrumentation if product teams need insights about their products, so that we are (1) not perpetuating the reliance on workarounds like sqoop, and (2) actively discouraging anti-patterns like doing analytics from application data directly.