
Replace or deprecate WMCS uses of report updater
Open, Needs Triage, Public

Description

Data Engineering is trying to deprecate report updater. We seem to have some use cases here:

https://gerrit.wikimedia.org/r/plugins/gitiles/analytics/reportupdater-queries/+/refs/heads/master/wmcs/

https://wmcs-edits.wmflabs.org/

I am not 100% sure that the latter makes use of report-updater; I'm hoping @bd808 or @srishakatux can shed some light.

Event Timeline

Restricted Application added a subscriber: Aklapper.

The Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/ and replace it with a more specific project tag to this task. Thanks!

@Milimetric, do you know the answer to this question?

If I understand correctly, folks are basically wondering if the https://analytics.wikimedia.org/published/datasets/periodic/reports/metrics/wmcs/ data comes from https://gerrit.wikimedia.org/r/plugins/gitiles/analytics/reportupdater-queries/+/refs/heads/master/wmcs/ or somewhere else. I picked you to poke because you've helped fix the pipeline when it has become broken before (T252915, T310317). I'm pretty sure the data used to be collected via reportupdater, but I have no idea how things work today.

Thanks @bd808. I wasn’t aware that was the output, and based on those recent-ish tickets I am confident that this is still being used and generated by ReportUpdater.

We will migrate the job to Airflow, our new batch data pipeline tool.