
Set up automated email to report completion of mediawiki_history snapshot and Druid loading
Open, Normal · Public · 3 Story Points

Description

I can't fully compute our monthly board metrics without the mediawiki_history snapshot and, for our content metrics, the relevant data loaded into Druid so it's available in the Analytics Query Service.

We have only a couple of days between when this usually happens and the due date for the metrics, so knowing as soon as possible once it does would be very helpful.

Event Timeline

Restricted Application changed the subtype of this task from "Deadline" to "Task". Oct 13 2018, 1:50 AM
mpopov moved this task from Triage to Backlog on the Product-Analytics board. Oct 26 2018, 6:48 PM
Neil_P._Quinn_WMF added a subscriber: Milimetric.

@Milimetric, I remember you offering to set this up for me a while ago. Any chance I could take you up on that now? 😁

Sure, no problem. It's probably a good idea to do it together. So, options:

A) set up an oozie job to email you whenever the datasets you need are done
B) set up an oozie job to do the processing of the monthly board metrics

If you want to do A), I can get it started for you, and show you how to test it, then you can test it and we can merge and deploy it
If you want to do B), I would do the same basic setup that I would do for A, but then we should do a hangout and set up the processing step together. And we can do all the rest like testing and stuff together since that'll need more than simple fake data in this case.

Let me know.

Nuria added a subscriber: Nuria. Oct 26 2018, 11:14 PM

Let's see, there are 2 things here: when data is available in a snapshot and when data is available publicly in AQS. Data requires a manual deploy to be available in AQS, and other than subscribing yourself to AQS puppet commits I cannot think of an easy way to know of that deployment right away.

Now, any metric calculations can be done on top of the snapshot data, so you really do not need to wait for data to be available in AQS to calculate your metrics; the AQS step will always lag behind the data being available in the snapshot. Why don't you set up the board metrics (@Milimetric's suggestion B) as a job that depends on the snapshot work (and reduce) being completed?

If we want an email to be sent once data is available on the cluster, no need to create a new oozie job. Adding a step sending the email in the currently existing job is easier I think.

Milimetric raised the priority of this task from Normal to High.
Milimetric lowered the priority of this task from High to Low. Jan 7 2019, 5:15 PM

ping @Neil_P._Quinn_WMF, setting to low priority for now

Neil_P._Quinn_WMF added a comment. Edited Jan 9 2019, 8:10 PM

Sure, no problem. It's probably a good idea to do it together. So, options:

A) set up an oozie job to email you whenever the datasets you need are done
B) set up an oozie job to do the processing of the monthly board metrics

If you want to do A), I can get it started for you, and show you how to test it, then you can test it and we can merge and deploy it
If you want to do B), I would do the same basic setup that I would do for A, but then we should do a hangout and set up the processing step together. And we can do all the rest like testing and stuff together since that'll need more than simple fake data in this case.

Sorry about the delay! 😛 I would just like to do A. B might be worthwhile, but it's a much bigger task that we can consider separately.

If we want an email to be sent once data is available on the cluster, no need to create a new oozie job. Adding a step sending the email in the currently existing job is easier I think.

Yeah, that sounds perfect for us: we just want an email to product-analytics@wikimedia.org when the data is available in the Data Lake. I did ask for an email when it's publicly available in AQS too, but that's a lower priority :)

Nuria raised the priority of this task from Low to Normal. Jan 11 2019, 1:02 PM
Nuria assigned this task to fdans.

Ok, moving to kanban and assigning to fdans as background work. An e-mail will be sent to product-analytics@wikimedia.org when data is available.

Change 484657 had a related patch set uploaded (by Fdans; owner: Fdans):
[analytics/refinery@master] [wip] Change email send workflow to notify of completed jobs

https://gerrit.wikimedia.org/r/484657

Nuria moved this task from In Code Review to Done on the Analytics-Kanban board. Feb 12 2019, 4:03 PM
Nuria moved this task from Done to In Code Review on the Analytics-Kanban board.

Change 484657 merged by Joal:
[analytics/refinery@master] Change email send workflow to notify of completed jobs

https://gerrit.wikimedia.org/r/484657

This change should get deployed by next week. Jobs need to be restarted, and once that is done the Product Analytics team should be notified when the February snapshot becomes available (expected in the 1st week of March).

fdans moved this task from Ready to Deploy to Done on the Analytics-Kanban board. Feb 20 2019, 6:31 PM
Nuria set the point value for this task to 3. Feb 25 2019, 10:45 PM
Nuria closed this task as Resolved.
Neil_P._Quinn_WMF reopened this task as Open. Mar 11 2019, 5:42 PM

This change should get deployed by next week. Jobs need to be restarted, and once that is done the Product Analytics team should be notified when the February snapshot becomes available (expected in the 1st week of March).

The February snapshot is now available, but we didn't get an email. Looking at the patch, it seems like it doesn't actually include our email address (product-analytics@wikimedia.org) anywhere, but I could be missing something.

Nuria added a comment. Mar 13 2019, 9:35 PM

Looking at the patch, it seems like it doesn't actually include our email address (

That is correct: the patch is generic, and it is the job that should be configured with the e-mail address in question.

Ok, the workflow has the success e-mail: https://hue.wikimedia.org/oozie/list_oozie_coordinator/0131427-181112144035577-oozie-oozi-C/

But note that the correct e-mail is product-analytics@wikimedia.org, while the job was configured to send e-mail to productanalytics@wikimedia.org (no hyphen):

https://hue.wikimedia.org/oozie/list_oozie_workflow_action/0140257-181112144035577-oozie-oozi-W%40send_email/

<email xmlns="uri:oozie:email-action:0.1">
  <to>productanalytics@wikimedia.org</to>
  <subject>Job completed - Oozie Job mediawiki-history-check_denormalize-wf-2019-02</subject>
  <body>Job details:
  - Job id: mediawiki-history-check_denormalize-wf-2019-02
  - Hue link: https://hue.wikimedia.org/oozie/list_oozie_workflows

The job above has finished successfully and data is now available :)

  • Oozie</body>
</email>
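
A mismatch like this one could be caught before a job is (re)started by comparing the configured recipients against the intended ones. The following is only a minimal sketch in Python, not part of the refinery code: the function name `unexpected_recipients` and the constant `EXPECTED_RECIPIENTS` are hypothetical, and the sample XML is an abbreviated copy of the action shown above. It assumes the `<to>` element holds a comma-separated list of addresses.

```python
import xml.etree.ElementTree as ET

# Namespace used by the email action shown in the thread.
EMAIL_NS = {"e": "uri:oozie:email-action:0.1"}

# Hypothetical allow-list: the address the team actually asked for.
EXPECTED_RECIPIENTS = {"product-analytics@wikimedia.org"}


def unexpected_recipients(action_xml, expected=EXPECTED_RECIPIENTS):
    """Return configured <to> addresses that are not in the expected set."""
    root = ET.fromstring(action_xml)
    to_text = root.findtext("e:to", default="", namespaces=EMAIL_NS)
    # Assumption: <to> is a comma-separated list of addresses.
    configured = {addr.strip() for addr in to_text.split(",") if addr.strip()}
    return sorted(configured - set(expected))


# Abbreviated copy of the misconfigured action from the thread.
ACTION_XML = """<email xmlns="uri:oozie:email-action:0.1">
<to>productanalytics@wikimedia.org</to>
<subject>Job completed - Oozie Job mediawiki-history-check_denormalize-wf-2019-02</subject>
<body>The job above has finished successfully and data is now available :)</body>
</email>"""

if __name__ == "__main__":
    for address in unexpected_recipients(ACTION_XML):
        print(f"Unexpected recipient: {address}")
```

Run against the action above, the check flags the hyphen-less address; with the corrected address it returns an empty list.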

Nuria closed this task as Resolved. Mar 18 2019, 4:15 PM
Neil_P._Quinn_WMF reopened this task as Open. Apr 17 2019, 2:16 AM

We actually did not get an email announcing the completion of the March snapshot. When I look at the Oozie coordinator above, the March job has the status "error" rather than "succeeded". Any idea what's going on?

@Neil_P._Quinn_WMF: Indeed, the validation job failed for expected reasons (a higher than usual group-bot-removal, a dimension against which our validation is not very stable). We restarted the job manually with a higher threshold: https://hue.wikimedia.org/oozie/list_oozie_coordinator/0171576-181112144035577-oozie-oozi-C
In this job, the email was configured to be sent to multiple addresses, but it seems the send failed (I didn't receive it either): https://hue.wikimedia.org/oozie/list_oozie_workflow_action/0171577-181112144035577-oozie-oozi-W%40send_success_email/?coordinator_job_id=0171576-181112144035577-oozie-oozi-C
Let's double check again next month if the email gets sent.

Thanks, @JAllemandou! Sounds like a good plan.

Neil_P._Quinn_WMF added a comment. Edited Fri, May 3, 6:05 PM

We just got the email announcing the April snapshot—thank you!

One request: can we change the email so it's clearer to analysts not familiar with Data Lake internals? The current email is this:

SUBJECT: Job completed - Oozie Job mediawiki-history-check_denormalize-wf-2019-04

Job details:

The job above has finished successfully and data is now available :)
— Oozie

It would be clearer if it were something like this:

SUBJECT: mediawiki_history for 2019-04 now available

The Oozie job calculating and checking the 2019-04 mediawiki_history snapshot has finished successfully and the data is now available :)

Job details:

If this would be a big effort, then we can leave it as is, but if it's relatively easy, it would be a nice enhancement 😁
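
To illustrate the requested wording, here is a rough Python sketch that derives the friendlier subject and body from a job name of the form seen in the thread. It is not the change made in the refinery workflow: the function `friendly_email` and the pattern `JOB_NAME_PATTERN` are hypothetical, and the sketch assumes job names always follow the `<dataset>-check_denormalize-wf-<YYYY-MM>` shape.

```python
import re

# Assumed job-name shape, based on the names quoted in the thread,
# e.g. "mediawiki-history-check_denormalize-wf-2019-04".
JOB_NAME_PATTERN = re.compile(
    r"^(?P<dataset>[a-z-]+)-check_denormalize-wf-(?P<snapshot>\d{4}-\d{2})$"
)


def friendly_email(job_name):
    """Build an analyst-friendly (subject, body) pair from an Oozie job name."""
    match = JOB_NAME_PATTERN.match(job_name)
    if match is None:
        raise ValueError(f"Unrecognized job name: {job_name!r}")
    # The dataset is written with underscores in the Data Lake table name.
    dataset = match.group("dataset").replace("-", "_")
    snapshot = match.group("snapshot")
    subject = f"{dataset} for {snapshot} now available"
    body = (
        f"The Oozie job calculating and checking the {snapshot} {dataset} "
        "snapshot has finished successfully and the data is now available :)\n"
        "\n"
        "Job details:\n"
        f"  - Job id: {job_name}\n"
    )
    return subject, body


if __name__ == "__main__":
    subject, body = friendly_email("mediawiki-history-check_denormalize-wf-2019-04")
    print("SUBJECT:", subject)
    print()
    print(body)
```

Keeping the job id in the body preserves the link back to the Oozie job for anyone who does need the internals.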

Change 508602 had a related patch set uploaded (by Milimetric; owner: Milimetric):
[analytics/refinery@master] Make success email a little friendlier

https://gerrit.wikimedia.org/r/508602

Change 508602 merged by Joal:
[analytics/refinery@master] Make success email a little friendlier

https://gerrit.wikimedia.org/r/508602