
Compensate for sampling
Closed, Declined | Public | 1 Estimated Story Points

Description

Any metrics based on EditAttemptStep or VisualEditorFeatureUse need to be multiplied by 16 during aggregation to compensate for the sampling rate. Also, events with is_oversampled set to true should be omitted.
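As a rough illustration of the adjustment described above, here is a minimal Python sketch. It assumes events arrive as dictionaries carrying an `is_oversampled` field; the function name `estimate_total`, the constant `SAMPLING_FACTOR`, and the sample events are hypothetical, and the actual aggregation lives in the reportupdater queries rather than in Python. Oversampled events are presumably dropped because they were recorded at a higher rate, so scaling them by 16 would overcount.

```python
from typing import Iterable, Mapping

# Multiplier taken from the task description; everything else here is illustrative.
SAMPLING_FACTOR = 16


def estimate_total(events: Iterable[Mapping[str, object]]) -> int:
    """Estimate the unsampled event count from a batch of sampled events."""
    # Skip oversampled events; a missing flag is treated as "not oversampled"
    # (an assumption of this sketch).
    kept = sum(1 for event in events if not event.get("is_oversampled", False))
    # Scale the remaining count to compensate for sampling.
    return kept * SAMPLING_FACTOR


# Hypothetical sample data: three events, one of them oversampled.
events = [
    {"action": "init", "is_oversampled": False},
    {"action": "init", "is_oversampled": True},
    {"action": "init"},
]
print(estimate_total(events))  # 32
```

Two of the three events are kept, so the estimate is 2 × 16 = 32.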

If possible, backfill the previously aggregated data.

The only affected metrics are those under reportupdater's visualeditor/hive.

Event Timeline

awight updated the task description.
awight moved this task from Sprint Backlog to Doing on the WMDE-TechWish (Sprint-2021-01-20) board.
awight set the point value for this task to 2.

Change 661106 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/WikimediaEvents@master] Update schema to include is_oversample flag

https://gerrit.wikimedia.org/r/661106

Change 661107 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/VisualEditor@master] Track whether an event was oversampled

https://gerrit.wikimedia.org/r/661107

Change 661108 had a related patch set uploaded (by Awight; owner: Awight):
[analytics/reportupdater-queries@master] Compensate for sampling

https://gerrit.wikimedia.org/r/661108

awight moved this task from Doing to Review on the WMDE-TechWish (Sprint-2021-01-20) board.

Change 661106 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] Update schema to include is_oversample flag

https://gerrit.wikimedia.org/r/661106

Change 661107 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Track whether an event was oversampled

https://gerrit.wikimedia.org/r/661107

Change 661108 merged by Mforns:
[analytics/reportupdater-queries@master] Compensate for sampling

https://gerrit.wikimedia.org/r/661108

awight added a subscriber: mforns.

Now that the aggregation is deployed, we need to backfill by purging the following data since Jan 1st. @mforns would you be able to help or advise on this?

Graphite:

  • MediaWiki.VisualEditor.templateDialog.open.byEditCount.*.byWiki.*
  • MediaWiki.VisualEditor.templateDialog.open.byMethod.*.byWiki.*
  • MediaWiki.VisualEditor.templateDialog.close.bySaved.*.byEditCount.*.byWiki.*
  • MediaWiki.VisualEditor.templateDialog.*.byEditSaved.*.byWiki.*
  • MediaWiki.VisualEditor.templateDialog.*.byWiki.*

Reportupdater queries:

  • visualeditor/template_dialog_opens_by_edit_count
  • visualeditor/template_dialog_opens
  • visualeditor/template_dialog_parameters_by_edit_success
  • visualeditor/template_dialog_other_events

We're okay with either having a discontinuity at Jan 1, or purging all previous data for these metrics.

@awight
Re. graphite: I haven't ever dealt with back-filling graphite metrics. I'm not sure they can be backfilled, or purged by a given time range. Maybe @elukey knows?
Re. Reportupdater queries: Do you mean the TSV reports generated by those queries? That's easier, we could just delete the reports' contents since Jan 1st. Reportupdater would pick up from there and rerun all dates automatically. If you confirm that's what you want, I'll do that.

New recommendation after discussing with mforns: we should write the compensated metrics to new Graphite paths. That makes the migration easier, because we avoid the tricky timing issue around purging the old, uncompensated data points in Graphite.

awight changed the point value for this task from 2 to 1.

Change 666933 had a related patch set uploaded (by Awight; owner: Awight):
[analytics/reportupdater-queries@master] Filter out oversampled events

https://gerrit.wikimedia.org/r/666933

awight updated the task description.

Let's not bother. The discontinuity only affects the VisualEditor template dialog metrics, and we're well outside the 1-month baseline window before VE deployments, during which we would want to keep the numbers stable.

I'll leave a note in our metrics catalog explaining when this changed.

Change 666933 merged by Mforns:
[analytics/reportupdater-queries@master] Filter out oversampled events

https://gerrit.wikimedia.org/r/666933