Purge and migrate deprecated metrics paths
Closed, Resolved, Public

Description

When it's possible to backfill data into the new structure, we should do so at least until the beginning of our baseline window. Let's say 2021-01-01. If backfilling is impossible, maybe we should archive the old data so that we don't destroy baseline data.

Any of these steps may be broken out into a subtask, when it makes coordination easier.

  • Dependencies: wait to migrate until after these patches are deployed:
  • Perform migration outside of reportupdater cronjob windows, or disable the affected jobs.
  • Purge reportupdater output directories
    • codemirror/users_codemirror_and_wikitext_2017
    • codemirror/users_extension_codeeditor
    • codemirror/users_gadget_wiked
  • Purge from Graphite—deployment TBD, must first compare with new metrics
    • If we decide we need any of these for longer-term comparison, we can rename them, e.g. so that they start with archive.
    • Most of these are deprecated because we now include aggregation byEditCount. Be careful not to purge any paths which include this new dimension.
    • Media[wW]iki.CodeMirror.preferences.byPreference.*.byEnabled.*.byWiki.*
    • MediaWiki.CodeMirror.sessions.byEditor.*.byEnabled.*.byWiki.*
    • MediaWiki.CodeMirror.toggles.byEditor.*.byEnabled.*.byWiki.*
  • Backfill metrics
    • Delete reportupdater cached outputs and adjust start_date to cause backfill.
    • TemplateWizard user_edit_count becomes available Jan 15th.
    • VisualEditor template_dialog_* metrics should be backfilled from Jan 1st.
  • No action required
    • TemplateData metric edit count buckets will switch from a custom style to the common style. This is fine; we will see the group labels switch in graphs spanning the deployment date.
    • TemplateWizard jobs were not deployed yet, so there are no metrics to migrate.
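
To make the Graphite step reviewable before anything is deleted, here is a minimal Python sketch that picks purge candidates from a list of metric paths. It uses the three deprecated patterns from the list above and skips anything carrying a byEditCount segment. The helper names and the example paths are hypothetical. It assumes that `*` matches exactly one dot-separated segment, and that the toggles pattern is meant to read `byEditor.*.byEnabled`. It only selects paths and does not talk to Graphite.

```python
import re

# Deprecated patterns from the task description. The third one assumes a dot
# between "*" and "byEnabled", matching the shape of the other two.
DEPRECATED_GLOBS = [
    "Media[wW]iki.CodeMirror.preferences.byPreference.*.byEnabled.*.byWiki.*",
    "MediaWiki.CodeMirror.sessions.byEditor.*.byEnabled.*.byWiki.*",
    "MediaWiki.CodeMirror.toggles.byEditor.*.byEnabled.*.byWiki.*",
]


def glob_to_regex(pattern):
    """Translate a Graphite-style glob into a compiled regex.

    Assumption: "*" stands for exactly one path segment (no dots), and
    "[...]" is a character class that can be passed through unchanged.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            parts.append(r"[^.]*")
        elif ch == "[":
            end = pattern.index("]", i)
            parts.append(pattern[i:end + 1])
            i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))


def select_purgeable(paths):
    """Return the paths that match a deprecated pattern and are safe to purge."""
    regexes = [glob_to_regex(g) for g in DEPRECATED_GLOBS]
    purgeable = []
    for path in paths:
        # Guard from the task description: never touch the new dimension.
        if "byEditCount" in path.split("."):
            continue
        if any(r.fullmatch(path) for r in regexes):
            purgeable.append(path)
    return purgeable


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    candidates = [
        "MediaWiki.CodeMirror.sessions.byEditor.wikitext.byEnabled.true.byWiki.enwiki",
        "MediaWiki.CodeMirror.sessions.byEditor.wikitext.byEnabled.true.byWiki.enwiki.byEditCount.5-99",
    ]
    for path in select_purgeable(candidates):
        print(path)
```

The explicit byEditCount check is redundant with full-path matching for these example paths, but it is kept as a second guard because the exact position of the new dimension in the path is not spelled out here.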

Event Timeline

There's also a case where the path doesn't change, but edit count bucketing has shifted, and we should still purge old data and backfill. For example, https://gerrit.wikimedia.org/r/c/analytics/reportupdater-queries/+/659227/2/templatedata/hive/actions

awight moved this task from Sprint Backlog to Doing on the WMDE-TechWish (Sprint-2021-01-20) board.
awight set the point value for this task to 5.
awight renamed this task from Purge deprecated metrics paths to Purge and migrate deprecated metrics paths. Feb 5 2021, 10:04 AM
awight updated the task description.
awight added subscribers: fgiunchedi, mforns.
awight moved this task from Doing to Review on the WMDE-TechWish (Sprint-2021-02-03) board.

I'd love some review of the plan as detailed in the task description.

howdy @awight saw some chatter around this on the #wikimedia-sre-observability channel and am wondering if there is still input you would like from the team on this matter. Thanks!

awight removed the point value for this task.

Thanks, I've made a subtask T274987 for the Graphite cleanup which might be in your team's area, and will probably have more questions for you once we're ready to do the purging.

awight claimed this task.