Give the option of using the same parameters for all reports for a given cohort {dove} [21 pts]
Closed, Resolved (Public)

Description

The current workflow when generating a new report:

(1) Pick Cohorts
(2) Pick Metrics
(3) Configure Output
(4) Run Report

Within (2), you can input the time range to run the report, as well as a variety of other parameters (see list below). You have to do this for each metric you want to run a report on (bytes added, edits, etc.). If you are running reports for an event, this creates a duplication of effort because presumably you will want to run the same time range and other params for each report.

Therefore there should be an option to specify any parameters that are used for multiple metrics once, and use those settings for all reports run.

Options that should be configurable at the global report level:

  • Start date
  • End date
  • Time series by
  • Include deleted
  • Namespace(s)
  • Number of edits
  • Rolling days
  • As of date

Expected behavior: User configures these parameters in a section called "Pick Defaults" above "Pick Metrics" in the report-creation workflow. If a value for start date, namespaces, edits, etc. is specified in this section, then that value will be automatically populated into any metrics the user selects below, if that metric uses that parameter. If the user subsequently changes that value for an individual metric, they will have to de-select and re-select the metric to get the global value back.

Example:

  1. User specifies in "Pick Defaults" that namespace should be 3,4.
  2. User chooses to report on Edits, Pages Created, and Bytes Added metrics under "Pick Metrics", and sees that Namespace field is already populated with 3,4 for all of those metrics.
  3. User decides that they want to measure Bytes Added for Namespace 0,1 instead, and changes the value of Namespace for that metric.
  4. User runs their reports, on edits to namespace 3,4 and Bytes added to namespace 0,1.
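
The expected behavior and example above can be sketched roughly as follows. This is only an illustration, not the Wikimetrics implementation: `METRIC_PARAMS`, `populate_metric`, and the parameter keys are hypothetical names, and the map of which metric accepts which parameter is simplified for the example.

```python
# Hypothetical map of which parameters each metric accepts (illustrative only).
METRIC_PARAMS = {
    "edits": {"start_date", "end_date", "namespaces"},
    "pages_created": {"start_date", "end_date", "namespaces"},
    "bytes_added": {"start_date", "end_date", "namespaces"},
    "threshold": {"number_of_edits"},
}


def populate_metric(metric, defaults):
    """Copy each global default into a metric's form, but only if the metric uses that parameter."""
    accepted = METRIC_PARAMS[metric]
    return {name: value for name, value in defaults.items() if name in accepted}


# Step 1: the user sets a namespace default in "Pick Defaults".
defaults = {"namespaces": [3, 4]}

# Step 2: each selected metric is pre-populated from the defaults.
reports = {m: populate_metric(m, defaults) for m in ("edits", "pages_created", "bytes_added")}

# Step 3: the user overrides the value for one metric only.
reports["bytes_added"]["namespaces"] = [0, 1]
```

After step 3, Edits and Pages Created still carry namespaces 3,4 while Bytes Added carries 0,1. A metric that does not accept a parameter (here the hypothetical `threshold` entry) simply ignores it.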

Version: unspecified
Severity: enhancement

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:58 AM
bzimport set Reference to bz72117.
bzimport added a subscriber: Unknown Object (MLST).
Capt_Swing renamed this task from Give the option of using the same time range for all reports for a given cohort to Give the option of using the same parameters for all reports for a given cohort. Mar 30 2015, 7:28 PM
Capt_Swing updated the task description.

I updated this task with more detail, and generalized it to describe all settings where there is currently redundancy.

kevinator renamed this task from Give the option of using the same parameters for all reports for a given cohort to Give the option of using the same parameters for all reports for a given cohort [21 pts]. Apr 7 2015, 3:46 PM
kevinator triaged this task as Medium priority.
kevinator moved this task from Next Up to Tasked_Hidden on the Analytics-Kanban board.
kevinator renamed this task from Give the option of using the same parameters for all reports for a given cohort [21 pts] to Give the option of using the same parameters for all reports for a given cohort {dove} [21 pts]. Apr 27 2015, 5:17 PM

@kevinator is this task going to make it into any upcoming sprints? If so, just tell me when. When we first scoped the task, @Nuria suggested that it would be useful to have a mockup of the multi-report config UI to work from, and I agreed to make one. Happy to do so--just let me know what the timeline is.

@Capt_Swing yes, I just moved this task to the top of the list so we honor our commitment to complete this task this quarter. How much time do you need to generate a mockup?
@Milimetric will take it on shortly (when it moves to "in-progress"). He can say more about what he wants out of a mockup.

@kevinator, @Milimetric, I'm OOO tomorrow and Friday. I could have something for you by EOD Monday?

I trust your mockup, J, and we can iterate.

@Milimetric here's a first stab. All the configuration values that are used by more than 1 metric are listed in a "Pick Defaults" section of the Create Report page, which takes the place of "Pick Timezone". Timezone is treated as just another default value (even though it can't be changed at the per-metric level).

I tried to group the fields in a more-or-less sensible way, but it's somewhat arbitrary. Alignment and spacing are also arbitrary in this mock--I tried to conserve vertical space with a two column layout, but that's not strictly necessary.

When the user clicks the "Set defaults" button on the Bytes Added metric pane, any relevant values they have specified in the Pick Defaults section above are populated for that pane, and for that pane only. Pick Defaults fields that don't correspond to any fields in this specific pane are ignored.

The user will need to click "Set defaults" for each metric individually--clicking "Set defaults" on the Edits metric pane doesn't update the Bytes Added pane, and vice versa. This adds an extra step per metric, but it's still quicker and less error-prone than manually setting the same start date, namespace, etc. for multiple metrics.

I put a confirmation banner up at the top, which could appear each time the user sets defaults for a metric. I made it gray so that it isn't confused with the green Validate Configuration banner. But honestly we may not need to show a banner at all: the user can confirm visually that values have been updated.

If the user goes back and updates a value under Pick Defaults after clicking "Set defaults" for an individual metric, they will need to click "Set defaults" again on that metric pane to import the new value--these fields do not auto-update. If they click "Set defaults" multiple times, but haven't changed any of the globals, nothing will happen.

When the user first visits the Create report page, the fields in "Pick defaults" will contain the same pre-populated values that they currently do in the individual metric panes--so, Start Date will be set to 1 month prior to the current date, Number of edits will be set to 5, etc.

"Validate configuration" works the same as before.

Let me know if this can be built as defined, and/or if you see any issues we should fix before you start coding. IMO it's not ideal, but still an improvement over the current workflow.

wikimetrics_report_config_v1.png (1×1 px, 165 KB)

Looks good to me except I think the list of fields is too broad. I'd propose a shorter list that doesn't touch on standardized defaults:

  • timezone
  • start date
  • end date
  • as of date
  • timeseries by

So my main worry with parameters like "namespaces", "number of edits", "include deleted edits", "rolling days", etc. is those defaults were standardized per metric by the research team. Since they had good reasons for those default values, it makes sense to me to change them individually if at all. Namespaces also works really differently in some metrics, so it's hard to say they should all use the same default. Maybe if we standardized how namespaces work, or thought more about it. I'm open to suggestions.

@Milimetric I guess I'm not as concerned as you are that users will garble their data by changing defaults, if we make it (slightly) easier to change those defaults. I think Wikimetrics users know what parameters they want to set, and how setting those params will affect their data. What I'm hearing from them is that the process of setting up a report is labor-intensive and error-prone, so the less duplication of effort, the better.

All an expansive global defaults pane does is save people clicks and remove sources of "fat finger" error. Only allowing some redundant config options to be set globally, and not others, seems arbitrary to me.

But ultimately any fix is better than no fix :)

If we want more input on which metrics to make globally configurable, and which to hold back, we should ask our users on-list. Gathering and synthesizing that feedback will push back our delivery date, though. I'll leave the final decision up to you and @kevinator, and support it.

Ok, I can add the fields that mean the same thing in every metric they show up in. So that means I'll leave out "namespaces". Because that's just too different in every metric. In some metrics, leaving it blank means "all namespaces". In others, leaving it blank means "validation error". So having a default for that seems likely to cause a lot of confusion. If you're ok with that, I'll make it so tomorrow.

Sounds great to me! Thank you, Dan.

Change 217857 had a related patch set uploaded (by Milimetric):
[WIP] Add global default report fields

https://gerrit.wikimedia.org/r/217857

This is up on staging: https://metrics-staging.wmflabs.org/reports/create/

@Capt_Swing, please play with this and let me know what you think. I feel like it's a little weird, especially if you set a default, add a metric, change a metric parameter from the default, then blank out the default or blank out the metric parameter. I wonder if people will be able to follow what's going on.

It's a little weird to me too. I'm not sure we've quite nailed this yet.

Pre-populate default values consistently: The following fields should be pre-populated in the Pick Defaults section and in the individual metric panes on staging, just like they are in production:

  1. Start Date/End Date should pre-populate [current date - 30] and [current date] respectively. On staging, dates are pre-populated in the individual metrics panes, but not in the Pick Defaults pane.
  2. Include Deleted box should be checked. On staging, Include Deleted isn't checked in either the Pick Defaults section or the individual metrics panes.
  3. Rolling Days should be set to 30. On staging, Rolling Days isn't pre-populated in either the Pick Defaults section or the individual metrics panes.

Remove Number of Edits from Pick Defaults: okay, I see now that the default value for Number of Edits in production differs by metric: for Rolling Active Editor, Rolling New Active Editor, and Rolling Surviving New Active Editor, it's 5; for Survival and Threshold it's 1. It appears that the metric definition for survival specifies n=1, where the rolling metrics specify n=5. So, let's not make this a global at all--please remove number of edits from Pick Defaults, and leave the existing default values (5 and 1) in their respective metric panes.

Remove Set Defaults button: Values in individual metric panes update automatically when you change the default values in Pick Defaults. This isn't the behavior I initially spec'ed, and it makes the "Set Defaults" button somewhat unnecessary. But it may be an improvement: it's handy not to have to click "Set defaults" for every metric. So I suggest we just remove that button. Can you think of a reason we shouldn't?

Update label for End Date in Pick Defaults: since End Date and As Of Date default to the same date in production already, update the label for End Date under Pick Defaults to read End Date/As of Date. This will make it clear to the user that they are changing both at the same time. If someone wants to set a different As of Date for, say, Rolling Active Editor, after they specify a global default value, they can do that in the individual metric pane.

I think that should help clear up the inconsistencies, while still maintaining the usefulness of Pick Defaults. What do you think?

> Pre-populate default values consistently: The following fields should be pre-populated in the Pick Defaults section and in the individual metric panes on staging, just like they are in production:

Pre-populating the Pick Defaults section would override some of the per-metric standard defaults that the researchers defined when creating these metrics. We talked about stamping "standard" metrics with a "seal of WMF approval" or something at some point, and I think this change would be counter-productive in that sense.

> 1. Start Date/End Date should pre-populate [current date - 30] and [current date] respectively. On staging, dates are pre-populated in the individual metrics panes, but not in the Pick Defaults pane.

The End Date and As of Date are actually both called "end_date" and share some back-end logic that has to do with the recurrent reports running for Vital Signs. So it would be a fairly involved change to separate those two. On top of that, I disagree with the pre-populated Pick Defaults section as I mentioned above.

> 2. Include Deleted box should be checked. On staging, Include Deleted isn't checked in either the Pick Defaults section or the individual metrics panes.

Oops, this is a bug: the empty defaults were overriding the default values of the metrics. The same applies to Rolling Days.
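
One way to picture the fix for this bug is to skip blank global values when merging them into a metric's own defaults. The sketch below is an assumption about the logic, not the code in the patch; `merge_defaults` and the parameter keys are hypothetical.

```python
def merge_defaults(metric_defaults, global_defaults):
    """Overlay global defaults on a metric's own defaults, ignoring blank globals."""
    merged = dict(metric_defaults)
    for name, value in global_defaults.items():
        # Only apply a global value if the metric uses the parameter
        # and the user actually filled the field in.
        if name in merged and value not in (None, ""):
            merged[name] = value
    return merged


# Hypothetical per-metric defaults, using the values mentioned in this thread.
metric_defaults = {"rolling_days": 30, "include_deleted": True}

# Pick Defaults left empty: the metric keeps its own defaults.
untouched = merge_defaults(metric_defaults, {"rolling_days": "", "include_deleted": None})

# Pick Defaults filled in: the global values win.
overridden = merge_defaults(metric_defaults, {"rolling_days": 7, "include_deleted": False})
```

Note that an unchecked box (`False`) counts as a real value here, so it still overrides; only `None` and the empty string are treated as "not set".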

> 3. Rolling Days should be set to 30. On staging, Rolling Days isn't pre-populated in either the Pick Defaults section or the individual metrics panes.

This should be pre-populated to 30 in the individual metrics, but again I think it's too confusing for future metrics developers to have a global default that overrides any standard default defined by the researchers.

> Remove Number of Edits from Pick Defaults: okay, I see now that the default value for Number of Edits in production differs by metric: for Rolling Active Editor, Rolling New Active Editor, and Rolling Surviving New Active Editor, it's 5; for Survival and Threshold it's 1. It appears that the metric definition for survival specifies n=1, where the rolling metrics specify n=5. So, let's not make this a global at all--please remove number of edits from Pick Defaults, and leave the existing default values (5 and 1) in their respective metric panes.

Done, I'll deploy this to staging soon.

> Remove Set Defaults button: Values in individual metric panes update automatically when you change the default values in Pick Defaults. This isn't the behavior I initially spec'ed, and it makes the "Set Defaults" button somewhat unnecessary. But it may be an improvement: it's handy not to have to click "Set defaults" for every metric. So I suggest we just remove that button. Can you think of a reason we shouldn't?

No, that makes sense. Done, will be in next deploy to staging.

> Update label for End Date in Pick Defaults: since End Date and As Of Date default to the same date in production already, update the label for End Date under Pick Defaults to read End Date/As of Date. This will make it clear to the user that they are changing both at the same time. If someone wants to set a different As of Date for, say, Rolling Active Editor, after they specify a global default value, they can do that in the individual metric pane.

Good idea. Done.

Let me know what you think about the other stuff.

I don't quite get why you don't want to pre-populate dates/include deleted/rolling days in Pick Defaults. These options are all set by default to the same values in individual metric panes: in the current production deployment of Wikimetrics, start date is the same for all metrics that use start date, include deleted is checked by default for all metrics that offer the checkbox, etc. If they're all pre-populated to the same values in the individual metric panes, why not do the same thing up top in Pick Defaults?

Otherwise, it all sounds good to me!

> I don't quite get why you don't want to pre-populate dates/include deleted/rolling days in Pick Defaults. These options are all set by default to the same values in individual metric panes: in the current production deployment of Wikimetrics, start date is the same for all metrics that use start date, include deleted is checked by default for all metrics that offer the checkbox, etc. If they're all pre-populated to the same values in the individual metric panes, why not do the same thing up top in Pick Defaults?

It's just a technical argument. Like, where do those values come from? Do we read all registered metrics and make sure they have the same default value for each parameter? That's ok, I guess, but seems confusing for people in case a new metric shows up that has a different default Start Date. Because then that value would no longer be pre-populated.
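
To make the question concrete, here is a rough sketch of what "read all registered metrics and make sure they have the same default value for each parameter" could look like. It is not how Wikimetrics works today; `shared_defaults` and the metric registry are hypothetical, and the registry only uses default values mentioned earlier in this thread.

```python
def shared_defaults(registered_metrics):
    """Return parameters whose default is identical in every metric that declares them."""
    seen = {}
    for params in registered_metrics.values():
        for name, default in params.items():
            seen.setdefault(name, []).append(default)
    return {
        name: values[0]
        for name, values in seen.items()
        if all(v == values[0] for v in values)
    }


# Hypothetical, simplified registry of per-metric defaults.
metrics = {
    "rolling_active_editor": {"number_of_edits": 5, "rolling_days": 30},
    "rolling_new_active_editor": {"number_of_edits": 5, "rolling_days": 30},
    "survival": {"number_of_edits": 1},
}

before = shared_defaults(metrics)

# A new metric arrives with a different default for an existing parameter.
metrics["some_new_metric"] = {"rolling_days": 7}
after = shared_defaults(metrics)
```

Here `before` contains only `rolling_days` (Number of Edits differs between 5 and 1), and `after` is empty: once a new metric disagrees, the parameter silently stops being pre-populated, which is the confusing case described above.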

Change 217857 merged by Madhuvishy:
Add global default report fields

https://gerrit.wikimedia.org/r/217857