
Grafana board for TemplateData
Open, Needs Triage · Public · 1 Estimated Story Points

Description

Create a Grafana board for the TemplateData metrics described in T260343.

Event Timeline

Lena_WMDE set the point value for this task to 2.
awight changed the task status from Open to Stalled. (Dec 18 2020, 2:44 PM)
awight changed the task status from Stalled to Open. (Jan 6 2021, 9:41 AM)
Lena_WMDE changed the point value for this task from 2 to 5. (Jan 6 2021, 9:46 AM)

Hey @Andrew-WMDE, thanks for sharing the board for review. I have a few questions, some of which came out of the discussions about the CodeMirror board and some of which relate only to TemplateData.

General

  • I think I caught a small typo: fr should be fa for Farsi, instead of French (unless it's actually collecting data from fr.wikipedia?)
  • Clarifying: at the moment, does 'all' mean all four wikis, and not all wikis? If so, we should talk about which numbers can be displayed for all wikis collectively and how much work it is to add this. For CodeMirror we added this on a separate ticket. Maybe we should do the same here.
  • I find the edit count breakdown a bit funny. Why is it 'over 10' and 'under 11'?

Dialog section

  • Can you add a definition of save and abandon? I think we decided to define 'save' as saving the dialog, which doesn't necessarily include saving the page afterwards. It would be helpful to document this here.
  • If it's not much work, I would appreciate a third graph which compares users adding a new template and users editing existing templates. This could cover only the successful (save) interactions, or ideally a combination of both numbers to see how many people attempt each action in total and relative to each other.

Template

  • I want to clarify for each action (also param actions): this is tracking whether someone interacted with a field, but not necessarily whether it was saved, right? It would be nice to add this to the intro sections for clarity.

Parameter

  • Parameter green bar graph at top left: Is this an average per day for all days we have been collecting data? If yes, would be good to add this to the description under the 'i' button.
  • Parameter line graph to the right: I think this is broken down in too many pieces to be legible. I would make this analogous to the green one on the left, but simply showing the changes over time. So it would only be broken up by action and lump all wikis and editor counts together. We can then filter at the top if we want to see it broken down.
  • Individual actions: these don't seem to be pulling in the data correctly? The 2nd row doesn't show anything for me, but I guess this is supposed to be the collective number, which is then broken down by editor count and wiki on the right?
  • The number counter on the left side for each action: when I click on the 'i', it says it's a percentage of all sessions where this action is edited. I would show these as a percentage using something like a pie chart. Also, why are some over 100%?
  • For type, does it show no data because no one has edited the type since we started collecting?

Let me know if any of these questions are confusingly worded. Also open to a call to go through them (I know it's a lot..)

Hey @Andrew-WMDE, thanks for sharing the board for review. I have a few questions, some of which came out of the discussions about the CodeMirror board and some of which relate only to TemplateData.

General

  • I think I caught a small typo: fr should be fa for Farsi, instead of French (unless it's actually collecting data from fr.wikipedia?)

Yes, that's a typo. I changed it from frwiki to fawiki. If you want, we can additionally include frwiki in the list.

  • Clarifying: at the moment, does 'all' mean all four wikis, and not all wikis? If so, we should talk about which numbers can be displayed for all wikis collectively and how much work it is to add this. For CodeMirror we added this on a separate ticket. Maybe we should do the same here.

For TemplateData we are already collecting metrics from all wikis. I have therefore updated the 'all' wiki filter to include all wikis, not just the four.

  • Find the edit count breakdown a bit funny. Why is it 'over 10' and 'under 11'?

Isn't this the breakdown we wanted? T260343

Dialog section

  • Can you add a definition of save and abandon? I think we decided to define 'save' as saving the dialog, which doesn't necessarily include saving the page afterwards. It would be helpful to document this here.

Done.

  • If it's not much work, I would appreciate a third graph which compares users adding a new template and users editing existing templates. This could cover only the successful (save) interactions, or ideally a combination of both numbers to see how many people attempt each action in total and relative to each other.

Done.

Template

  • I want to clarify for each action (also param actions): this is tracking whether someone interacted with a field, but not necessarily whether it was saved, right? It would be nice to add this to the intro sections for clarity.

Yes, we don't take into account whether the flow was saved or abandoned for parameter and template metrics. This is now also documented.

Parameter

  • Parameter green bar graph at top left: Is this an average per day for all days we have been collecting data? If yes, would be good to add this to the description under the 'i' button.

Done.

  • Parameter line graph to the right: I think this is broken down in too many pieces to be legible. I would make this analogous to the green one on the left, but simply showing the changes over time. So it would only be broken up by action and lump all wikis and editor counts together. We can then filter at the top if we want to see it broken down.

Done.

  • Individual actions: these don't seem to be pulling in the data correctly? The 2nd row doesn't show anything for me, but I guess this is supposed to be the collective number, which is then broken down by editor count and wiki on the right?

Same as below except this time grouped by wiki.

  • The number counter on the left side for each action: when I click on the 'i', it says it's a percentage of all sessions where this action is edited. I would show these as a percentage using something like a pie chart. Also, why are some over 100%?

It shows the percentage of all sessions belonging to that edit count group. So a value of 5% would mean "In 5% of {anonymous} users' sessions, {template-description-change} was invoked at least once."
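A minimal sketch of how such a per-group percentage could be computed, assuming events arrive as (edit count group, session id, action) tuples. The tuple layout, the function name `session_action_share`, the session ids, and the `other-action` name are hypothetical and not taken from the actual dashboard queries.

```python
from collections import defaultdict


def session_action_share(events, action):
    """Return {edit_count_group: percent of sessions with at least one `action` event}.

    `events` is an iterable of (edit_count_group, session_id, action) tuples.
    This layout is an assumption made for illustration only.
    """
    all_sessions = defaultdict(set)   # every session seen per group
    hit_sessions = defaultdict(set)   # sessions in which `action` occurred
    for group, session_id, event_action in events:
        all_sessions[group].add(session_id)
        if event_action == action:
            hit_sessions[group].add(session_id)
    return {
        group: 100.0 * len(hit_sessions[group]) / len(sessions)
        for group, sessions in all_sessions.items()
    }


# Hypothetical sample data: four anonymous sessions, one of which
# changes the template description twice.
sample_events = [
    ("anonymous", "s1", "template-description-change"),
    ("anonymous", "s1", "template-description-change"),
    ("anonymous", "s2", "other-action"),
    ("anonymous", "s3", "other-action"),
    ("anonymous", "s4", "other-action"),
]
shares = session_action_share(sample_events, "template-description-change")
print(shares["anonymous"])
```

Because each session is stored in a set, a session that triggers the action several times is still counted once, so a share computed this way cannot exceed 100%.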

  • For type, does it show no data because no one has edited the type since we started collecting?

As of now that should be the case. However, there was initially a bug which resulted in type changes not being recorded when creating new templates.
This is now fixed by https://gerrit.wikimedia.org/r/c/mediawiki/extensions/TemplateData/+/655063.

Let me know if any of these questions are confusingly worded. Also open to a call to go through them (I know it's a lot..)

Overall looking good!

Yes, that's a typo. I changed it from frwiki to fawiki. If you want, we can additionally include frwiki in the list.

Thanks. No need to add frwiki. Once we know the small default wikis, we'll probably adjust the list then.

So for TemplateData we are already collecting metrics from all wikis. Therefore, I have updated the all wiki filter to now include all wikis not just the four.

That's great! Thanks for updating.

Find the edit count breakdown a bit funny. Why is it 'over 10' and 'under 11'?

Isn't this the breakdown we wanted? T260343

Ah ok, I see this was just the way it was read. I would think that would be "10 or less" and "11-100". Anyway, I wouldn't spend any time adjusting this because Adam is changing the buckets in T269986: Add edit count bucketing to all metrics.


Also, it seems like you changed the way that the percentage of dialog sessions in which {some action} is performed is displayed, but something seems off. This is how they all look to me:

Maybe this has to do with breaking down by wiki? If so, I think this can be combined - for example, of all dialog sessions on all wikis, the description was changed 10% of the time.

For some of them, there is data, because it's showing on the green visuals, but to the right it says no data. Any idea why?

The board is looking great!

One additional question:
What is the purpose of the filters for parameter action and template action at the top? Selecting any of the options doesn't seem to change any of the charts for me and I'm not sure this is useful as the actions are represented in their own charts below.

Open items from review:

  • Adding a readable graph for average of data for all wikis, showing percentage of sessions where action happens, per action
  • Updating percentage data with 'speedometer' graph to communicate percentage-ness

Open items from review:

  • Adding a readable graph for average of data for all wikis, showing percentage of sessions where action happens, per action

Done

  • Updating percentage data with 'speedometer' graph to communicate percentage-ness

Done

Also it seems like you changed the way that the percentage of dialog sessions when {some action} is performed are displayed, but something seems off. This is how they all look to me:

Maybe this has to do with breaking down by wiki? If so, I think this can be combined - for example, of all dialog sessions on all wikis, the description was changed 10% of the time.

Done

For some of them, there is data because it's showing on the green visuals but to the right it says no data. Any idea why?

Unfortunately, I believe this is caused by the queries sporadically timing out. Refreshing the page and narrowing down the filters might help.

What is the purpose of the filters for parameter action and template action at the top? Selecting any of the options doesn't seem to change any of the charts for me and I'm not sure this is useful as the actions are represented in their own charts below.

Choosing a certain combination of parameter and template filters will reduce the number of rows generated for each respective section. This should help speed up the rate at which the graphs are rendered while also reducing the likelihood of a timeout.

Lena_WMDE changed the point value for this task from 5 to 1.

Some review notes:

  • Dashboard should be named "TemplateData dialog"
  • Let's drop the template action filter for now, since there is only one item.
  • While we're giving information, how about describing the "edit count", "parameter action" and "template action" filters as well?
  • Nice that the wiki filter affects all graphs.
  • "Dialog opens" graphs are actually "Dialog closes by success" or something. "Dialog success"?
  • "Dialog" section line graphs should all be normalized. There's no need to plot abandonment, instead just show success rate: max(0, success / (success + abandoned)) or null if 0.
  • Pie charts are good as-is. Should show failures even when zero.
  • Would like to have a graph comparing success rate across wikis. Maybe show the top 10 best wikis by mean success rate, and the 10 worst?
  • The grid with 100s of wikis doesn't seem useful. How about two line graphs, of the top and bottom 10?
  • Template-description-change by wiki as a line graph is also hard to interpret. I don't have an immediate suggestion.
  • Panel names should indicate what changes between each. For example, "Template: template-description-change" is "Overall template-description-change rate", "template-description-change per wiki" (x2), and "template-description-change by edit count"
  • Parameter interactions main information pane should explain that these events are counted only once per dialog, same as template interactions.
  • I think "Average number of parameter interactions per day" should be normalized (maybe spec'ed as a follow-up?). It's usable as-is, but a percentage would be better IMO.
  • These low-frequency events need to be averaged over long periods to be useful. For example, the template-description-change by edit count could be a bar graph of the mean number of events over the time period, of each edit count bucket population relative to one another. Normalized... So we can see, for example: "over1k users are twice as likely to edit template description than anonymous users"... "Average number of parameter interactions per day" as a bar graph seems like a good pattern to reuse in other panels, IMHO.
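
As an illustration of the success-rate and top/bottom-10 notes above, here is a rough sketch rather than the dashboard's actual query. It assumes per-wiki daily (success, abandoned) counts are already available, and it reads "null if 0" as "null when there were no dialog sessions at all". The function names, the data layout, and the sample wikis and numbers are hypothetical.

```python
from statistics import mean


def success_rate(success, abandoned):
    """success / (success + abandoned), or None when there were no sessions."""
    total = success + abandoned
    if total == 0:
        return None  # assumption: "null if 0" refers to an empty denominator
    return max(0.0, success / total)


def rank_wikis(daily_counts, n=10):
    """Return (best_n, worst_n) wiki names ordered by mean daily success rate.

    `daily_counts` maps wiki name -> list of (success, abandoned) per day.
    Days without sessions are skipped; wikis with no usable days are left out.
    """
    means = {}
    for wiki, days in daily_counts.items():
        rates = [success_rate(s, a) for s, a in days]
        rates = [r for r in rates if r is not None]
        if rates:
            means[wiki] = mean(rates)
    best_first = sorted(means, key=means.get, reverse=True)
    worst_first = best_first[::-1]
    return best_first[:n], worst_first[:n]


# Hypothetical sample data for illustration only.
sample_counts = {
    "enwiki": [(8, 2), (6, 4)],
    "dewiki": [(1, 3)],
    "svwiki": [(5, 5)],
    "fawiki": [(0, 0)],  # no sessions, so no rate can be computed
}
top, bottom = rank_wikis(sample_counts, n=1)
print(top, bottom)
```

With fewer than 2n wikis the two lists overlap, so a real panel would probably want a minimum session count per wiki before ranking.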

@Lena_WMDE I'm not sure where this ticket should go next. The review notes above are still unaddressed. I think we're waiting for you and @ECohen_WMDE to decide whether any of these points are relevant. Maybe this is in Demo?