Create grafana panels for user conflicts, bucketed by edit count
Closed, Resolved · Public · 3 Estimated Story Points

Description

In T236886: Track numbers based on the users experience level, we split conflict metrics out by user experience, measured in number of edits. Create new graphs to show this information, probably in the same conflict dashboard as the older metrics.

Event Timeline

Restricted Application added a subscriber: Aklapper.
awight set the point value for this task to 3. Nov 6 2019, 12:50 PM
awight renamed this task from "Create grafana panels for new user-tenure bucketed conflict data" to "Create grafana panels for user conflicts, bucketed by edit count". Nov 8 2019, 10:56 AM
awight updated the task description.

Something's wrong here, for example I see percentages well over 100% for the anon group. Can you spell out what the math is supposed to be doing?

Other minor suggestions:

  • Would be nice to have a legend, maybe with the average for each group over the timespan.
  • It might be impossible, but can the groups be given names which naturally collate in an order that makes sense? We should have thought of this at coding time, it might be annoying now.
  • "User Contribution Score" is usually "by Editor Experience"
  • Null value handling (Display tab) should probably be "zero" or "connected", depending on the metric. Otherwise we have discontinuities.
  • aliasByNode can usually be aliasByMetric, then you don't have to specify the node position.

Taking into account the metrics added in T236886 (editor experience = anon, under11, over10, over100, or over200) we can answer the following questions:

  1. What % of edit conflicts experienced using TwoColConflict are resolved, grouping by editor experience?
  2. What % of edit conflicts experienced using the default interface are resolved, grouping by editor experience?
  3. What % of edit conflicts experienced using TwoColConflict are left unresolved, grouping by editor experience?
  4. What % of edit conflicts experienced using the default interface are left unresolved, grouping by editor experience?

Corresponding graphs:

  1. https://grafana.wikimedia.org/d/000000346/mediawiki-twocolconflict?refresh=5m&panelId=36&fullscreen&orgId=1
  2. https://grafana.wikimedia.org/d/000000346/mediawiki-twocolconflict?refresh=5m&panelId=42&fullscreen&orgId=1
  3. https://grafana.wikimedia.org/d/000000346/mediawiki-twocolconflict?refresh=5m&panelId=44&fullscreen&orgId=1
  4. https://grafana.wikimedia.org/d/000000346/mediawiki-twocolconflict?refresh=5m&panelId=43&fullscreen&orgId=1

Example calculations for anon:

For 1. and 2.:
% of conflicts by anon that are resolved = # of conflicts resolved by anon / # of all conflicts by anon

For 3. and 4.:

# of unresolved conflicts by anon = # of all conflicts by anon - # of resolved conflicts by anon
% of conflicts by anon that are left unresolved = # of unresolved conflicts by anon / # of all conflicts by anon
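
As a rough illustration of the arithmetic above, here is a minimal Python sketch. The function name and the counts are hypothetical and are not taken from the dashboard or from Graphite; only the formulas come from this comment.

```python
def conflict_percentages(total, resolved):
    """Return (resolved %, unresolved %) for one editor-experience group.

    total    -- number of all conflicts by the group in the time window
    resolved -- number of conflicts resolved by the group in the same window
    """
    if total <= 0:
        # No conflicts in the window, so neither percentage is defined.
        return None
    unresolved = total - resolved
    return 100.0 * resolved / total, 100.0 * unresolved / total


# Hypothetical counts for one time window (illustration only, not real data).
counts = {
    "anon": {"total": 40, "resolved": 10},
    "over200": {"total": 200, "resolved": 50},
}

for group, c in counts.items():
    print(group, conflict_percentages(c["total"], c["resolved"]))
```

With these formulas the two percentages add up to 100 for any group whose resolved count does not exceed its total count in the same window.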

Also included two extra graphs which show:

The total number of conflicts using the default view, grouped by editor experience:

  5. https://grafana.wikimedia.org/d/000000346/mediawiki-twocolconflict?refresh=5m&orgId=1&panelId=40&fullscreen

The total number of conflicts using TwoColConflict, grouped by editor experience:

  6. https://grafana.wikimedia.org/d/000000346/mediawiki-twocolconflict?refresh=5m&orgId=1&panelId=41&fullscreen

For the percentages in graphs 1 and 3 (or 2 and 4) to be reliable, there needs to be a sufficient number of conflicts for the corresponding editor experience group in graph 6 (or 5), i.e. in the matching total-count graph listed above.
For example, if graph 1 shows 30% for under100, then for that to be a reliable estimate graph 6 should show more than roughly 30 conflicts for under100 in the same time period.
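
One way to get a feel for why small counts make the percentages unreliable is the textbook normal-approximation standard error of a proportion. This is only a sketch under that assumption; it is not something the dashboard computes, and the function name is hypothetical.

```python
import math


def standard_error_pct(pct, n):
    """Approximate standard error, in percentage points, of a percentage
    observed over n conflicts (normal approximation to the binomial)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


# Compare the uncertainty of a 30% reading at different conflict counts.
for n in (10, 30, 300):
    print(n, round(standard_error_pct(30, n), 1))
```

Under this approximation the uncertainty shrinks with the square root of the number of conflicts, so a group with only a handful of conflicts in the window can swing widely from one period to the next.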

Something's wrong here, for example I see percentages well over 100% for the anon group. Can you spell out what the math is supposed to be doing?

This seemed to be some weird glitch with Grafana/Graphite. I don't believe the problem still occurs with the new graphs.

Other minor suggestions:

  • Would be nice to have a legend, maybe with the average for each group over the timespan.

Done

  • It might be impossible, but can the groups be given names which naturally collate in an order that makes sense? We should have thought of this at coding time, it might be annoying now.

I don't know of any easy way that wouldn't affect performance and/or readability.

  • "User Contribution Score" is usually "by Editor Experience"

Done

  • Null value handling (Display tab) should probably be "zero" or "connected", depending on the metric. Otherwise we have discontinuities.

Done

  • aliasByNode can usually be aliasByMetric, then you don't have to specify the node position.

Using aliasByMetric after asPercent appears to result in an empty alias, so I opted for aliasByNode.

Oooh dang, 15% improvement in conflicts resolved for the most reliable, over200 group!

@awight Can this be marked as resolved or do you still want to have another look?

Notes from acceptance review:

  • Only very experienced editors are using the TwoColConflict interface.
  • Outcomes look much better for the over200 group using TwoColConflict, but we can't tell if we're seeing selection bias. For example, the over100 "default view" group is getting better outcomes than over200, probably attributable to choosing different conflicts to work on.
  • The sample sizes for all editor demographics below 200 edits are too small to give us any information. Probably worth checking again after we make this the "small" default, so we can tell whether anonymous users are differentially impacted.
  • We noticed that we can't slice the graphs by namespace.
  • Naming of "default" needs to change as we make TwoColConflict the new default. Perhaps "Legacy conflict workflow"?
  • Rename the editor experience buckets, e.g. "under11" should read "between 0 and 10 edits".
  • Top-line indicator for percent of conflicts resolved seems to be wrong. Should be closer to the over200 percentage, e.g. 20%. Looks like we're taking the average of averages rather than summing all data.
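
The difference between an average of averages and summing all data, mentioned in the last note, can be sketched as follows. The group counts are hypothetical, chosen only to mimic small low-experience groups next to a large over200 group, and the function names are illustrative.

```python
def mean_of_percentages(groups):
    """Average the per-group resolved percentages (the 'average of averages')."""
    rates = [100.0 * resolved / total
             for total, resolved in groups.values() if total > 0]
    return sum(rates) / len(rates) if rates else None


def pooled_percentage(groups):
    """Sum all conflicts first, then take a single percentage."""
    total = sum(t for t, _ in groups.values())
    resolved = sum(r for _, r in groups.values())
    return 100.0 * resolved / total if total > 0 else None


# Hypothetical (total, resolved) counts per group; not real dashboard data.
groups = {
    "anon": (2, 2),        # 100% resolved, but only 2 conflicts
    "under11": (4, 3),     # 75% resolved, only 4 conflicts
    "over200": (200, 40),  # 20% resolved, bulk of the data
}

print(mean_of_percentages(groups))  # 65.0
print(pooled_percentage(groups))    # about 21.8
```

With counts like these, the pooled figure stays close to the over200 percentage because that group contributes almost all of the conflicts, while the unweighted mean is pulled up by the tiny groups.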

Top-line indicator for percent of conflicts resolved seems to be wrong. Should be closer to the over200 percentage, e.g. 20%. Looks like we're taking the average of averages rather than summing all data.

This looks correct now, so not sure what I was seeing earlier.