Add most important KPIs to TwoColConflict grafana board
Closed, Resolved, Public

Description

Motivation:
I want to know if the new TwoColConflict page helps more people to solve their edit conflict than the current, usual one.

Task
Please add the following numbers to the grafana board:

  • How many people saw the TwoColConflict page (in total)
  • How many people clicked on "Publish" on the TwoColConflict page (in total)
  • Percentage of people who clicked on "Publish" on the TwoColConflict page
  • How many people saw the normal edit resolution page (in total)
  • How many people clicked on "Publish" on the normal edit resolution page (in total)
  • Percentage of people who clicked on "Publish" on the normal edit resolution page

N.B. This is a very small subset of ticket T158073: Metrics: TwoColumnEditConflictMerge. This task is for the broad overview, while T158073 is for a deep analysis (and is not expected to be on a grafana board).

Restricted Application added a subscriber: Aklapper. · May 11 2017, 11:13 AM
Restricted Application added a project: TCB-Team. · Jun 6 2017, 3:43 PM

@Addshore @WMDE-Fisch which code would need to be added to the extension as a prerequisite for tracking those numbers?

How many people saw the TwoColConflict page (in total)

I guess this is exactly the same as the 'conflicts' counter that we already have for the extension, so nothing needs to be added for this.

How many people clicked on "Publish" on the TwoColConflict page (in total)

Will need JS tracking added to the button in the extension

Percentage of people who clicked on "publish" on the two col conflict page

It sounds like this can be created using the above 2 numbers.

How many people saw the normal edit resolution page (in total)

AFAIK this is already tracked in graphite through core "MediaWiki.edit.failures.conflict" which is in EditPage.php
This will include the conflicts that result in the TwoColConflict page being shown.

How many people clicked on "Publish" on the normal edit resolution page (in total)

Would have to add some tracking in either JS or PHP for this. It would make more sense to add this in PHP as of course conflicts can be resolved with no JS enabled and doing the tracking in JS would result in incorrect numbers.

Percentage of people who clicked on "publish" on the edit resolution page

Can be calculated using numbers above.

How many people clicked on "Publish" on the TwoColConflict page (in total)

Will need JS tracking added to the button in the extension

Created T167862 for adding the code.

How many people saw the normal edit resolution page (in total)

AFAIK this is already tracked in graphite through core "MediaWiki.edit.failures.conflict" which is in EditPage.php
This will include the conflicts that result in the TwoColConflict page being shown.

So, in that case, IIUC, we need a dashboard with "MediaWiki.edit.failures.conflict" but subtract the number of people that saw the TwoColConflict page

How many people clicked on "Publish" on the normal edit resolution page (in total)

Would have to add some tracking in either JS or PHP for this. It would make more sense to add this in PHP as of course conflicts can be resolved with no JS enabled and doing the tracking in JS would result in incorrect numbers.

Created T167863 for adding the code.

How many people saw the normal edit resolution page (in total)

AFAIK this is already tracked in graphite through core "MediaWiki.edit.failures.conflict" which is in EditPage.php
This will include the conflicts that result in the TwoColConflict page being shown.

So, in that case, IIUC, we need a dashboard with "MediaWiki.edit.failures.conflict" but subtract the number of people that saw the TwoColConflict page

Sounds right.
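
The subtraction agreed on above can be sketched as follows. This is only a minimal illustration in Python, not the dashboard query itself; the function name and the sample counts are hypothetical.

```python
def standard_page_views(total_conflicts: int, twocol_conflicts: int) -> int:
    """Number of conflicts shown on the standard resolution page.

    total_conflicts: count from "MediaWiki.edit.failures.conflict", which
        also includes conflicts shown on the TwoColConflict page.
    twocol_conflicts: count from the extension's own 'conflicts' counter.
    """
    if twocol_conflicts > total_conflicts:
        raise ValueError("TwoColConflict count cannot exceed the total")
    return total_conflicts - twocol_conflicts


# Hypothetical sample counts, for illustration only.
print(standard_page_views(total_conflicts=1500, twocol_conflicts=57))
```

If the two counters are ever out of sync (for example, one is recorded and the other is not), the check makes that visible instead of yielding a negative count.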

Patches that log saves with handled edit conflicts are now in place and should be working already. Keys are:

  • edit.failures.conflict.resolved for all successful saves after resolving an edit conflict ( including the two column conflict ones )
  • TwoColConflict.conflict.resolved for successful saves with the two column conflict view

Numbers including the conflict/resolve ratio can now be added to the board. @Addshore

@GoranSMilovanovic Can you pick this up and add the missing graphs to the board?

@Tobi_WMDE_SW

The desired counts are now on the dashboard, but I am having trouble with the divideSeries() function. Please take a look at the percent singlestats in the second row: they should in fact be proportions (take a look at the metrics for these singlestats), but they are obviously incorrect.

Note: the results are *extremely* sensitive to the value of the Max data points parameter.

I have also removed two BF panels since they were reporting N/A; if you need them back, just remind me what they were representing (BF Disables is still there).

Time range is set to last 30 days, let me know if that needs to be changed.

BF panels restored.

Yay numbers :) Only the percentages don't work yet for me. E.g.: 57 conflicts with TwoColConflict, 31 resolved should be 31/57=54%. Same goes for pages without TwoColConflict
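
The expected value can be checked with a few lines of Python. This is just the arithmetic from the example above; the function name is made up for illustration.

```python
def resolution_percentage(resolved: int, conflicts: int) -> float:
    """Share of conflicts that ended in a successful save, in percent."""
    if conflicts == 0:
        # No conflicts seen means there is nothing to resolve.
        return 0.0
    return 100.0 * resolved / conflicts


# Numbers from the comment above: 57 conflicts, 31 of them resolved.
print(f"{resolution_percentage(31, 57):.0f}%")  # 54%
```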

Also, please completely separate the number of edit conflicts with and without TwoColConflict. So instead of "Total Edit Conflict Resolution" and "Total edit conflicts" I would like to see the numbers for the current standard page only.

Note: the results are *extremely* sensitive to the value of the Max data points parameter.

What is the Max data points parameter and how does it affect us?

@Lea_WMDE Please read through the previous comments.

As soon as the issue with using Graphite functions in Grafana is resolved (@Addshore and I had a related discussion on IRC today), the numbers will do just fine, i.e. the basic arithmetic that you are asking for will be implemented. Until then you will have to be patient; as ever, we are doing everything by the documentation, and obviously there's a tweak that we need to discover and implement.

@Lea_WMDE

What is the Max data points parameter and how does it affect us?

Max data points

Every graphite request is issued with a maxDataPoints parameter
Graphite uses this parameter to consolidate the real number of values down to this number
If there are more real values, then by default they will be consolidated using averages
This could hide real peaks and max values in your series
You can change how point consolidation is made using the consolidateBy graphite function
Point consolidation will affect series legend values (min,max,total,current)
If you override maxDataPoint and set a high value performance can be severely affected
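
To illustrate why averaging matters for counters like ours, here is a small Python sketch. It is a simplified model of point consolidation, not Graphite's actual implementation; the `consolidate` helper and the sample data are assumptions made for this example.

```python
def average(values):
    return sum(values) / len(values)


def consolidate(points, max_data_points, func=average):
    """Reduce `points` to at most `max_data_points` values, bucket by bucket.

    Simplified model: consecutive points are grouped into equally sized
    buckets and each bucket is collapsed with `func`.
    """
    if max_data_points <= 0:
        raise ValueError("max_data_points must be positive")
    if len(points) <= max_data_points:
        return list(points)
    bucket_size = -(-len(points) // max_data_points)  # ceiling division
    return [func(points[i:i + bucket_size])
            for i in range(0, len(points), bucket_size)]


# One hour of per-minute counts with one event every 10th minute (6 events).
per_minute = [1 if minute % 10 == 0 else 0 for minute in range(60)]

by_average = consolidate(per_minute, 6, average)
by_sum = consolidate(per_minute, 6, sum)

print("true total:", sum(per_minute))
print("total after averaging:", sum(by_average))
print("total after summing:", sum(by_sum))
```

In this model, a "total" computed over averaged buckets shrinks with the bucket size, which is one way the Max data points value can change a displayed number. Consolidating by sum keeps the total intact.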

@Lea_WMDE @Tobi_WMDE_SW

I think that something is seriously wrong with Grafana (or Graphite - I cannot tell). Speaking of the Grafana singlestats that have "Percent" in their titles:

  1. I am certainly using the Graphite functions to compute a ratio - divideSeries() - correctly.
  2. The result, which should be a simple ratio of two numbers (technically speaking: a ratio of two series, each summarized by the count() function, which gives only one number per series), is never correct.
  3. Moreover, no value of Max data points will *ever* match the expected result. So, this parameter affects the result, but in a totally non-transparent way.

Following hours of search through the Graphite and Grafana documentation + StackOverflow, I was not able to find any reason why this would happen. The same holds for the diffSeries() function that we would need to use to subtract the TwoColConflict resolutions from the total count.

Note: there are many unpredictable behaviors there that I have encountered. Just for example, changing the value of the 'stat' parameter from 'total' to 'current' and back didn't return the initially observed (incorrect, of course) result. It has nothing to do with saving the dashboard or not - I'm working very carefully with it, saving every change. After some time spent working with these 'Percent' singlestats on the TwoColConflict dashboard, the value of what should be a proportion settled to 1 for both singlestats, and since then no parameter tuning (including attempts to set insane values for Max data points) can change it at all.

Sorry - but I don't think I can help here.

@GoranSMilovanovic :( But thanks for your work anyways! No idea if it helps, but according to the edit conflict grafana board we have 1000-2000 edit conflicts per day. So the numbers that should make up the percentage seem to be wrong, too.

@Lea_WMDE I've used in Grafana exactly the tracking as implemented (https://phabricator.wikimedia.org/T167863) and provided above:

Patches that log saves with handled edit conflicts are now in place and should be working already. Keys are:

edit.failures.conflict.resolved for all successful saves after resolving an edit conflict ( including the two column conflict ones )
TwoColConflict.conflict.resolved for successful saves with the two column conflict view

Numbers including the conflict/resolve ratio can now be added to the board.

Let me check how the edit conflict grafana board works; maybe that will help.

@Lea_WMDE @Tobi_WMDE_SW

I am sorry to report this, but something is definitely wrong with your Grafana dashboards.

Just a minute ago, I defined on the TwoColConflicts dashboard *exactly* the same singlestat (a parameter-by-parameter match) as one on the edit conflict grafana dashboard. While it reports back some real numbers on the edit conflict board, it returns N/A on TwoColConflicts.

I won't be able to help out / look at this for the next days (vacation, woo) but will be able to check back in this ticket on the 27th.

I won't be able to help out / look at this for the next days (vacation, woo) but will be able to check back in this ticket on the 27th.

@Addshore Enjoy the summer. In the meantime, I will do what I can to make this work, but from what I've seen thus far... it doesn't look promising. I hesitate to start reading on Graphite/Grafana systematically since Grafana is fairly simple while Graphite would take me an introductory course to get to know it.

@GoranSMilovanovic thanks for all your efforts so far! Then let's wait until next week when Adam is back, maybe he can shed some light into the Grafana darkness...

So, I have just updated this dashboard.

I have changed .count to .sum in many of the metrics. For our case this doesn't actually make a difference to the numbers, as we only ever increment the metrics with a value of 1, but as .sum is 'correct' I have made the change.

The single stats for percentages used to have a value type of "current".
As the data is recorded per minute this would only then use the last 1 minute of data available.
This has now changed: the metrics within the single stats use the "summarize" graphite method, which means there is only 1 data point for the day before the series are divided, thus spitting out a %.
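
The difference between the two approaches can be sketched in Python. This is a toy model with made-up per-minute counts, not what Grafana or Graphite run internally; both function names are hypothetical.

```python
def ratio_current(resolved, conflicts):
    """Ratio from the last data point only, like a 'current' value type."""
    last_conflicts = conflicts[-1]
    if last_conflicts == 0:
        return None  # nothing happened in the last minute
    return resolved[-1] / last_conflicts


def ratio_summarized(resolved, conflicts):
    """Ratio after collapsing each series into one total for the period."""
    total_conflicts = sum(conflicts)
    if total_conflicts == 0:
        return None
    return sum(resolved) / total_conflicts


# Hypothetical per-minute counts for a short window.
conflicts = [1, 0, 2, 0, 1, 0]
resolved = [1, 0, 1, 0, 0, 0]

print(ratio_current(resolved, conflicts))     # None
print(ratio_summarized(resolved, conflicts))  # 0.5
```

With sparse per-minute data, the last data point is usually empty or unrepresentative, so only the summarized version gives a meaningful ratio.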

I have also made the 2 line graphs use the time range specified for the dashboard rather than a hard coded 30d time range.
This allows the users of the dashboard to more easily look at past data.

NOTE: the single stats are still all locked to a time period of 1d meaning the last 1 day.
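
For reference, a query of this shape could be assembled as in the Python sketch below. The exact targets used on the dashboard are not shown in this task, so treat this as an assumption: the metric paths (including the "MediaWiki." prefix, the ".sum" suffix and the name of the conflicts counter) are hypothetical, and only the Graphite functions summarize() and divideSeries() are taken from the discussion above.

```python
def percent_target(resolved_metric: str, conflicts_metric: str,
                   interval: str = "1d") -> str:
    """Build a Graphite target dividing two series after summing each
    of them into one data point per `interval`."""
    def summed(metric: str) -> str:
        return f"summarize({metric}, '{interval}', 'sum')"

    return f"divideSeries({summed(resolved_metric)}, {summed(conflicts_metric)})"


# Hypothetical metric paths, for illustration only.
print(percent_target(
    "MediaWiki.TwoColConflict.conflict.resolved.sum",
    "MediaWiki.TwoColConflict.conflicts.sum",
))
```

Note that divideSeries() yields a ratio between 0 and 1; displaying it as a percentage would still need a matching unit setting on the panel or a scaling step.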

I have also added a line chart to the bottom of the dashboard showing the % conflict resolutions in mediawiki using the TwoColConflict extension vs the default solution.
This also adjusts with the dashboard time range so old data can be looked at.

Addshore moved this task from Backlog to Needs Review on the User-Addshore board.

@Addshore Thanks for picking this up!

Addshore closed this task as Resolved. · Aug 21 2017, 9:10 AM

I'm going to close this as done, although I guess it still needs final review & sign-off.