
"Age of unreviewed changesets by affiliation" shows negative number of changesets
Closed, Resolved (Public)

Description

In "Age of unreviewed changesets by affiliation" at http://korma.wmflabs.org/browser/gerrit_review_queue.html, the popup window shows the number of patches counted in the median for each affiliation. However, many of the numbers provided are negative, which doesn't make any sense.

I had already reported via email that those numbers looked very low considering the amount of changesets uploaded every month. This was when the numbers were still positive. Seeing negative values suggests that there is indeed something wrong there.


Version: unspecified
Severity: normal
URL: http://korma.wmflabs.org/browser/gerrit_review_queue.html

Details

Reference
bz70600

Event Timeline

bzimport raised the priority of this task from to High. Nov 22 2014, 3:45 AM
bzimport set Reference to bz70600.
bzimport added a subscriber: Unknown Object (MLST).

Marking High importance because I would really like to leave the Gerrit Review Queue page absolutely clean.

In fact, this graph should show the median age of their most recent uploads, just like we did with "Age of open changesets" and "Ranking of repositories".
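To make the requested metric concrete, here is a minimal sketch of "median age of the most recent upload" per affiliation, together with the count of changesets behind each median. The records, the snapshot date, and the function name `median_age_by_affiliation` are made up for illustration; this is not the code behind korma.wmflabs.org.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical open changesets: (affiliation, date of most recent patchset upload).
open_changesets = [
    ("WMF", date(2014, 8, 1)),
    ("WMF", date(2014, 8, 21)),
    ("Independent", date(2014, 7, 2)),
    ("Independent", date(2014, 8, 11)),
    ("Independent", date(2014, 8, 29)),
]


def median_age_by_affiliation(changesets, snapshot):
    """Return {affiliation: (median age in days, number of changesets)}."""
    ages = defaultdict(list)
    for affiliation, last_upload in changesets:
        ages[affiliation].append((snapshot - last_upload).days)
    return {aff: (median(days), len(days)) for aff, days in ages.items()}


result = median_age_by_affiliation(open_changesets, date(2014, 8, 31))
for affiliation, (age, count) in result.items():
    print(f"{affiliation}: median age {age} days over {count} changesets")
```

Computed this way, the count shown in the popup is simply the number of changesets that went into the median, so it can never be negative.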

Finally, could you replace "scr_review_time_pending_ReviewsWaitingForReviewer_days_acc_median" with a human-readable string, please?

Ok, the new metric (upload) is now also generated for companies, so we can move to this metric and then fix the number of open issues per company using it.

The new metric already has a human-readable label, so the other error will be fixed as well.

As soon as it is available I will update this ticket.

Quim, problem fixed. Now the pending number per company is right. The problem was that we were showing the net new pending changesets per month, not the total pending.
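A small sketch of the difference between the two numbers, using invented monthly figures for a single affiliation (the values and variable names are assumptions, not data from the dashboard). The net change in a month goes negative whenever more changesets are closed than opened, while the total pending is the running sum of those changes:

```python
from itertools import accumulate

# Hypothetical monthly figures for one affiliation.
opened = [5, 3, 2, 4]   # changesets that entered the review queue each month
closed = [2, 4, 4, 1]   # changesets merged or abandoned each month

# Net change per month: negative when more changesets close than open.
net_per_month = [o - c for o, c in zip(opened, closed)]

# Total pending at the end of each month: running sum of the net changes.
total_pending = list(accumulate(net_per_month))

print(net_per_month)   # [3, -1, -2, 3]
print(total_pending)   # [3, 2, 0, 3]
```

If the popup shows `net_per_month`, negative values appear even though the queue itself never holds fewer than zero changesets.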

Now there are no negative numbers, but they still look low. For August we have:

Unknown: 66
Wikimedia Foundation: 55
Independent: 123
Wikimedia Deutschland: 29
TOTAL: 273

However, according to "Volume of open changesets", in August there are 765 waiting for review.

765 - 273 = 492

Where are the 492 missing changesets?

By the way, now we have "scr_review_time_pending_upload_ReviewsWaitingForReviewer_days_acc_median", which is still not very readable.

Álvaro, when do you think we will have a solution for this problem? This is the last bug in http://korma.wmflabs.org/browser/gerrit_review_queue.html stopping me from promoting this page widely.

I plan to review all labels and help texts today, and then work on this issue. So expect a solution, or an explanation of the numbers, within the next two days!

Cheers

Quim, the number of reviews is now correct.

The main problem was that there were new identities not linked to Unknown.

With this fixed, the numbers are much closer.

We have changed the metric for affiliations to "Time from last patchset". Is that ok, or do you prefer "Time from submission"?

With this metric:

Unknown: 666
WMF: 36
Independent: 10
WMD: 55
WikiWorks: 0

total: 767.

According to the graph above, the number of revisions for "Time from last patchset" should be 816. For other dates the discrepancy is smaller.

A deeper analysis could be done, but I am pretty confident the data is correct.

Progress!

It is strange to find so many Unknown. Any developer with a @wikimedia.org email address is a WMF employee. Any developer with a @wikimedia.de address is a WMDE employee. Are we applying these rules?
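The two rules above can be sketched as a simple domain lookup. This is only an illustration: the dictionary, the function name `affiliation_for`, and the sample addresses are hypothetical, and the real identity mapping may use other sources as well.

```python
# Hypothetical lookup table; only these two domain rules come from this thread.
DOMAIN_AFFILIATION = {
    "wikimedia.org": "Wikimedia Foundation",
    "wikimedia.de": "Wikimedia Deutschland",
}


def affiliation_for(email: str) -> str:
    """Return the affiliation implied by the email domain, else 'Unknown'."""
    _, sep, domain = email.strip().lower().rpartition("@")
    if not sep:
        # No "@" in the string, so there is no domain to match.
        return "Unknown"
    return DOMAIN_AFFILIATION.get(domain, "Unknown")


# Made-up addresses for illustration.
for address in ["dev@wikimedia.org", "dev@wikimedia.de", "dev@example.com"]:
    print(address, "->", affiliation_for(address))
```

Anything that falls through to "Unknown" would still need manual mapping.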

Quim, reviewed the mapping, now we have:

Unknown 479
WMF 206
Individual 124
WMD 8
Wikia 26

total: 843.

In any case, we need to improve the mapping to have fewer Unknown people.

Thanks. Hm, after this change, Wikia has a very different line from all the rest. Are we still looking at "waiting for reviewer"? If so, this is worth looking at. Before there was a link to the raw data, but now I can't find it. Could you paste somewhere the gerrit change numbers that are open and owned by Wikia?

Also, if you paste a list of the unknown at https://www.mediawiki.org/wiki/Talk:Community_metrics , we will help identify the affiliation of these contributors.

Qgil lowered the priority of this task from High to Low. Jan 8 2015, 11:25 AM
Qgil raised the priority of this task from Low to Medium. Mar 13 2015, 1:27 PM
Qgil added a subscriber: Dicortazar.

Ok, here the suspicion was that the values of previous months changed as the graph was updated at the end of each month. Silly me, I didn't capture the current data to check those suspected changes... I have captured the data of the last 6 months now.

The graph is stuck in January. @Dicortazar, can you kick it so it is updated to February, please? Then I will check whether the current values for previous months stay. If everything is correct, we can resolve this task.

Those are in February indeed.

Do you mind double checking this? In addition, there's also information for March (incomplete), although given that there are two lines in the chart, we typically wait till the end of the month to show such info.

In my browsers the graph ends in January:

Screenshot_from_2015-03-17_12:33:48.png (900×1 px, 159 KB)

Also in mine. Checking this! I thought you were referring to a previous one.

In my browsers the graph ends in January:

... and this is the case still today. This task was planned for March. Do you have an estimated delivery time?

The dataset is now updated, but the pop-up does not work. Working on this now.

As an estimate, this should be done by the end of April.

There are still negative numbers.

And also: the title of this graph says "affilation", "scr_review_time_pending_upload_ReviewsWaitingForReviewer_days_acc_median" should be a human-readable string, and a "?" explaining the details of the metric should be added.

The WMF Annual Plan 2015-16 includes a goal related to this task:

Set and monitor code review KPIs for all community-sourced contributions

Solving this task will help identify "community-sourced" contributions.

http://korma.wmflabs.org/browser/gerrit_review_queue.html

Fixed the name of the metric and the number of pending changesets per affiliation. No more negative numbers (we were showing the net changesets pending, not the total changesets pending used to compute the age).

The help is not shown yet ... we are debugging a problem with it.

'GitHub' in the list of affiliations for "Age of open changesets by affilation (monthly snapshot)" looks ... wrong, and it is especially bad that GitHub rises to become the second highest in May 2015...?

Sorry, I have to reopen. In July, FSF has -2 and Deutsche Telekom has -1. This means that the algorithm is still wrong.

Note that if T100189 had been fixed, we probably would not have seen these negative numbers... How can we trust that WMF is today "49.4"?

This is an important problem.

@Dicortazar: Any news here when it comes to fixing the underlying problem?

Aklapper raised the priority of this task from Medium to High. Aug 28 2015, 9:36 AM

Fixed. The problem was with the library version for viz. We updated to the latest version, which includes the fix already made for this issue, and now everything is working at

http://korma.wmflabs.org/browser/gerrit_review_queue.html

In T72600#1588421, @Acs wrote:

Fixed. The problem was with the library version for viz.

Thanks a lot! I'm closing this task as resolved.

I've filed some followup tasks for other smaller/minor issues I ran into with that chart: