Page MenuHomePhabricator

Allow to filter by Gerrit label 'code review' (on the latest patchset) in the frontend
Closed, ResolvedPublic0 Estimated Story Points

Description

Implementation as of December 2019

On any Gerrit dashboards on https://wikimedia.biterg.io,

  • set status:"NEW" AND status_value:"1" to get data about open changesets with the CR+1 label set for some (!) of its patchsets
  • set status:"NEW" AND status_value:"-1" to get data about open changesets with the CR-1 label set for some (!) of its patchsets
  • set status:"NEW" AND !_exists_:status_value to get data about open changesets without a CR label (CR=0) set for some (!) of its patchsets

Event Timeline

Probably means splitting NEW in the screenshot below into Open && CR0 (Unreviewed), Open && CR+1, Open && CR-1, Open && CR-2, and maybe Open && CR+2. Also wondering if the separate Verified label should be ignored or not, because you may not want to review a CR0 && V-2 patch?

For the records, in the backend, Perceval is currently retrieving that field in the approvals array for a Gerrit change under value, as Kibana's "Discover" shows.

Aklapper raised the priority of this task from Low to Medium.Jun 20 2019, 8:38 AM
Aklapper renamed this task from Provide Gerrit status values (labels) for patchsets in the frontend to Allow to filter by Gerrit label 'code review' in the frontend.Aug 5 2019, 8:55 PM
Aklapper updated the task description. (Show Details)

There is a patch at https://github.com/chaoss/grimoirelab-elk/pull/679 :

In a nutshell, for each changeset, the last approval value that meets the conditions below is retrieved and added to the enriched item as status_value.

  • the approval is not of type Verified
  • the patchset author must be different from the approver

Reporting progress on the backend here:
Looking at https://wikimedia.biterg.io/goto/326470f410eaa0a5fa1808e56ad037ac (url:"https://gerrit.wikimedia.org/r/533713"), comparing with https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Scribunto/+/158323/ one can see the approval_value changing over time:

Now to sort out having this in the frontend and being able to filter on it; wondering if there is a better approach than T224755#5227113.

Quoting Bitergia folks:

[...] there will be a new field status_value showing the latest status of each changeset. From that, you could do such splitting by filtering in NEW changesets and then focusing on status_value.

It is also possible to specify filters at bucket level, so you can then define things like:

  • NEW (+1)
  • NEW (-1)
  • Merged
  • ...

On the Gerrit dashboards, it is now possible to filter on the status_value field (which resembles the Gerrit CR label). Examples are:
status_value:"1" AND status:"NEW" or status_value:"-1" AND status:"NEW". However, still trying to find out how to query for CR=0. Stay tuned.

Alright, after playing a bit with the Gerrit index in "Discover" plus rtfm, status:"NEW" AND !_exists_:status_value seems to resemble CR=0.

Pretty much done, only keeping this open because of a small data inconsistency that I don't understand yet.

only keeping this open because of a small data inconsistency that I don't understand yet.

Quoting non-public conversation (with some minor editing by myself, to understand better):

What I did on 2019-12-17:

Reply by Bitergia in https://gitlab.com/Bitergia/c/Wikimedia/support/issues/61 :

if you look at https://wikimedia.biterg.io/goto/30abf05cefb0ae682bbcd4cc1775c080 you can see that the latest code review for PS19 is -1 (and PS21 has no review yet as of 2019-12-17). The rest of +2 are not code reviews but bot verifications. In fact, we ignore these in the dashboard as they are not code reviews.
If you go to Approvals dashboard (https://wikimedia.biterg.io/goto/32751e02e11dabd05b85c3289bdeb73c) you'll see there are no +2 reviews unless you disable bots filter. That's why the current status for this changeset is -1.

What I replied on 2019-12-20:

Indeed CR-1 was the value set for previous Patch Set 19 (and the last CR value ever set), and the latest Patch Set 21 has no CR value yet on https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Flow/+/477989/ :
So I guess my question is: How to phrase a search query to list all changesets whose latest Patch Set does not have a CR value yet? Asking as I am after identifying changesets whose latest Patch Set is awaiting a code review.

Reply by Bitergia in https://gitlab.com/Bitergia/c/Wikimedia/support/issues/61 :

I'm not sure if this can be done. We face a similar problem when looking for the time to first review:

Time to first review data is already available in the enriched index (time_to_first_review for changeset and patchset_time_to_first_review for patchsets).
Note that in some cases the approval date is earlier than the creation date of the changeset/patchset. This happens because there is no approval after a new patchset. The previous approvals are copied to the new patchset, although they come from the previous patchset (this is not something we do but something that comes from the server). To overcome these cases and avoid negative numbers, we skip approval dates previous to patchset date to calculate the time to first review.

We could look for -patchset_time_to_first_review:* AND type:patchset and that should give us the patchsets without later review (not 100% sure as it gets trickier on each step :P). Something like: https://wikimedia.biterg.io/goto/dd92a4912a3c1c4cecb81dca3855ba1b
Please review this and if you still not find what you need, open a new issue when this specific use case so we can have a thread dedicated to it (in case we need to make any modification in the index to support it)

...and putting my notes from an impromptu meeting with Bitergia at FOSDEM 2020 here, so I don't need to search them again:

Discussed https://gitlab.com/Bitergia/c/Wikimedia/support/issues/61 - Gerrit API returns somehow CR label for first patchset when querying for second patchset - time is from first patchset to actual review. Andre would like to find/filter on latest patchsets without review while previous versions had a review. Maybe to filter on patchsets without any value? Two possible approaches here. Andre to define a better use case. https://phabricator.wikimedia.org/T224755#5813556

Looking into this again and posting my notes in the next comments, so this time my future self might understand my present self:

"Problem" / wrong expectation: Currently, the CR value is the last non-empty value. Given a changeset, the query might show a CR label from a previous patchset version of that changeset, even when the latest patchset has no CR label value. It also ignores CR labels given in previous patchset versions which still apply to the latest patchset. Steps to see this:

Note for internal tracking: Moved from https://gitlab.com/Bitergia/c/Wikimedia/support/-/issues/61 (Allow to filter by Gerrit label 'code review') to https://gitlab.com/Bitergia/c/Wikimedia/support/-/issues/79 (Allow filtering by Gerrit code review label for latest patchset)

I'm still quite puzzled by current behavior. This comment might also make me look slightly stupid if it turns out that I missed something obvious. :)

I took five open changesets in Gerrit. (Reminder: A changeset has one or more patchsets; a new patchset updates the previous patchset.) As of now,

  • AbuseFilter: 80159 always had CR=0 --- Expectation: CR+0.
  • Flow: 477989 with CR+0 in latest PS21, but had CR-1 in PS19. --- Expectation: CR+0
  • MW Core: 394753 has CR+1 in latest PS4 and still CR-1 from PS2 --- Expectation: CR+1 and CR-1.
  • PageForms: 409647 has CR+1 still from PS1 --- Expectation: CR+1
  • Puppet: 390402 has CR+1 in latest PS. --- Expectation: CR+1.

My goal is to e.g. to filter on changesets whose current latest patchset needs a code review (or, inverted, have a code review on the very latest patchset).

Query results seem to always only take the "latest" value which is not empty into account.

Now let's see what we get with different queries..

  • Go to https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit-Timing or https://wikimedia.biterg.io/app/kibana#/dashboard/8c515590-e1de-11e8-8aac-ef7fd4d8cbad , and set timeframe to Last 10 years (yes, ten). "Changesets Only" filter should be enabled by default.
  • Enter (changeset_number:477989 OR changeset_number:80159 OR changeset_number:409647 OR changeset_number:394753 OR changeset_number:390402) AND status_value:"1"
    • Get three items listed: operations/puppet, mediawiki/extensions/PageForms, mediawiki/core
  • Enter (changeset_number:477989 OR changeset_number:80159 OR changeset_number:409647 OR changeset_number:394753 OR changeset_number:390402) AND status_value:"1" AND !(status_value:"-1")
    • Get three items listed: operations/puppet, mediawiki/extensions/PageForms, mediawiki/core
    • [Q] Why is MW Core listed, which has a CR+1 in its latest PS4 but also still has a CR-1 from old PS2? Only latest value picked?
  • Enter (changeset_number:477989 OR changeset_number:80159 OR changeset_number:409647 OR changeset_number:394753 OR changeset_number:390402) AND status_value:"-1"
    • Get one item listed: mediawiki/extensions/Flow
    • [Q] Same as before but inverted: Why is MW Core not listed? Only latest value picked?
  • Enter (changeset_number:477989 OR changeset_number:80159 OR changeset_number:409647 OR changeset_number:394753 OR changeset_number:390402) AND !_exists_:status_value
    • Get one item listed: As expected: AbuseFilter which never had a CR value, but not the Flow item whose latest PS has CR=0 but a previous PS had CR-1.
  • Enter (changeset_number:477989 OR changeset_number:80159 OR changeset_number:409647 OR changeset_number:394753 OR changeset_number:390402) AND -patchset_time_to_first_review:* AND type:patchset
    • Get zero items listed as the default "Changesets Only" filter kind of collides with type:patchset
  • Enter (changeset_number:477989 OR changeset_number:80159 OR changeset_number:409647 OR changeset_number:394753 OR changeset_number:390402) AND -patchset_time_to_first_review:*
    • Get all five items listed.
    • Confusing. [Q] Maybe it checks if for any of the patchsets the patchset_time_to_first_review condition is true (which is always true)?

So I currently don't see a way to query only on the "current" CR status as of the latest patchset for a changeset.
(Also wondering what that means for time to review.)

Aklapper updated the task description. (Show Details)

Quick update: Bitergia were so kind to create a custom dashboard C_last_patchset_first_changeset_enhanced at https://wikimedia.biterg.io/app/kibana#/visualize/edit/52d59ed0-8579-11ea-8a35-09ce380aa6b8

The patchset that don't have a time to first review have status pending, while the rest are labeled as reviewed.

https://github.com/chaoss/grimoirelab-elk/pull/870 ("Propagate changeset status to patchsets") got merged, so the status and status_value attributes of a changeset get copied to its corresponding patchsets. This allows to compare the time to first review for patchsets belonging to a given type of changesets (e.g.,
merged, abandoned or new).

Aklapper renamed this task from Allow to filter by Gerrit label 'code review' in the frontend to Allow to filter by Gerrit label 'code review' (on the latest patchset) in the frontend.May 7 2020, 11:18 AM

Spent a few hours playing with this. Asked a followup question internally but we might have to set up custom visualizations for this IIUC.

My notes from June 12th, please ignore (only posting as I don't want to lose them):

Time to verify (and try to understand all this magic).

  • Reminder: A changeset has at least one patchset.
  • type:"changeset" continues to have the status_value field (e.g. -1) which does not necessarily refer to the last patchset (underlying problem here).
    • Going to https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit-Timing and setting time frame in upper right corner to Last 5 years and keeping Changesets Only and filtering on url:"https://gerrit.wikimedia.org/r/477989" AND status:"NEW" AND status_value:"-1" lists one item. This is what surprised me earlier, as the last patchset (#21) does NOT have CR-1 though we filter on status_value:"-1".
  • As written in the previous comment, the enriched data for type:"patchset" now contains the field changeset_status (e.g. "NEW").
  • type:"changeset" has always had the status field. This is always in sync with the new field changeset_status for type:"patchset".

Data we have in Gerrit for the testcase: Go to https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Flow/+/477989/ and see that

  • that changeset is open
    • in Grimoirelab that means: changeset_status:"NEW" for type:"patchset", and also status:"NEW" for type:"changeset"
  • 21 patchsets are in that changeset
  • the last patchset (version 21) has no Code-Review value
  • 5 patchsets have Code-Review-1 and all are by sbisson
  • 12 patchsets have Verified-1 (but not relevant here)

Behavior on wikimedia.biterg.io: Go to https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit-Timing and set time frame in upper right corner to Last 5 years and disable default Changesets Only and:

  • set url:"https://gerrit.wikimedia.org/r/477989" AND type:"patchset" AND changeset_status: "NEW" (this uses the "new" field)
    • Lists 21 changesets
    • Note to myself: "Changesets" (gerrit_all_changesets) data table is empty here because of type:"patchset"
  • set url:"https://gerrit.wikimedia.org/r/477989" AND changeset_status: "NEW"
    • Lists 27 changesets (22 by takidelfin and 5 by sbisson). "Changesets" (gerrit_all_changesets) data table lists the one item.
  • set url:"https://gerrit.wikimedia.org/r/477989" AND changeset_status: "NEW" AND !_exists_:patchset_time_to_first_review
    • Lists 22 changesets (17 by takidelfin, 5 by sbisson). "Changesets" (gerrit_all_changesets) data table lists the one item.
  • set url:"https://gerrit.wikimedia.org/r/477989" AND changeset_status: "NEW" AND !_exists_:patchset_time_to_first_review AND type:"patchset"
    • Lists 16 changesets (16 by takidelfin) as expected.

So much for confusing myself.

Now try the original test case in T224755#5813556 (Goal: List only those changesets whose last patchset has a CR-1 value):

(Note to myself: Going to https://wikimedia.biterg.io/app/kibana#/discover and querying for changeset_number:"477989" AND patchset_number:"21" (which is the latest patchset in that changeset), the two results show that changeset_status_value is -1 (that's what once upon a time was set on an older patchset but not on this very patchset) and that patchset_time_to_first_review is -.)

To quote myself from the internal ticket:

The use cases are:

  • Which open (NEW) changesets have no CR label on their latest patchset and hence need a review?
  • Which open (NEW) changesets have a CR+1 label on their latest patchset and hence need a merge decision?
  • Which open (NEW) changesets have a CR-1 label on their latest patchset and hence need rework?

Am I correct that achieving these three use cases will always require creating custom new visualizations (like C_last_patchset_first_changeset_enhanced) which have to use top hit aggregation on the patchset_time_to_first_review field, and that there is no way to achieve these three use cases on existing default dashboards (like e.g. Gerrit-Timing) and their default visualizations?

Created a custom dashboard at https://wikimedia.biterg.io/app/kibana#/dashboard/5e903de0-bdd0-11ea-8358-4d35848e335d which lists only open patchsets with CR=0, CR-1 and CR+1, by adjusting some visualizations created by Bitergia (in a non-public internal ticket, thanks!).
Anyone can filter on patch author, author affiliation, and code repository.

Probably welcomes additional overview (sum) statistics but can't remember the exact intention of this task anymore anyway. Shrug.
Feel free to create followup tasks if something else is wanted.

Documented at https://www.mediawiki.org/w/index.php?title=Community_metrics&type=revision&diff=3975591&oldid=3951831