Allow excluding bot responses (e.g. jenkins-bot V+2) when calculating time to first review for Gerrit patches
Closed, InvalidPublic
Actions

Assigned To

None

Authored By

	Aklapper
	Jan 16 2020, 1:06 PM

Description

In a recent discussion about developer productivity metrics, we looked at the gerrit_main_numbers visualization on the Gerrit Timing dashboard (same on Gerrit Overview), which displays the field time_to_first_review labeled "Median Time First Review (Days)".
We were surprised about the good values.
We wondered if it might measure how fast our CI is instead of the actual median time of a first human review. Looking at get_time_first_review and get_time_first_review_patchset in grimoire_elk/enriched/gerrit.py does not exclude bots.

Then we found that the value reviewer_bot exists (defined as ./grimoirelab-elk/schema/gerrit.csv:reviewer_bot,boolean,true,"True if the given patchset reviewer is identified as a bot.")
Filtering on reviewer_bot:false on the Gerrit Timing dashboard showed "No results found" and left all widgets empty.
After disabling the default Changesets only filter, data is displayed. However the item "Median Time First Review (Days)" remains completely empty.

Related Objects
Search...

Status	Subtype	Assigned	Task
Open	Feature	None	T248810 Create custom dashboard to visualize code review responsiveness per engineering team
Invalid		None	T242964 Allow excluding bot responses (e.g. jenkins-bot V+2) when calculating time to first review for Gerrit patches
Resolved		Aklapper	T190251 Offer "time to first review" data for patches

Event Timeline

Please ignore: Notes to myself:

gerrit_median_time_open_gauge visualization on the Gerrit Efficiency dashboard is instead about merge (not: review) and displays the field painless_delay as a median. For the definition of painless_delay, see T224760#5732232.
Time to first review data is in the enriched Gerrit index: time_to_first_review for changesets; patchset_time_to_first_review for patchsets). (quoted from internal https://gitlab.com/Bitergia/c/Wikimedia/support/issues/62#note_239688407 )

Task description all looks right to me. I guess this task is based on a couple of assumptions of mine that I'd like to lay out:

time_to_first_review is a field on an "enriched" field that is added to an item before entry into elasticsearch
The corollary to the first assumption (in my mind anyway): any filtering done via kibana will NOT change the time_to_first_review of a given patch
- that is, adding "author_bots is not True" filter will not change the aggregate metrics of time_to_first_review EXCEPT that any patches made by bots will not be represented
The enrichment being applied to a patchset to derive time_to_first_review is the one I found on github
The enrichment's qualifier for what counts as a first review is an approval not by the owner of a patchset (see again github)
A V+2 with a comment counts as an approval

If all the above is correct (which it might not be in some material way) it is probably skewing our numbers a bit as the jenkins-bot user leaves a V+2 with a comment on many of our repos that are setup for CI.

Aklapper added a parent task: T248810: Create custom dashboard to visualize code review responsiveness per engineering team.Mar 30 2020, 3:44 AM

Aklapper mentioned this in T248811: WMF to possibly have a meeting with Bitergia to discuss bigger long-term plans.Mar 30 2020, 3:34 PM

Aklapper added a subtask: T190251: Offer "time to first review" data for patches.Apr 16 2020, 4:17 PM

Aklapper renamed this task from How to exclude non-human bot responses when calculating time to first review for Gerrit patches to Allow excluding bot responses (e.g. jenkins-bot V+2) when calculating time to first review for Gerrit patches.Apr 17 2020, 4:58 PM

Aklapper triaged this task as Medium priority.

For the records, I created an internal non-public ticket for clarification with Bitergia as https://gitlab.com/Bitergia/c/Wikimedia/support/-/issues/78

@thcipriani: Looks like our assumptions are incorrect.

Quoting non-public https://gitlab.com/Bitergia/c/Wikimedia/support/-/issues/78 , in short:

Only the approvals of type code review are considered. In other words, bots are excluded with the condition if approval['type'] != CODE_REVIEW_TYPE: continue in https://github.com/chaoss/grimoirelab-elk/blob/master/grimoire_elk/enriched/gerrit.py#L540 and https://github.com/chaoss/grimoirelab-elk/blob/master/grimoire_elk/enriched/gerrit.py#L581

Regarding filtering on reviewer_bot:false not working:

This is due to the fact that the index consumed by the dashboard contains different types of documents: comment, approval, patchset and changeset.
The dashboard https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit shows information about the changesets, and the visualizations in it are tailored to this type of documents. reviewer_bot exists only on comments: https://github.com/chaoss/grimoirelab-elk/blob/master/grimoire_elk/enriched/gerrit.py#L327.

Interesting! It sounds like these numbers that seemed to good to be true are accurate.

For fun I was able to confirm at least one patchset: https://gerrit.wikimedia.org/r/589370/ shows a patchset_time_to_first_review of 0.09.

Looking at the git -C mediawiki/tools/release.git log --format=raw refs/changes/70/589370/meta I see a CR+2 at 1587066393 and a creation time at 1587058288:

units $(calc '1587066393-1587058288' | head -1)seconds days
	* 0.09380787

I guess we can call this task invalid since we already exclude bots.

Allow excluding bot responses (e.g. jenkins-bot V+2) when calculating time to first review for Gerrit patchesClosed, InvalidPublicActions

Description

Related ObjectsSearch...

Event Timeline

Allow excluding bot responses (e.g. jenkins-bot V+2) when calculating time to first review for Gerrit patches
Closed, InvalidPublic
Actions

Related Objects
Search...