
Define code review metrics
Open, Needs Triage · Public

Description

As per the Code Review work group meeting discussions, we'd like to define a set of metrics (probably one or two to start with) to help us establish a baseline to improve upon. Although these metrics are focused primarily on responsiveness, we want to ensure that we encourage sound code review processes and outcomes as well.

Behaviors to be mindful of:

  • Quick initial response with slow follow-through.
  • No response from patch submitter.
  • Merging everything in order to be "responsive".
  • ...

Ultimately these metrics will be available on some form of Code Stewardship dashboard to help Code Stewards manage review requests and resources.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jul 31 2019, 11:02 PM

For the Clinic Duty team, I'm interested in looking at some of these metrics:

  • Origin point of CR requests
  • CPT members added as reviewers
  • When CPT added as a reviewer vs when first action was taken
  • When CR task lands in CPT inbox vs first action taken
  • Time to first review
  • Turnaround time between a review being posted and the changeset being updated or abandoned
  • Average number of rounds of CR per changeset
  • Unique priorities of tasks associated with CRs
  • CRs broken down by code modules

An important question for me is where the CRs originate, but also what their content is, which would play into the number of rounds of CR needed: are they complex features that require feedback, more straightforward bugs, or simple bugs that end up spawning conversations within the CR? Ideally, I would like to be able to capture the general chronology of a CR and its associated task, as well as understand which code modules are being looked at. Ultimately, it would be really interesting to capture data on CRs that had bugs associated with them after merge.
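To make "time to first review" concrete: a minimal sketch, assuming Gerrit's documented REST API (`/changes/` with the `MESSAGES` and `DETAILED_ACCOUNTS` options; the example query and the lack of bot filtering are simplifications to adapt):

```python
import json
from datetime import datetime

import requests

GERRIT = "https://gerrit.wikimedia.org/r"
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def fetch_changes(query, limit=100):
    """Fetch ChangeInfo records, stripping Gerrit's ")]}'" XSSI prefix."""
    resp = requests.get(
        f"{GERRIT}/changes/",
        params={"q": query, "n": limit, "o": ["MESSAGES", "DETAILED_ACCOUNTS"]},
    )
    resp.raise_for_status()
    return json.loads(resp.text.lstrip(")]}'\n"))

def parse_ts(ts):
    # Gerrit emits nanosecond precision; trim to microseconds for datetime.
    return datetime.strptime(ts[:26], TS_FORMAT)

def time_to_first_review(change):
    """Seconds from submission to the first non-owner message, or None.

    Note: CI/bot accounts would need filtering here to measure *human*
    review responsiveness.
    """
    created = parse_ts(change["created"])
    owner = change["owner"].get("_account_id")
    for msg in change.get("messages", []):
        author = msg.get("author", {}).get("_account_id")
        if author is not None and author != owner:
            return (parse_ts(msg["date"]) - created).total_seconds()
    return None

changes = fetch_changes("project:mediawiki/core status:open")  # example query
lags = [time_to_first_review(c) for c in changes]
reviewed = [lag for lag in lags if lag is not None]
print(f"{len(reviewed)}/{len(lags)} changes have a first response")
```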

I'd be curious if anyone has thoughts on these or suggestions of other metrics that might be worth targeting.

For the record, regarding https://wikimedia.biterg.io, the following additional functionality for the Gerrit stats is planned to be added in the next 12 months:

  • Offer "time to first review" data for Gerrit patchsets
  • Offer "Avg. Time Open (Days)" average value for Gerrit patchsets per affiliation/organization
  • Allow filtering by the Gerrit label 'code review' in the frontend
  • Support Gerrit as a data source in the rest of the dashboards under the "Community" Tab (and not only Git)

For discussion and comments, please use the specific task(s): https://phabricator.wikimedia.org/maniphest/?ids=190251,224755,225894,224760#R

On a related note: For those who like to play with APIs: I've now documented using the REST API of wikimedia.biterg.io at https://www.mediawiki.org/wiki/Community_metrics#REST_API (which might be an alternative to playing with the Gerrit API). If anyone is interested, drop me a line (as it requires authentication).
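To give a flavor of what such a query could look like: a hedged sketch, assuming the backend is a GrimoireLab-style Elasticsearch index (the endpoint path, index name, and field names below are assumptions; check the documentation linked above for the real ones). It counts unique Gerrit change authors over the last 30 days:

```python
import requests

# Assumed GrimoireLab-style enriched index and fields; verify against the
# Community_metrics documentation before relying on these names.
query = {
    "size": 0,
    "query": {"range": {"grimoire_creation_date": {"gte": "now-30d"}}},
    "aggs": {"submitters": {"cardinality": {"field": "author_uuid"}}},
}
resp = requests.get(
    "https://wikimedia.biterg.io/data/gerrit/_search",  # assumed endpoint path
    json=query,
    auth=("user", "password"),  # authentication required; see the docs above
)
resp.raise_for_status()
print(resp.json()["aggregations"]["submitters"]["value"])
```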

Awesome, thank you! I'll take a look at that today.

Jdforrester-WMF added a comment. (Edited) · Fri, Aug 30, 9:08 PM

As promised, here's my first-cut proposal, just to get the ball rolling:

Proposed first cut of metrics

| Metric | Purpose | Cuts | Example question |
| --- | --- | --- | --- |
| Submitters of patches | Raw development participation / throughput count | By project, submitter, submitter class, submitter group, time window | How many people have written patches in the past month? |
| Patches submitted | Raw development participation / throughput count | By project, submitter, submitter class, submitter group, time window | How many patches are being written by WMDE staff each week? |
| Patches with a response | Review health | By project, submitter, submitter class, submitter group, comment type, initial comment lag, latest comment lag, time window | How long do patches written by WMF staff take before getting review from WMF staff? |
| Patches with an outcome | Review health | By project, submitter, submitter class, submitter group, outcome type, initial comment lag, commenter / group / class, outcome lag, outcome decider / group / class, time window | How long do patches written by volunteers take before getting merged by WMDE staff? |
| Cut | Meaning | Examples |
| --- | --- | --- |
| time window | Cut of this data by arbitrary time window | time window |
| project | What repo? | MediaWiki; Vector; Puppet; Config |
| project class | What kind of repo? | Maintained code; Maintained config; Unmaintained |
| submitter | Who wrote the patch? | Jdforrester |
| submitter class | Affiliation of submitter | WMF; WMDE; Hallo Welt!; Independent; … |
| submitter group | Sub-affiliation of submitter where appropriate | Parsing; Wikidata; Fundraising; SRE; N/A |
| initial comment lag | Time between submission and first response ("Time to first review") | time span |
| latest comment lag | Time since latest non-submitter comment | time span |
| commenter | Who is the first commenter? | Jdforrester |
| commenter class | Affiliation of commenter | as above |
| commenter group | Sub-affiliation of commenter | as above |
| outcome lag | Time from submission to outcome ("Time to merge") | time span |
| outcome type | What happened to the change | Still open; Abandoned; Merged |
| decider | Who is the decider? | Jdforrester |
| decider class | Affiliation of decider | as above |
| decider group | Sub-affiliation of decider | as above |
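One way to operationalize these cuts is a flat per-patch record from which every cut in the table can be derived. A minimal sketch (field names mirror the tables above; everything else is illustrative):

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class PatchRecord:
    project: str                     # what repo, e.g. "MediaWiki"
    project_class: str               # "Maintained code" / "Maintained config" / "Unmaintained"
    submitter: str
    submitter_class: str             # affiliation: "WMF", "WMDE", "Independent", ...
    submitter_group: Optional[str]   # sub-affiliation, e.g. "Parsing"; None if N/A
    created: datetime
    first_response: Optional[datetime]  # None => no response yet
    outcome_type: str                # "Still open" / "Abandoned" / "Merged"
    outcome_at: Optional[datetime]   # when the outcome happened

    @property
    def initial_comment_lag(self):
        """Time to first review, or None if unreviewed."""
        if self.first_response is None:
            return None
        return self.first_response - self.created

    @property
    def outcome_lag(self):
        """Time to merge/abandon, or None while still open."""
        if self.outcome_at is None:
            return None
        return self.outcome_at - self.created

def cut(records, key):
    """Count records per value of an arbitrary cut,
    e.g. cut(records, lambda r: r.submitter_class)."""
    return Counter(key(r) for r in records)
```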

Proposed first cut of KPIs

  • Stable development practices
    • The number of patches submitted should be roughly stable from month to month.
    • The number of patches reviewed should be roughly stable from month to month.
    • The number of patches merged should be roughly stable from month to month.
  • Growing contributor community
    • The number of patch submitters should gently increase over time.
    • The number of people reviewing patches should gently increase over time.
    • The number of people deciding on patches should gently increase over time.
  • Responsive to code
    • 100% of staff-written patches in maintained code repos should be reviewed within 1 week.
    • 75% of staff-written patches in maintained code repos should have an outcome within 1 week.
    • 100% of volunteer-written patches in maintained code repos should be reviewed within 2 weeks.
    • 75% of volunteer-written patches in maintained code repos should have an outcome within 4 weeks.
    • Age of latest comment on patches in maintained code repos should be less than 4 weeks for all open patches.
  • Responsive to config
    • 100% of patches in maintained config repos should be reviewed within 2 days.
    • 100% of patches in maintained config repos should have an outcome within 2 weeks.
    • Age of latest comment on patches in maintained config repos should be less than 1 week for all open patches.
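Given per-patch records like the PatchRecord sketched above, each responsiveness target reduces to a filtered percentage. A minimal sketch (the `records` collection and the affiliation/repo-class values are assumptions):

```python
from datetime import timedelta

def pct_reviewed_within(records, window, predicate=lambda r: True):
    """Share (0-100) of matching patches whose first response arrived
    within `window`; patches with no response yet count as misses."""
    pool = [r for r in records if predicate(r)]
    if not pool:
        return None  # no data for this cut
    hits = sum(
        1 for r in pool
        if r.initial_comment_lag is not None and r.initial_comment_lag <= window
    )
    return 100.0 * hits / len(pool)

# "100% of staff-written patches in maintained code repos should be
# reviewed within 1 week" then becomes:
score = pct_reviewed_within(
    records,  # a collection of PatchRecord, gathered elsewhere
    timedelta(weeks=1),
    predicate=lambda r: r.submitter_class == "WMF"
    and r.project_class == "Maintained code",
)
print(f"Reviewed within 1 week: {score}%")
```

One wrinkle such a check would need to handle: patches younger than the window that are simply still awaiting review should probably be excluded from the pool, so fresh submissions don't count against the target.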

Concerns not captured in this cut

  • Tie to Phabricator status
  • Lack of follow-up from submitter
  • Rounds of CR before merge
  • Ping-in-Gerrit / ping-in-Phab / ping-in-Scrum-of-Scrums vs. outcome

Regarding https://wikimedia.biterg.io (see https://www.mediawiki.org/wiki/Community_metrics for docs) and Gerrit only, not GitHub: