
Wikimedia Technical Conference 2019 Session: Code Health/Code Review
Closed, Resolved · Public

Assigned To
Authored By
Oct 4 2019, 3:49 PM



  • Track: People and Processes
  • Topic: Code Health/Code Review


Code Health group: Current status, next steps & discussion on best code review practices.

Earlier this year, the Code Health Group started a work group to focus on the challenges that we face regarding Code Review, both within the Wikimedia Foundation and the broader community. The work group was initially tasked with defining the problems that we face and then working towards addressing those problems.

During this session we will share the progress made thus far and look for additional help and guidance in addressing our challenges. One aspect that we've not yet delved into as a work group is defining the characteristics of effective code reviews.

Questions to answer and discuss

Question: What are some of the key challenges faced in managing the code review workload?
Significance: Part of the challenge is simply doing the work. Understanding those barriers will help us devise tools and processes that enable this work to get done.

Question: What do effective code reviews look like?
Significance: Once we are able to do the work, we need to understand how to get as much out of the activity as possible. With all the tools that aid in assessing code quality, it can be easy to overlook the software engineer's most valuable contributions to the process.

Question: How do we measure success (what are the metrics)?
Significance: As with most software engineering activities, we do them because of their benefit (one would hope). Understanding the desired benefit and being able to measure if those benefits are being realized is important.

Related Issues

  • Code Stewardship
  • Code Health Metrics

Pre-reading for all Participants

Session Leader(s)


Session Scribes

Session Facilitator

Session Style / Format

  • Discussion/problem solving. May break out into groups depending on the number of participants.

Post-event summary:

  • Overall it remains difficult for people (both staff and volunteers) to get code reviewed in a timely manner due to:
    • not knowing who should review
    • reviews that are not in line with priorities (OKRs) are delayed/ignored
    • lack of visibility for those that could review
  • Although there are many good examples of individual code review practices, there are few team-based approaches to managing the Code Review process.
    • The Core Platform team has recently adopted a team-based process that has had a positive impact.
  • Key to valuable/successful code reviews is better synchronicity between the vote (+1, +2, -1, etc.) and the comments.
    • "Looks great" with no vote is not helpful.
    • A -1 with no guidance is also not helpful.
  • Measuring success is twofold -
    • How effective is the process of code reviews
      • Time to first comment
      • Time to review completed
    • How effective are the code reviews themselves
      • Closed loop of escaped bugs to see if Code Review should have caught them.

Post-event action items:

  • Collected feedback and ideas to be reviewed by the Code Review WG.
  • Impact of feedback to Code Review WG activities will be shared back to this ticket.

Etherpad Notes

Taken from

  • AK: Welcome to traditional code review session.
  • Programming: History, breakout groups (exchanging practices and routines), breakout groups (effective code reviews), wrap up / next steps.
  • What's already happening, what we're doing, other good ideas and experience to share.
  • Dev satisfaction survey: code review satisfaction is by far the worst graded aspect
  • [rabble rabble]
  • [slide] History
    • Several previous conference sessions
    • Gerrit cleanup day
    • IRC office hours
    • Reminder: Bikeshed'y because it's complex. Social, technical, organizational aspects. Roles: Contributor, reviewer. Try to avoid bikeshedding.
    • Andre wrote some thoughts on the code review on his user page:
  • [slide] Statistics
    • Growth team, 13 Gerrit repos. Of 938 patches in the last 12 months, 58 are neither merged nor abandoned
    • Future work planned for - filtering by gerrit label, time to first review data
  • [slide] Some tools, at least - gerrit project dashboards
    • Might not be well known
  • Breakout 1 - the exchange - 10 minutes
    • What are your routines?
    • What tools do you use to manage code review?
    • What are some successes you've had?
    • Challenges?
    • There'll be a 5 minute presentation.

Breakout team stage right (Tyler, Daniel, Nikerabbit, Kosta, Jeena, Antoine, Gergo)

  • inbox
  • Team dashboards
  • Routine: Look at dashboard once a day
  • Weekly or daily team meeting / standup, raise things that need review
  • "No code review, merge directly into repo" - because many things are minor configuration changes
  • SRE has a clinic duty role and tries to keep an eye on incoming requests/patches - not primarily for patches, but it catches some (for example access requests)
  • DZ: Learn who you have to ping on IRC, who you have to add on gerrit, etc. Wish we didn't have to do the IRC ping thing since it's realtime and not async.
  • Antoine: Challenge - some people won't review code unless you ping them on IRC.
  • Riccardo: Maintainers page
  • DZ: Separate things: the auto thing, ???
  • Mate: Haven't had a high success rate with the automated gerrit review system - would send us people who have gone completely inactive.
  • Riccardo: No way to keep that list updated?
  • Successes:
    • JH: Not that hard when you don't get that many things to review...
  • Challenges:
    • Growth team unmerged patches - of the 58 listed, probably 50 relate to extensions we never use and have never worked on, but we're responsible for them - for me to merge something, I'd have to actually check it out. We're stewards for stuff that we don't actually steward. We don't have time allocated for these things.
    • Finding the right people.
    • Finding the right incentive structure.
    • Filtering all of the incoming requests for review for what you care about.

Breakout team stage left (Timo, Dan, Sam, James, Florian, Fisch, Daniel K)

  • Fisch: we use scrum; we have a Review column on our sprint board, and things requiring review are placed there. This does not necessarily work in all cases. Second practice: we mostly work on extensions, and when working on extension X we subscribe to automatic notifications. We also mention a need for review of our own patches in the daily standups, and also point out the need for review of others' patches
  • JamesF: four permanently opened tabs with saved search: patches by my team, patches not by my team but in repos of my team, patches in repos I care about, patches in repos I care about extended version - I check these multiple times a day; I also am subscribed (via gerrit functionality) to certain parts of certain repositories
  • Sam: I have a code review cabal; I know specific people who will review certain things
  • JamesF: after reviewing patches by individual X, I often check their other patches
  • Dan: we're completely isolated. We run kanban, we have Code Review column, and we regularly review those. Outside of our team, we now also have a network of people we know who to ask for help reviewing
  • Timo: 1. My team has a gerrit dashboard with several sections: what is assigned to me - in the gerrit sense - high priority stuff; repositories we maintain; patches from our team members in other repositories. 2. Personal practice: I get a notification for every patch merged in MW core and Vector. 3. I also have saved searches and filters in Gmail which move gerrit notification emails to dedicated mailbox folders.
  • DanielK: 1. I have a bookmarked tab with a Gerrit search. 2. A Phabricator board column. 3. I also bug James for review of things I want reviewed
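The "permanently open tabs with saved searches" and team dashboards described above are just Gerrit change queries. As a minimal sketch, the following builds query URLs against Gerrit's REST change-search endpoint (the endpoint and the `is:open`/`owner:`/`reviewer:`/`project:` operators are real Gerrit features; the particular search names and the `mediawiki/core` example are illustrative, not taken from the notes):

```python
import json
import urllib.parse
import urllib.request

GERRIT = "https://gerrit.wikimedia.org/r"

# Example saved searches, in the spirit of the tabs described above.
# The names and the chosen project are illustrative placeholders.
SEARCHES = {
    "needs my review": "is:open reviewer:self -owner:self",
    "my open patches": "is:open owner:self",
    "repo I care about": "project:mediawiki/core is:open",
}

def search_url(query: str, limit: int = 25) -> str:
    """Build a Gerrit REST URL for a change search query."""
    return f"{GERRIT}/changes/?q={urllib.parse.quote(query)}&n={limit}"

def fetch_changes(query: str):
    """Fetch matching changes; Gerrit prefixes its JSON with )]}' as XSSI protection."""
    with urllib.request.urlopen(search_url(query)) as resp:
        body = resp.read().decode("utf-8")
    return json.loads(body.removeprefix(")]}'"))

if __name__ == "__main__":
    for name, query in SEARCHES.items():
        print(f"{name}: {search_url(query)}")
```

The same query strings can be pasted directly into the Gerrit search box or wired into a custom dashboard URL, which is roughly what the team dashboards mentioned above do.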

Team presentations

  • Stage right
    • Talked about ???, mixed routines and tools - listed what people use. Inbox on gerrit notifications. Dashboards. Some efforts to make dashboard effective. Others are raising awareness during meetings - standups daily or weekly.
    • Some people don't react to inbox or dashboard so you have to ping on IRC directly to get review.
    • Assigned field in gerrit is underused - no shared understanding of how/when to use it.
    • Auto-reviewers added by bot were not considered very useful, often people who are no longer around would be added
    • Challenges:
      • Finding the right people to add to review is a challenge
      • No incentive structure to spend time on reviews - time away from OKRs and major work
      • You get noise for patches that you don't need to look at
  • Stage left
    • Many things already said
    • Getting assigned automatically, also using the Assign field in Gerrit
    • Review column on Phabricator boards
    • poking people
    • Some people use the gerrit dashboard
    • email notifications from gerrit, some folks have some filters/saved gmail searches
    • Mentioning patches in standup
    • Open browser tabs with searches in gerrit - keeping it open and looking at it
    • We didn't get to the successes and challenges part of the exercise

JR: How many of these are individual vs. team?

  • Mostly individual.
  • JF: Most teams will have individuals who do a lot of code review (leads, etc.)

Second breakout - Effective code reviews

  • what are some of the key challenges that are faced in managing the code review workload?
  • What do effective code reviews look like?
  • How do we measure success (what are the metrics?)

Also consider: are you getting from the code review what you were expecting to get from it?

Breakout team stage left: (Timo, Dan, Sam, James, Fisch, Florian, Daniel)

  • Key challenges:
    • Florian: one of the key challenges for a volunteer is finding someone to review their patch; it is pretty difficult, no matter if it's MW core or extensions. It seems to me it is getting worse. It was already hard 3 years ago; teams only take care of their own code changes.
    • Timo: This is also a problem within the WMF. When I submit a patch to CentralNotice, how do I know when it is going to be picked up, as processes differ? Other teams' workflows should be discoverable, ideally at the Gerrit level (it should be in the README of the code the team owns). Another example: CPT only looks at changes that have phab tasks attached
  • What do effective code reviews look like?
    • Florian: The optimal outcome is a +2, or otherwise some guidance on what needs to be changed. What is not helpful is a +1 or -1 without a comment.
    • James: A review without a vote is also not useful. If you give a code review, you should feel responsible for passing further review on to the respective person.
    • Daniel: There is some social challenge with giving -1 comments.
    • Daniel: Saying no without a way forward is also not helpful
    • Sam: Would it be useful to define some "workflow of code review", e.g. what to do after providing a drive-by review without merging the change
    • Timo: when I see a commit that I cannot improve in some way, I explain why it cannot move forward. Other people do not, because they don't have a roadmap to which they could refer, saying "I cannot see it on the roadmap"
    • Florian: so does it happen that you see code which is fine, but you wouldn't merge it because it is not on the roadmap?
    • Others: yeah, that might be a thing we don't even want to have
    • Grant: There could even be things that go against the roadmap

Breakout team stage right

  • Challenges:
    • Very first patchset you do gets reviews immediately, then you fix some nit or other minor issue and it's radio silence. You fix the -1 and then no more response.
    • When the patch runs out of the easy bikesheddable issues
    • "Should this be done at all", "is the architectural approach right" - these are hard questions
    • Context-switching is hard unless you're doing it in a short time - a closed loop works, but it's hard across timezones and the like
    • Ideally code review should be fast
    • Smaller patches tend to be more effectively reviewed
    • Lots of newcomers see it the opposite way and think they have to do one perfect patch - but really it should be incremental
    • Article Mukunda sent around:
      • Semantic perfection immediately is a great way to kill off a volunteer initiative
      • "You should merge the patch once it demonstrably makes things better"
      • Perfect is the enemy of the good.
      • There are differences in what people want here
      • Can be helpful to tag commentary - [nitpick], [comment], etc.
      • There can be social difficulties if you change someone else's patchset
  • Metrics
    • Speed to merge, if the ideal code review is fast...
    • Time to first comment? First response / etc.
    • User satisfaction

Team presentations

  • Stage left
    • It is hard to find reviewers for both volunteers and WMF staff members, because it is not clear who maintains the part of the software in question
    • Antipatterns - comment lists, encouraging but meaningless +0s
    • General stonewalling - just no without helpful guidance.
    • Include guidance with your criticism! Help to move patches forward.
    • A workflow for code review instead of just casually leaving it there
      • If your +1 is sticking around, try to move it forward.
    • Integration with the product roadmap (thinking of the bigger picture: what does it mean if patch X gets deployed)
  • Stage right
    • Effective code review: Follow up. Trivial fixes shouldn't be left hanging.
    • Effective code review: a short loop to improve the patch and get it merged
      • Can be difficult across timezones, etc.
    • Should we do this thing at all?
    • small patches get reviewed easier than bigger ones
    • avoid perfectionism, accept that this might be the best we could get
    • ???
    • Metrics
      • Time to first comment
      • Time to have code review completed
      • Satisfaction - "how was security today?"
      • measure which people from which teams are involved in code review
      • tracking bugs caused by patches, to be able to identify code reviews that didn't work and understand what was missing there to make it a good code review
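The metrics proposed above (time to first comment, time to completed review) are straightforward to compute once change timestamps are available. A minimal sketch, assuming hypothetical per-change records with `created`, `first_comment`, and `merged` timestamps (in practice these would come from Gerrit change messages; the field names here are invented for illustration):

```python
from datetime import datetime

def parse(ts: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM' timestamp (illustrative format)."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")

def review_metrics(changes):
    """Return mean hours to first comment and to merge across a set of changes."""
    to_first, to_merge = [], []
    for c in changes:
        created = parse(c["created"])
        if c.get("first_comment"):
            to_first.append((parse(c["first_comment"]) - created).total_seconds() / 3600)
        if c.get("merged"):
            to_merge.append((parse(c["merged"]) - created).total_seconds() / 3600)
    mean = lambda xs: sum(xs) / len(xs) if xs else None
    return {
        "mean_hours_to_first_comment": mean(to_first),
        "mean_hours_to_merge": mean(to_merge),
    }

# Two toy changes: one merged a day after creation, one still awaiting merge.
changes = [
    {"created": "2019-11-01 09:00", "first_comment": "2019-11-01 11:00",
     "merged": "2019-11-02 09:00"},
    {"created": "2019-11-03 10:00", "first_comment": "2019-11-03 16:00"},
]
print(review_metrics(changes))
# → {'mean_hours_to_first_comment': 4.0, 'mean_hours_to_merge': 24.0}
```

Means are a starting point only; medians or percentiles would be less sensitive to the long-tail patches discussed in the breakouts.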

Next steps:

  • The outcome of this session is going to be reviewed and acted upon by the Code Review Working Group.
    • ACTION: Join the Code Review Workgroup!
    • Problem scope of the Working Group:
      • Prioritization/Resourcing
      • Tools/Processes
      • Merge Confidence
    • Sub groups:
      • Code review management tools and process
      • code review metrics (there is a document by James) <-- TODO: link it here
      • code review office hours reboot
  • drop ideas to Andre/JR or to the phabricator tasks of this session

Event Timeline

kostajh added a subscriber: kostajh.

I'd be interested to hear specific feedback about how the codehealth pipeline assists (or doesn't) with code review. But there are potentially many directions this session could go, and I'm fine if that doesn't end up as one of the discussion points.

FWIW I love it. I have caught a few bugs in my patches before a code reviewer even saw my patch. It has also incentivised me to write more tests thanks to nice code coverage (not necessarily good ones, but better than no tests) and those tests have uncovered more bugs. -> Fewer round trips in actual code review.

I will be more than happy to assist for the code review principles and experience as well as how code health tools assist. Writing good reviews requires some good guidelines and principles.

I'm happy to help with this session when it comes to code review aspects.

Depending on the amount of time we have for this session, I suggest that we limit its scope to Code Review vs the broader Code Health topic. If we have enough time, I'd be happy to provide an overview of the other Code Health Group activities in play.

Happy to co-lead this session with Andre.

I'd be interested in code health and it would be lovely if we could see about getting some statistics from the past year for instance, see if we can also find some empirical evidence next to the anecdotal evidence about progress made etc.

Depending on the amount of time we have for this session, I suggest that we limit its scope to Code Review vs the broader Code Health topic.

+1. As far as I can see code health work is going well (re: @kostajh I would echo Niklas: I only wrote a handful of patches that went through code health checks, and it already helped me catch mistakes; and the UI is much saner than our traditional CI), while code review is fundamentally broken and we aren't really sure what to do about it.

Hey all, thanks for all the comments and support! @Jrbranaa, @Aklapper please go ahead and re-assign the task to one of you :-)

debt triaged this task as Medium priority. Oct 22 2019, 7:08 PM
debt updated the task description.

(Programming note)

This session was accepted and will be scheduled.

Notes to the session leader

  • Please continue to scope this session and post the session's goals and main questions into the task description.
    • If your topic is too big for one session, work with your Program Committee contact to break it down even further.
    • Session descriptions need to be completely finalized by November 1, 2019.
  • Please build your session collaboratively!
    • You should consider breakout groups with report-backs, using posters / post-its to visualize thoughts and themes, or any other collaborative meeting method you like.
    • If you need to have any large group discussions they must be planned out, specific, and focused.
    • A brief summary of your session format will need to go in the associated Phabricator task.
    • Some ideas from the old WMF Team Practices Group.
  • If you have any pre-session suggested reading or any specific ideas that you would like your attendees to think about in advance of your session, please state that explicitly in your session’s task.
    • Please put this at the top of your Phabricator task under the label “Pre-reading for all Participants.”

Notes to those interested in attending this session

(or those wanting to engage before the event because they are not attending)

  • If the session leader is asking for feedback, please engage!
  • Please do any pre-session reading that the leader would like you to do.

Just noting that there are probably some useful ideas on what could be done in the field of code review in the action items from T234662: Wikimedia Technical Conference 2019 Session: Integrating contributions from other teams or volunteers.

Also promoting the unconference idea of implementing some of the things at the conference T234662#5661807.

First breakout session:

breakout01.jpg (1×3 px, 1 MB)

Second breakout session:
breakout02.jpg (2×3 px, 1 MB)

Thanks for making this a good session at TechConf this year. Follow-up actions are recorded in a central planning spreadsheet (owned by me) and I'll begin farming them out to responsible parties in January 2020.