Page MenuHomePhabricator

Lessons learned from the first Gerrit Cleanup Day
Closed, ResolvedPublic

Description

This is a followup to T88531: Goal: Organize a Gerrit Cleanup Day on September 23, 2015 and is to be incorporated in T115844: Have a second Gerrit (Code Review backlog) cleanup day.

Regarding the Cleanup Day itself

Communication before the Day

  • ☑ Weekday: Chose Wednesday to not end up merging changes before branching / deployment train
  • Timing: End of quarter isn't great as teams are busy anyway. Also note that some teams do have high-priority time-sensitive tasks.
  • Agreeing on a Cleanup date and getting commitment seems to be easier in a WMF Engineering management meeting instead of sending out emails
  • TODO: Add a banner on Gerrit as [[ https://gerrit-review.googlesource.com/Documentation/config-themes.html#_html_header_footer | a theme with GerritSiteHeader.html ]]
  • ☑ Edited Phabricator default dashboard to display announcement on homepage
  • TODO: Request a banner on mediawiki.org
  • TODO: Add an event to Phabricator calendar
  • ☑ Sent early email to wikitech-l, pywikibot, ops, wikidata mailing lists
  • ☑ Sent email to internal WMF Engineering list
  • ☑ Added entry to internal WMF Engineering calendar
  • ☑ A week before, contacted teams (mailing lists and team leaders if existing) whether they are ready or not help (e.g. Pywikibot, Wikidata)
    • TODO: Some partially rather unused lists (e.g. performance@) were contacted as no better/clearer ones were found or existed
    • TODO: Must send earlier than one week: Ended up in some moderation queues without getting timely clearance; plus hard to find out names of some mailing list (e.g. fr-tech)
  • ☑ Two days before, sent reminder to wikitech-l
  • TODO: More widely announce, e.g. tech-ambassadors@? Where is our extended target audience?

Social aspects

  • Different levels of team preparation and enthusiasm. Examples: Discovery team planning page; Yuvi's internal ops@ email listing potential reviewers for patches; RelEng preparations.
  • Different understandings of expectation of joining among teams or individuals ("is it really mandatory to join?")
    • "I did some reviews, but basically our team considers itself too busy to participate. More accurately: When it was discussed, it was left up to each developer to decide themselves whether they want to participate. I wish we would have done more."
  • Basically no cross-team interaction seen in #wikimedia-dev (or it happened in other places?), everybody sticking to their areas.
    • TODO: List of main contacts (Name, IRC nick, IRC channel) to also include timezone info?
    • TODO: No formal buddy system was created - wikipage with section "I'm looking for a review-buddy"? Europeans (early timezone) with only few changesets: How can they help others and "find a buddy" in their timezone? How to improve involving volunteers, also those who realize on IRC that this is going on right now?
    • TODO: Praise: Stats to highlight the most active/effective developers / repositories / teams during the Day. More efficient communication structure welcome. No clear numbers on participation of WMF teams and who exactly was (not) involved. Asking "How is the Cleanup Day going? How can we help?" in many IRC channels is not the most efficient way, checking with each main contact (timezones!) is neither; following the Gerrit notifications is cumbersome.

General Code Review aspects / issues

Evaluation

  • Total number of changesets waiting for a reviewer action and Unknown affiliation: 628 (was 751 ~ -17%)
  • Total number of changesets waiting for a reviewer action and Independent affiliation: 124 (was 159 ~ -23,1%)
  • Total number of changesets waiting for a reviewer action, Unknown affiliation and open in the last three months: 261 (was 335 ~ -22%)
  • Total number of changesets waiting for a reviewer action, Independent affiliation and open in the last three monthts: 53 (was 71 ~ -25,4%

Additional statistics show a 10 % decrease in open patches:

Regarding the measurements of success, we have failed:

  • 👎 All WMF developer teams join the Day.
  • 👎 All patches contributed by volunteers and all patches from the past 3 months have at least one review.

Related Objects

StatusSubtypeAssignedTask
ResolvedQgil
ResolvedDicortazar
ResolvedAklapper
ResolvedAklapper
ResolvedAklapper
ResolvedDicortazar
ResolvedDicortazar
ResolvedAcs
ResolvedDicortazar
ResolvedDicortazar
ResolvedDicortazar
ResolvedDicortazar
ResolvedDicortazar
ResolvedDicortazar
InvalidDicortazar
ResolvedDicortazar
ResolvedDicortazar
ResolvedAklapper
ResolvedDicortazar
ResolvedDicortazar
DuplicateNone
ResolvedDicortazar

Event Timeline

Aklapper claimed this task.
Aklapper raised the priority of this task from to Medium.
Aklapper updated the task description. (Show Details)
Aklapper added a project: DevRel-October-2015.
Aklapper subscribed.

We should define some sort of cut off for just abandoning commits (over some date, even more so if CR-1/CR-2 or V-1). They're just clutter. They still exist, and if someone wanted to work on them, they can be easily restored.

Probably with a nice summary, "thanks for the commit, you can restore this at a later date if you plan on working on it" etc

For example mw core has 29 commits untouched since 2013...

You could argue that untouched patches over a year old are candidates for abandoning etc... Core has around 100 commits that are a year old

Maybe it would help to generate list of commits to reviews per project/authors. That might attract people to do the cleanup.

For WMF, maybe it can be a special day at the office with people gathering around tables to do the cleanup and not be disturbed with other duties.

For WMF, maybe it can be a special day at the office with people gathering around tables to do the cleanup and not be disturbed with other duties.

"The site is down! Tough! I'm doing Code Review and Gerrit Cleanup!"

Aklapper set Security to None.
Aklapper added a subscriber: Qgil.

I've dumped my thoughts and impressions above.
Comments and discussion welcome.

As I said elsewhere: The problem isn't "management", but "reviews". Nobody is helped by the assurance that their patch might (!) be reviewed at a Review Day in a few months time where there is social pressure to process as many patches as possible, so that in the case of a (probable) negative review they might (!) get another chance in another few months time.

I don't find the posting to wikitech-l at the moment, but IIRC the mobile team had a policy to review each contribution to "their" repositories in x days. IMHO that's the way to go, no organization or travelling and cookies needed, but a simple script that alerts "maintainers" (to be defined) every week that they have y patches waiting for review.

Nemo_bis removed a subscriber: scfc.

As a volunteer, I didn't really have much idea how to get involved, though this seems like a good way, potentially, to get all manner of folks from the community working together on the same things as well. I tried asking in #wikitech-l during the thing, but nobody who seemed to know anything was around and then I just got distracted and nothing came of it at all.

Something to maybe consider, I dunno.

Mass comments like "-1 needs manual rebase" are NOT helpful unless it was the intention of the reviewer to merge the patch. The -1 discourages other reviewers, without actually checking the substance of the patch. And we know that gerrit's "Can't merge" indicator is often broken, there are plenty of times when actual git rebases the patch just fine.

We should define some sort of cut off for just abandoning commits (over some date, even more so if CR-1/CR-2 or V-1). They're just clutter. They still exist, and if someone wanted to work on them, they can be easily restored.

Probably with a nice summary, "thanks for the commit, you can restore this at a later date if you plan on working on it" etc

For example mw core has 29 commits untouched since 2013...

You could argue that untouched patches over a year old are candidates for abandoning etc... Core has around 100 commits that are a year old

This is a standard practice in some open source projects, and I think it would be a good idea for us as well. We can start off incredibly generous, only abandoning patches that havn't been touched in a year. But other projects (such as HHVM) will close open patch sets with three weeks of inactivity. It's not like the patches are lost, abandon just means its not going anywhere. Anyone who wants to can restore the patch set.

This is a standard practice in some open source projects, and I think it would be a good idea for us as well. We can start off incredibly generous, only abandoning patches that havn't been touched in a year. But other projects (such as HHVM) will close open patch sets with three weeks of inactivity. It's not like the patches are lost, abandon just means its not going anywhere. Anyone who wants to can restore the patch set.

We had past discussion about it http://www.gossamer-threads.com/lists/wiki/wikitech/450845 . Maybe we can fill a sub task as a follow up?

First of all, let me say thank you to @Aklapper and all the people involved in this experiment. You made it! I think you made a difference indeed, and we will continue to work cleaning up our code review queues.

The main conclusion we need to make is whether it is worth organizing a next Gerrit Cleanup Day. I think so. We made a dent on 17%-25% of patches submitted (presumably) by volunteers that hadn't received any review. It's not the goal we had set (slightly blindly, but well) but it is a remarkable number. Thanks to this Day we created Gerrit queries that continue to be useful, and we raised the awareness on this problem. This and other related actions we are taking should help keeping healthier numbers, and hopefully should help also organizing a more successful second Gerrit Cleanup Day.

On promotion and following of the Day, do we have any idea about the participation of WMF teams? Did everybody know? Was everybody involved? Were there people that knew but didn't know how to help? Was someone ignoring this action, writing / committing code that was not urgent to deploy? Not to go after people, but to see what can we as organizers do better next time.

And the other way around, do we have a way to highlight and praise the most effective developers / repositories / projects / teams during the Day?

Was the activity counterproductive in any way, for instance promoting unproductive -1 reviews as @Legoktm points out? Can we prevent this?

Also following on @Isarra's comment, how can we improve involving volunteers, also those that might realize on IRC etc that this activity is going on right now?

We should also recover the comment made my Mark somewhere, that we should not organize the next Day so close to the end of the quarter, when most of us are running to match deadlines and working on extra documentation, a situation that is more rare during the rest of the quarter.

PS: fresh tasks T113706: Define Cut-off date for abandoning old changesets in our code review tool and T113707: Decision on Wikimedia Foundation service-level agreement on code review times.

There was a lot of churn in WMF codebases. Although one of the explicit goals of the cleanup day was *not* to hastily merge patches, but instead to offer good reviews, I think we should pay next attention to the next Mediawiki train deploy (week of Sep 28) to see whether this cleanup day set more wikis ablaze than usual.

If so, I'd suggest that we try to accommodate the cleanup day in the deploy schedule somehow next time. Perhaps skip the group2 deploy for the week following and let the code live on the group1 servers for an extra week to get hammered on, or something like that.

But if deploy goes swimmingly, then maybe this extra concern is misplaced. We shall see next week!

(Alternatively, if this becomes a more regular activity, one might hope that *not quite as many old patches* will need to be reviewed/merged. That would also reduce deploy risk.)

Aklapper raised the priority of this task from Medium to High.Sep 27 2015, 6:27 PM

@cscott the Gerrit clean up day has been taken in account by Release-Engineering-Team and extra care is taken for the deployment train that follow. We definitely want to keep it in mind for the next cleanup day for sure.

The Discovery department (finally) did a quick retrospective on gerrit cleanup day. I'm not sure how to integrate these notes into the categories in the main body of this phab ticket:

  • Not many pending patches for Discovery, so there wasn't much direct benefit to us
  • (Some developers) did good work in core; it was good for the org overall
  • It was unclear how to prioritize our team vs. backlog items in general
  • Still getting a couple emails a week on issues raised during cleanup day (which is a good thing)
  • (Some developers) learned about other parts of the codebase
  • Not much for data analysts to do (unclear whether more communication across departments would have helped)
  • Tangential takeaway: We should be interacting more with the community

There is a proposed WikiDev16 session about improving code review, especially for volunteers: T114419: Event on "Make code review not suck". You are welcome to comment there if interested.

Aklapper updated the task description. (Show Details)
Aklapper updated the task description. (Show Details)
Aklapper updated the task description. (Show Details)

Thank you everybody who provided feedback in this task.

My general impression is that it's worth to learn and repeat this experiment. Hence I created T115844 for a future Code Review Cleanup Day.

I've split this task's summary into the sections:

  • Regarding the Cleanup Day itself
  • General Code Review aspects / issues
  • Evaluation

and created followup tasks for all TODO items related to general Code Review aspects (see links above).

Closing as resolved.

Mass comments like "-1 needs manual rebase" are NOT helpful unless it was the intention of the reviewer to merge the patch. The -1 discourages other reviewers, without actually checking the substance of the patch. And we know that gerrit's "Can't merge" indicator is often broken, there are plenty of times when actual git rebases the patch just fine.

In the Wikidata team we moved to marking these changes as V-1.

I just learned about this bug from the Developer Relations Weekly Summary (thanks @Qgil) and I would like to say big thank you to everybody involved. Some of my oldest stuff got suddenly rebased by a group of reviewers, I was really surprised.

Frankly I thought this is some kind of a weekly cleanup action (a bit sad to hear it was a one-time event for now), but from my perspective the experience was very positive.