Page MenuHomePhabricator

Develop an easy to use workflow to monitor new developer patches that land in Gerrit
Closed, ResolvedPublic

Description

Background
Currently, there is no streamlined process for watching new developer patches that land in Gerrit. And, this is why it is cumbersome to offer help to new authors or reviewers who might need it in this area.
Developer Relations relies on wikimedia.biterg.io for obtaining names and dates of new authors who submitted their very first changeset into Gerrit. But, this as of now is blocked by a technical upstream constraint: T151161. Until late 2016, attempts have been made to send requests for code review to Wikitech-I, but then migrating from korma.wmflabs.org to wikimedia.biterg.io (both provided by Bitergia) affected this process.
We also have an Open changesets by newbie owner script, that provides direct links to Gerrit. But, the way it is right now, it is of little use to us. It is not regularly updated, does not provide an option to view patches within a specific date range, also lists those authors whose changesets were taken care by others later, or were submitted about years ago.

Possible solutions

  1. An easy and quick solution would be to improve the script or wait for the following issue to get resolved: T151161.
  2. A refined workflow could consist of two parts (which makes the job of DevRel's and others easier):
    • A tool (possibly leveraging mzmcbride/gerrit-reports) that would help:
      • view/monitor new volunteer patches (e.g. view patches within a specific date range)
      • maintain workflow for reviewing patches (e.g. able to mark patches considered as done)
      • view patch activity of new developers who participated in our outreach programs and events
    • Facilitation around the tool that could consist of:
      • Connecting code review seekers with reviewers
      • Directing new authors to follow Getting reviews guidelines
      • Directing reviewers to best code reviewing practices
      • Helping teams use the workflow to monitor patches submitted in their own area
      • Making more teams aware that Gerrit (cross-repository) dashboards exist and how to set up Gerrit dashboards
      • Understand how other teams have monitored patches in the past or what practices they have adapted to get better at code review so that we can share them with others more widely.
  3. <add your idea here>

See also

Related Objects

Event Timeline

With T187895, as we have access to seamless widgets that helps us view new authors changesets, looks like we are all set on the technical implementation side now to proceed with the facilitation part (as mentioned in the possible solutions) of this task's description.

@Nemo_bis wrote the wiki report in September 2013: https://github.com/nemobis/gerrit-reports/commit/93858c6c829e17dc030af001f057f04282879685#diff-47f89ceb825fcc72b0b09ddfbf9f99a4.

Looking at https://www.mediawiki.org/w/index.php?title=Gerrit/Reports/Open_changesets_by_newbie_owner&action=history indicates that the report is still being updated by the bot, at least. I have no idea what the quality of the data is.

I added you (@srishakatux) as a maintainer of the "gerrit-reports" Toolforge account via https://toolsadmin.wikimedia.org/tools/id/gerrit-reports in case you want to modify the script or look at its underlying data, which could easily be out-of-date or incorrect.

@MZMcBride Thanks a lot for sharing this information and for adding me to the account. I will look into if we feel the need to take the "modify the script" route to move forward on this task.

With T187895, as we have access to seamless widgets that helps us view new authors changesets, looks like we are all set on the technical implementation side now to proceed with the facilitation part (as mentioned in the possible solutions) of this task's description.

I'd say that remaining technical restrictions are fixing T151161 and not being able to have one list of all patches in Gerrit written by all newcomers who started contributing after day X (only a list of all patches per each single newcomer).

@srishakatux and I discussed potential approaches / ideas. The following comment touches both T186721 (to merge into this task?) and T188244, I'm afraid.

General thoughts:

  • I'd currently not investigate creating (and maintaining!) a separate tool (querying Gerrit API) from scratch. Maybe in the long run. Let's start with what's in place.
  • Obviously, any workflow should be "public" and not depend on a single person's account's access rights ("my Gerrit accounts 'My Review' dashboard") or mail inbox ("my notifications from Gerrit for those patches by new contributors which I subscribed to") etc.

Furthermore, there are three different aspects:

  • "This patch has not received any review / feedback at all" (CR0). Because appreciation that you contributed a patch.
  • "This patch needs improvement, follow up with the developer if they plan to improve it" (CR-1). Because we're interested in getting your patch merged.
  • "This patch needs someone with +2 rights to merge this patch" (CR+1). Because satisfaction that your patch got merged.

@Aklapper Keeping in mind that let's not create another tool and investigate solutions that not become a one person thingy, here is a thought (not new though, it has been mentioned elsewhere too T73357: Add a welcome bot to Gerrit for first time contributors):

I stumbled upon OpenStack’s tool called Jeepyb which is a collection of tools that make use of Gerrit for the OpenStack community easier. In this tool, they have a welcome_message.py script that adds a “Welcome! New contributor” as one of the reviewers to a patch in Gerrit and sends a welcome message too.

I think it helps in two ways: a) provides a solution to welcome newcomers to Gerrit and b) a process to query newcomer patches through the following query:

status:open reviewer: "Welcome, new contributor!"

Though, the name of the reviewer, IMO could be a more human-like bot and not “Welcome, new contributor.. ” and the message could also be a bit more elaborated and fun.

Do you think this would solve our problem in the long run?

I don't know if that would solve some problem (depends on the workflow - you would only be able to find the very first patch by someone if you query by that string), but it would be very welcome to fix T73357 anyway...

(For the records, I usually manually added such comments so far in the example extension in Gerrit, see https://gerrit.wikimedia.org/r/#/c/417889/)

To summarize my findings / opinions from the subtasks:

I don't know if that would solve some problem (depends on the workflow - you would only be able to find the very first patch by someone if you query by that string), but it would be very welcome to fix T73357 anyway...

A few more details: I am imagining here that writing a plugin or a script (which is hosted on Toolforge) that would allow us to query new contributor patches like how Libre Office or OpenStack does, might be helpful..

There are two implementation approaches that I see to this:

  • 1) Easy and quick workaround: Add a #first-time-contributor tag to the reviewers and new-contributor to first five patches submitted by a newcomer (like how OpenStack adds "Welcome! New Contributor!"). I also noticed that we have these interesting labels that let us perform queries such as status:open label:Code-Review<0 Verified>0. So, then I imagine that the query could be something like this:
status:open label:Code-Review<0 Verified>0 reviewer: first-time-contributor OR new-contributor
  • 2) Just like how LibreOffice has a way for querying newcomer patches, we could also consider creating a GROUP called “Newcomers” on the backed (not necessarily based on the access permission like LibreOffice do), but based on how many patches were submitted, how many were merged, and reviewed successfully. So, then the query could be:
status:open label:Code-Review<0 Verified>0 owner:newcomers

Implementation wise, I am assuming the second one might be a bit tedious, and we would have to write a plugin for this.

Note to self: other scripts related to Gerrit:

I am asking Paladox and Chad, if writing a patch or hosting a script on Toolforge should be the way to go about this.

Thanks Srishti for looking this up. It's convincing. The huge advantage of using Gerrit itself for such queries instead of wikimedia.biterg.io is that when can easily customize by querying for the labels (any CR-1 / V+2 etc).

I like the LibreOffice workflow a lot with a small number of queries on https://wiki.documentfoundation.org/Development/gerrit/CommonQueries
However we'd need a group on Gerrit (because that query on Gerrit uses -ownerin:Committers) and/or that welcoming bot.

(Openstack link that works for me: https://review.openstack.org/#/q/status:open+reviewer:%22Welcome%252C+new+contributor!+(10068)%22 )

Trying to summarize discussion:

Srishti to potentially come up with / document queries that could help us ( https://gerrit.wikimedia.org/r/#/q/ownerin:newcomers+is:open+-label:Code-Review%253C%253D-1+-label:Verified%253C%253D-1 ? etc)
Then to discuss a workflow: Us (who?) to check those queries regularly and add potential reviewers (https://www.mediawiki.org/wiki/Developers/Maintainers might help or not), and after doing so in Gerrit adding a comment that mentions that it is a new contributor; and/or resurrecting weekly manual emails to wikitech-l@ again which list patches that are awaiting review (@Aklapper can do that).

@Aklapper I've documented four queries here: https://www.mediawiki.org/wiki/Gerrit/Code_review#Finding_patches_to_review.

As a next first step, I wonder if the royal we (you and I) should start monitoring newcomer patches on a weekly basis and follow the facilitation process below to ensure that patches get reviewed:

  • Connect code review seekers with reviewers
  • Remind reviewers (by pinging them on IRC) to take care of newcomer patches unattended for a long time
  • Encourage new authors to follow Getting reviews guidelines
  • Encourage reviewers to follow best code reviewing practices

In this facilitation process, we share/document our observations (perhaps in a Google Doc), what are we lacking, what next steps we could take or encourage maintainers, developers, etc. take. An outcome of these observations could then be a blog post or improvements to our existing documentation as Bryan mentioned in our meeting?

@srishakatux: Question: For query 3 (Open changesets by new contributors successfully verified by Jenkins bot and awaiting reviewer feedback (V=1 && CR=0)), shall we change the condition to ownerin:newcomers status:open label:Verified>=1 label:Code-Review=0? V+1 means basic test build succeeded; V+2 means main test build succeeded. I don't see a need to differentiate.

We have four queries on https://www.mediawiki.org/w/index.php?title=Gerrit/Code_review&oldid=2801025 :
query #1: All open very first changesets by new contributors (V=* && CR=* && ∑(CS)=1)
query #2: All changesets by new contributors (V=* && CR=* && 1<=∑(CS)<=5)
query #3: Open changesets by new contributors successfully verified by Jenkins bot and awaiting reviewer feedback (V=1 && CR=0 && 1<=∑(CS)<=5)
query #4: Open changesets by new contributors not successfully verified by Jenkins bot (V<0 && CR=* && 1<=∑(CS)<=5)

To help my brain understand the subsets (Note to self: #2 has some changesets not in any subset, like V+1 && CR-1 && 2<=∑(CS)<=5):

bleh.png (400×500 px, 61 KB)

@srishakatux et al: My understanding / interpretation so far (someone tell me if I'm overcomplicating things):

  • For V<0 && CR=* (query #4)
    • we want the contributor to make their changeset pass Jenkins (means: reach V>0) before we want a human review. We add a comment asking if the contributor needs any help (including pointers where to ask outside of Gerrit?) and if the contributor plans to continue improving the changeset. This might not always work out as volunteers might lose interest; we have less "powers" here. (We could also add potential human reviewers based on https://www.mediawiki.org/wiki/Developers/Maintainers but I'd not do that to keep things simpler for the start.)
  • For V>0 && CR=0 (adjusted query #3 which also includes V=2)
    • we add potential human reviewers based on https://www.mediawiki.org/wiki/Developers/Maintainers (if we can identify stewardship for that repository) and then add a comment to request review from them for the changeset by this new contributor, to reach V>0 && CR!=0
    • we also send a weekly email to wikitech-l (past example) asking the community to please review the changeset, to reach V>0 && CR!=0
  • For V>0 && CR>0 (new query #5, while I fail to make Gerrit's query engine understand label:Code-Review=+1 to not list CR-1 changesets. Help!)
    • we add potential human reviewers based on https://www.mediawiki.org/wiki/Developers/Maintainers (if we can identify stewardship for that repository) and then add a comment to please make a final decision to either reach V>0 && CR>2 (merged), or to ask the contributor to make more changes (V>0 && CR<0).
    • we also send a weekly email to wikitech-l asking community members to decide on the changeset, to either reach V>0 && CR>2 (merged) or V>0 && CR<0 (needs more changes; see above).
  • For V>0 && CR<0 (new query #6)
    • we want to encourage the contributor to fix the remaining issues by adding a comment asking if the contributor needs any help (including pointers where to ask for more help outside of Gerrit?) and if the contributor plans to continue improving their changeset. This might not always work as volunteers can lose interest; we have less "powers" in that case and in theory we could ask others to chime in and improve their changeset but I'd not do that to keep things simpler for the start.
  • Conclusion: We ignore queries #1 and #2 for this workflow, and either use queries #3 to #6, or use some of the queries #3 to #6 to start small and expand.
  • Conclusion: We ignore queries #1 and #2 for this workflow, and either use queries #3 to #6, or use some of the queries #3 to #6 to start small and expand.

This is great workflow to start with!

So, then Query 3 becomes ownerin:newcomers status:open label:Verified>=1 label:Code-Review=0

Query 5 becomes ownerin:newcomers status:open label:Verified>1 label:Code-Review>1. Maybe it didn't work because we currently don't have newcomer patches with CR as +2?

A question and a concern - is the Gerrit reviewer bot not adding reviewers already to patches via the /Developers/Maintainers page? If so, then, maybe it is not adding those who aren't listed on the page. If this is the case, then should we not ping folks on IRC before adding them as a reviewer to a patch?

So, then Query 3 becomes ownerin:newcomers status:open label:Verified>=1 label:Code-Review=0

Correct.

Query 5 becomes ownerin:newcomers status:open label:Verified>1 label:Code-Review>1.

Small typo: Query 5 becomes ownerin:newcomers status:open label:Verified>1 label:Code-Review>0.

Maybe it didn't work because we currently don't have newcomer patches with CR as +2?

You won't see results for status:open label:Code-Review>1 often: CR+2 nearly always means merged. (Some repositories might be exceptions, don't know.)

A question and a concern - is the Gerrit reviewer bot not adding reviewers already to patches via the /Developers/Maintainers page?

The Gerrit reviewer bot and the Developers/Maintainers page have nothing to do with each other, as far as I know.
The documentation at https://www.mediawiki.org/wiki/Gerrit/Code_review#Finding_patches_to_review links to https://www.mediawiki.org/wiki/Git/Reviewers and does not mention https://www.mediawiki.org/wiki/Developers/Maintainers

If so, then, maybe it is not adding those who aren't listed on the page. If this is the case, then should we not ping folks on IRC before adding them as a reviewer to a patch?

I'm not sure I can follow. I might totally misunderstand, but the first sentence makes me think "If I do not know you then I cannot add you", the second sentence makes me think "Why would I ping folks about Gerrit on a horrible medium like IRC if I can ping folks about Gerrit in a slightly less horrible medium like Gerrit?" though I imagine that many people ignore all those noisy Gerrit spam notifications in general.

Answering my own help request about query #5:

  • For V>0 && CR>0 (new query #5, while I fail to make Gerrit's query engine understand label:Code-Review=+1 to not list CR-1 changesets. Help!)

Querying for ownerin:newcomers status:open label:Verified>=1 label:Code-Review>=+1 also lists a changeset with both a CR-1 and CR+1 vote. Instead, ownerin:newcomers status:open label:Verified>=1 label:Code-Review>=+1 -label:Code-Review<=0 works as expected.
This was the only of the queries listed in my comment T188244#4266947 that needed more adjustment.

If so, then, maybe it is not adding those who aren't listed on the page. If this is the case, then should we not ping folks on IRC before adding them as a reviewer to a patch?

I'm not sure I can follow. I might totally misunderstand, but the first sentence makes me think "If I do not know you then I cannot add you", the second sentence makes me think "Why would I ping folks about Gerrit on a horrible medium like IRC if I can ping folks about Gerrit in a slightly less horrible medium like Gerrit?" though I imagine that many people ignore all those noisy Gerrit spam notifications in general.

@Aklapper Without digging deep into the reviewer bot, still, my sense is that it automatically adds reviewers to the patches in Gerrit (perhaps pulls information from a list of reviewers on MediaWiki). If so, then why we need to manually add reviewers? If for some reason we do, should we not check with them before adding them to a patch?

my sense is that it automatically adds reviewers to the patches in Gerrit (perhaps pulls information from a list of reviewers on MediaWiki).

It pulls reviewer account info from the two sources that I listed https://phabricator.wikimedia.org/T188244#4269103

If so, then why we need to manually add reviewers? If for some reason we do, should we not check with them before adding them to a patch?

Wouldn't "Why do we add developers as subscribers to Phab tasks if developers are supposed to watch 'their' project tags, in theory?" be the same question? :P
We could check with them if you think that is a good use of time. Maybe you are luckier to get replies on IRC than I am.
Also see T196985: Find out which Git/Gerrit code repositories deployed on WMF sites have no reviewers/watchers in Gerrit?

First weekly message sent: https://lists.wikimedia.org/pipermail/wikitech-l/2018-June/090253.html
Happy to iterate but I don't see that in scope for this task. Anything else left to do here? Or close task as resolved?

As part of this task, we've developed a bot that helps track new developer patches, agreed on a workflow for reviewing patches, and begin to send email to Wikitech-I with a list of patches that need attention!