
Investigate workflow to identify recently active new volunteer technical contributors in Gerrit (to potentially invite them to upcoming Hackathons)
Open, Low, Public

Description

  • Need to define a threshold (number of patches?) / time frame, to avoid covering short-term drive-by involvement.
  • Need to check what kind of queries https://wikimedia.biterg.io could offer.

Event Timeline

Aklapper created this task. Mar 12 2018, 2:47 PM
Aklapper triaged this task as Low priority.
Aklapper renamed this task from Investigate workflow to identify recently active new volunteer technical contributors (to potentially invite them to upcoming Hackathons) to Investigate workflow to identify recently active new volunteer technical contributors in Gerrit (to potentially invite them to upcoming Hackathons). Mar 21 2018, 9:58 PM
Aklapper moved this task from Backlog to March on the Developer-Advocacy (Jan-Mar-2018) board.
Aklapper added a comment. Edited Mar 21 2018, 10:20 PM

Assuming that the deadline for Hackathon scholarships is likely about 3 months before the event takes place and that gathering the author names should take place about 4 months before the event, I deliberately chose newcomers who first became active 12 to 8 months ago with 5 or more contributions in that timeframe, and then checked whether they also made 5 or more contributions within the 8 to 4 months before the event. Feel free to change these numbers (FYI, applying these very numbers and dates for the Barcelona May 2018 Hackathon, I get six human names).
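
To make the two time frames concrete, here is a minimal sketch in Python that derives them from an event date. It assumes that all offsets are counted back from the event date (one reading of the numbers above); the function names and the example date are hypothetical and only serve as illustration.

```python
import calendar
from datetime import date


def months_before(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`.

    The day of month is clamped, so 31 March minus one month
    becomes the last day of February.
    """
    index = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def scholarship_windows(event: date) -> dict:
    """Compute the two query windows, counted back from the event date."""
    return {
        # Newcomers must have 5 or more contributions in this window.
        "newcomer_window": (months_before(event, 12), months_before(event, 8)),
        # The same people must have 5 or more contributions here again.
        "retention_window": (months_before(event, 8), months_before(event, 4)),
    }


# Hypothetical event date, used only as an example.
event_day = date(2018, 5, 18)
for name, (start, end) in scholarship_windows(event_day).items():
    print(f"{name}: {start.isoformat()} to {end.isoformat()}")
```

The resulting start and end dates are what would go into the "From" and "To" fields of the absolute time picker.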

The steps would be (skip the bullet points and scroll down if you don't want to perform these steps yourself):

  • go to https://wikimedia.biterg.io/app/kibana#/dashboard/C_Gerrit_Demo
  • in the upper right corner, click the time frame
  • click "Absolute" on the left
  • for "From", enter a date ~12 months ago
  • for "To", enter a date ~8 months ago
  • Click the "go" button
  • in the "New Authors per Organization" pie chart, click the "Independent" part to apply a filter that displays only independent authors
  • in the "New Authors" widget, click the "Reviews (started)" column header on the right, to sort the list by number of patchsets (highest number first)
  • in the "New Authors" widget, get all names in the "Author" column which have 5 or more 'reviews'/'changesets': Either by pressing the Ctrl key in Firefox and marking the column with author names, or by installing some add-on in Chromium (untested), or by exporting to CSV via the link at the bottom of that widget, then removing all the other (non 'author name') columns and the rows in which the number of patchsets is less than 5.
  • construct a query string from those author names; format is author_name:"ABC" OR author_name:"XYZ"
  • copy that search string into your computer's clipboard
  • go to https://wikimedia.biterg.io/app/kibana#/dashboard/Gerrit
  • in the upper right corner, click the time frame
  • click "Absolute" on the left
  • for "From", enter a date ~8 months ago
  • for "To", enter a date ~4 months ago
  • Click the "go" button
  • in the "New Authors per Organization" pie chart, click the "Independent" part to apply a filter that displays only independent authors. Note: Might be a different color than before.
  • paste that search string from your computer's clipboard into the text search field on top by replacing the * and press Enter
  • in the "Submitters" widget, click the column header "# Changesets" if the names are not already sorted by that number
  • Look at the names in the "Submitter" column which have again 5 or more 'changesets'/'reviews'.
  • Get the email addresses of those users either by checking the database behind wikimedia.biterg.io (if you have access; if not, check who has), or by entering their names in the search field on https://gerrit.wikimedia.org/ and waiting for the autocomplete / search proposals, which include the email address
  • Contact those people via e-mail (?), explain why, make them aware of the event and the scholarship, why it could be interesting for them. With links to more info.

This is the simplest approach I can come up with that can be performed by anyone. It is 'simplest' because it has a known flaw: it misses someone whose activity straddles the '8 months' threshold, with fewer than 5 contributions on each side of it (for example, 3 patches shortly before and 3 shortly after the threshold). That sounds negligible.

Nota bene, current bugs: the results do not exclude bot accounts until these indices are merged, and some authors might be shown twice.

@Aklapper Looks good! If this will be used by other folks (such as event organizers, scholarship committee members, etc.), a few minor instruction steps could be added:

  • In the context of CSV, how to format it, sort the data by the number of patchsets, remove the rows in which the no of patchsets is less than 5, etc.
  • Perhaps, a quick way (e.g. via shell script command) to construct query string with "author_name."

Also, where are these instructions ultimately going to live? IMO, somewhere on /Hackathons/Handbook for organizers to be able to find/follow this process easily...

  • In the context of CSV, how to format it, sort the data by the number of patchsets, remove the rows in which the no of patchsets is less than 5, etc.

Added by editing my comment above.

  • Perhaps, a quick way (e.g. via shell script command) to construct query string with "author_name".

Line-endings differ on systems and given the variety of systems people use, plus potentially fiddling with escaping the " character, I'd leave that to someone else.
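
For whoever picks this up, a rough sketch of such a helper in Python rather than shell, which sidesteps most of the line-ending trouble. The function name is hypothetical, and escaping `"` with a backslash assumes that the search field accepts Lucene-style escapes; that has not been verified against wikimedia.biterg.io.

```python
def build_author_query(names):
    """Join author names into: author_name:"A" OR author_name:"B"."""
    clauses = []
    seen = set()
    for raw in names:
        # strip() also removes a stray "\r" left over from Windows line endings.
        name = raw.strip()
        # Skip empty lines and authors that are listed twice.
        if not name or name in seen:
            continue
        seen.add(name)
        # Assumption: backslash escaping, as in Lucene query syntax.
        escaped = name.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'author_name:"{escaped}"')
    return " OR ".join(clauses)


# Placeholder names taken from the format example in the steps above.
print(build_author_query(["ABC", "XYZ\r\n", "", "ABC"]))
# prints: author_name:"ABC" OR author_name:"XYZ"
```

The names could come from a text file with one author per line, for example by passing the open file object directly to the function.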

Also, where are these instructions ultimately going to live?

I'd love to defer that to @Rfarrand who has a better overview (plus might have input why my idea does not make sense or such). :)

Aklapper reassigned this task from Aklapper to Rfarrand.

For the time being, assigning to @Rfarrand as there is currently nothing left to do here for me.

Erica mentioned that we have definitions for active editors (to compare with something, as my criteria are random). Looking for a valid metric is possibly something Research people could help with, if we do not feel comfortable with random numbers.

...and bd808 brought up the idea to potentially contact Programs for things that work which could maybe also work in the Tech space.

Rfarrand removed Rfarrand as the assignee of this task. May 24 2018, 11:10 PM

assign to @Aklapper and not me, correct?

@Rfarrand: I originally assigned this to you in T189496#4090851 because I wonder where this step should be documented (in the handbook?), to be performed before the scholarship application deadline of an upcoming Hackathon.

Rfarrand claimed this task. Jun 21 2018, 4:18 PM

Add a note: if this does not make any sense to you / you can't complete it, please let us (Andre) know and we can try to clarify / improve the documentation.

@Rfarrand: Maybe https://phabricator.wikimedia.org/T189496#4070990 and "feel free to contact https://www.mediawiki.org/wiki/User_talk:AKlapper_(WMF) or https://www.mediawiki.org/wiki/User_talk:SSethi_(WMF) if you'd like some help" could get linked from a place like https://www.mediawiki.org/wiki/Hackathons/Handbook/Manage_participants#Scholarships ? But I don't know who is 'responsible' for announcing scholarships a few months before events take place...

T189496#4070990 above is now outdated because in the meantime, T151161 and T184907 got fixed and some Kibana UI upgrades have taken place.

Updated steps as per late December 2018:

Assuming that the deadline for Hackathon scholarships is likely about 3-4 months before the event takes place and that gathering the author names should take place about 4 months before the event, I deliberately chose newcomers who first became active 14 to 8 months before the event with 5 or more contributions in that timeframe, and then checked whether they also made 5 or more contributions within the 8 to 4 months before the event.

The steps would be (skip the bullet points and scroll down if you don't want to perform these steps yourself):

  • go to https://wikimedia.biterg.io/
  • click "Community > Demographics" in the top bar
  • in the upper right corner, click the time frame
  • switch from "Relative" to "Absolute"
  • for "From", enter a date ~12 months before the event
  • for "To", enter a date ~8 months before the event
    • (note: the resulting list will include newcomers first active 14-8 months ago, but only lists newcomers who contributed at least one patch 12-8 months ago)
  • Click the "go" button
  • In the "Organizations" list widget, hover over the "Independent" entry and click the + magnifier icon to apply a filter that displays only independent authors
  • in the "Last Attracted Developers" list widget, click the "Contributions" column header, to sort the list by number of patchsets (highest number first)
  • in the "Last Attracted Developers" widget, get all names in the "Author" column which have 5 or more 'contributions': Either by pressing the Ctrl key in Firefox and marking the column with author names, or by installing some add-on in Chromium (untested), or by exporting to CSV via the link at the bottom of that widget and removing the rows in which the Contributions column is less than 5.
  • construct a query string from those author names; format is author_name:"ABC" OR author_name:"XYZ"
  • copy that search string into your computer's clipboard
  • click "Gerrit > Overview" in the top bar
  • in the upper right corner, click the time frame
  • switch from "Relative" to "Absolute" (if needed again)
  • for "From", enter a date ~8 months before the event
  • for "To", enter a date ~4 months before the event
  • Click the "go" button
  • paste that search string from your computer's clipboard into the text search field on top by replacing the * and press Enter
  • in the "Submitters" widget, click the column header "# Changesets" if the names are not already sorted by that number
  • Look at the names in the "Submitter" column which have again 5 or more 'changesets'/'reviews'.
  • Get the email addresses of those users either by checking the database behind wikimedia.biterg.io (if you have access; if not, check who has), or by entering their names in the search field on https://gerrit.wikimedia.org/ and waiting for the autocomplete / search proposals, which include the email address
  • Contact those people via e-mail (?), explain why, make them aware of the event and the scholarship, why it could be interesting for them. With links to more info.
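
If both lists are exported to CSV, the filtering and the comparison of the two time frames could be done roughly as follows. This is a sketch under assumptions: the CSV header names are guessed from the widget column titles ("Author" / "Contributions" and "Submitter" / "# Changesets"), the author names are assumed to be spelled identically in both exports, and the sample rows are invented placeholders.

```python
import csv
import io

THRESHOLD = 5  # "5 or more" contributions, as in the steps above


def authors_at_or_above(csv_text, name_column, count_column, threshold=THRESHOLD):
    """Return the set of names whose count is at least `threshold`."""
    keep = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Drop thousands separators such as "1,234" before converting.
        count = int(row[count_column].replace(",", ""))
        if count >= threshold:
            keep.add(row[name_column].strip())
    return keep


# Invented sample exports; real files would be read from disk instead.
newcomers_csv = "Author,Contributions\nExample A,12\nExample B,5\nExample C,4\n"
submitters_csv = "Submitter,# Changesets\nExample A,7\nExample B,2\nExample D,9\n"

newcomers = authors_at_or_above(newcomers_csv, "Author", "Contributions")
still_active = authors_at_or_above(submitters_csv, "Submitter", "# Changesets")

# People to contact: at least 5 contributions in both time frames.
to_invite = sorted(newcomers & still_active)
print(to_invite)
# prints: ['Example A']
```

Using a set intersection also takes care of authors that are listed twice in one of the exports.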