
A process to use available information about new and leaving technical contributors
Closed, DuplicatePublic

Description

Before we switched off korma.wmflabs.org in Feb 2017, its code_contrib_new_gone.html page listed

  • People with first submission in last three months
  • Contributors that have not contributed in the last 6 months

Traditionally information on "newbies" potentially in need of help has been available at https://www.mediawiki.org/wiki/Gerrit/Reports/Open_changesets_by_newbie_owner

I irregularly checked the first list and pinged on patches without review, but

  • we do not use the available information in a process incorporated into our routines.
  • except for anecdotes, we do not know or understand why people move on.

We should think about the following steps:

  • plan a process that allows more regular checking, to get the attention of reviewers for patches by newcomers,
  • potentially congratulate the contributor after merging the first patch and provide motivation and hints to try and find a second task to work on,
  • ask ourselves if we really want to wait six months before we become aware of users leaving, or whether to shorten that time frame,
  • ask users who leave / have left to give us quick feedback and help us improve, potentially via Surveys.

To sort out:

  • Sample selection? Contact everybody, or if not, how to select the sample?
  • Legal issues: Data use; where and how to publish results (anonymously?)
  • Questions and format (quantitative, qualitative)?
    • exploratory := no propositions, open questions, hypothesis building = problem not clearly defined
    • descriptive := survey, who, what, where, how many, how much.
    • explanatory := causal=experiment; case study=how, why

Context: https://meta.wikimedia.org/wiki/Technical_Collaboration/Strategy mentions "offering a learning environment and growth path to technical contributors".

As of June 2017, this is blocked (stalled) on having similar (Gerrit) data on https://wikimedia.biterg.io (see T151161).


See Also:
T73357: Add a welcome bot to Gerrit for first time contributors
T64324: Visually indicate when a Phabricator user is new (Welcome culture)
T155676: Make it easier for newcomers to contribute to the MediaWiki project


Event Timeline

Aklapper created this task.Jun 7 2016, 3:26 PM
Restricted Application added a subscriber: Zppix. · View Herald TranscriptJun 7 2016, 3:26 PM
Qgil added a subscriber: Qgil.Jun 8 2016, 12:13 PM

I think this is a very interesting task. Are there any dependencies, or is it a matter of finding the time to dedicate to it?

No real dependencies, only "how to make it more attractive and findable for established developers to give quicker attention to newcomer patches" (maybe dashboard panels in a future Differential world?) and potentially "setting up a survey" at some point. But we need to be clear beforehand about who would actually use any received data and how, and we need better research questions than "why do people leave?", so that we do not run surveys for the sake of running surveys.

Aklapper raised the priority of this task from Low to Normal.Jul 23 2016, 2:26 PM

(Note to myself: I'd love to discuss this approach with a future WMF Developer Advocate.)

Aklapper updated the task description. (Show Details)Aug 26 2016, 2:09 PM

Three and a half late night thoughts:

  • Regarding new contributors in Gerrit: Greet new technical contributors in Gerrit via T73357. Dependency on demon or mmodell to get that implemented and tested. Or I can continue to do that manually and become more consistent and make it a more regular task, but I hope to avoid that.
    • To verify whether this is successful I wonder whether to just compare numbers per month provided by Bitergia (panel availability in a post-korma Kibana world is T132421) or whether to have a control group (not receiving such greetings) to compare to. To avoid the control group (it would require the implementation to only add a greeting message for every second new contributor), I've pinged Asheesh on an email from three years ago about a similar idea for Debian (might not get an answer; we'll see).
  • Regarding new contributors in Gerrit waiting for review: A weekly email to wikitech-l@ and/or engineering@. No external dependencies on other folks. Manually for the start, using data in korma, to list patches by new contributors that have not received feedback within four days (or such), their code repository, and asking who could look at them. Obviously, also re-list items the next week that still miss feedback. More awareness, hopefully also more reviews?
    • (In the long-run, also "hall of shame" stats about code repositories most listed in those emails?)
    • I was initially unsure about also listing non-deployed extensions, but if they show up for several weeks in a row that just means they are inactive and at some point 'mw:Gerrit/Inactive projects' applies anyway.
  • Leaving contributors: Survey idea (see above).
    • Potential technical dependency: No idea whether automatically sending via a script after a certain time frame of Gerrit inactivity is possible and would need someone (who) to code that. Contacting manually might be quite cumbersome. :-/
    • Needs proper survey design = Dependency on someone regularly creating surveys (maybe egalvezwmf?). Regarding questions:
      • If qualitative, the obvious one is "Why did you stop contributing?" but wondering if more specific ones could create better answers ("Did you face obstacles contributing? Which?") ...
      • If quantitative, stuff like "On a scale from 1 to 5 with 5 being best, how would you rate the onboarding documentation?"?
      • Or both (Rate; then explain your answer via text).
    • Very unclear to me how to analyze answers, and (if quantitative) how to compare over time whether there is progress (doing the math manually? hmm.).
    • Hence currently not convinced at all that this is a good task for a survey. (I'm happy to list reasons why people stop contributing to FOSS projects, in general.)
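The weekly email idea in the second thought above could be sketched roughly as follows. This is only a minimal illustration, not an existing script: the `Change` record, its field names and the way "new contributor" is flagged are all assumptions made up for this example, and the data would still have to be pulled from korma, Gerrit or wikimedia.biterg.io by some other means. The four-day limit is the "(or such)" value mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Change:
    """Hypothetical record for one open change (not a real Gerrit API type)."""
    repo: str
    owner: str
    uploaded: datetime
    first_feedback: Optional[datetime]  # None if nobody has reviewed or commented yet
    owner_is_new: bool                  # assumption: flagged elsewhere, e.g. first submission in the last three months


def changes_needing_attention(changes, now, max_wait=timedelta(days=4)):
    """Return changes by new contributors that have waited longer than max_wait
    without any feedback, oldest first."""
    waiting = [
        c for c in changes
        if c.owner_is_new and c.first_feedback is None and now - c.uploaded > max_wait
    ]
    return sorted(waiting, key=lambda c: c.uploaded)


def format_weekly_email(changes, now):
    """Build the plain-text body of the weekly email."""
    lines = ["Patches by new contributors still waiting for a first review:"]
    for c in changes_needing_attention(changes, now):
        days = (now - c.uploaded).days
        lines.append(f"* {c.repo}: change by {c.owner}, waiting {days} days")
    return "\n".join(lines)
```

Because the selection is recomputed from the current data each week, changes that still have no feedback show up again in the next email without any extra bookkeeping.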

Regarding the six months time frame until listing gone contributors, I'd like to shorten that. Wondering if we have any "how many days are between two patchsets provided by one user" data and use the 90th percentile (or such) as a threshold value after how many days to contact inactive contributors, instead of listening to my guts. But I might make things too complicated.
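A rough sketch of that percentile idea, assuming we can export one (user, upload date) pair per patchset from somewhere; the input format and function names are made up for illustration, and the nearest-rank percentile is just one of several common definitions:

```python
import math
from collections import defaultdict
from datetime import date


def gaps_in_days(uploads):
    """uploads: iterable of (user, date) pairs, one per patchset.
    Returns the gaps in days between consecutive uploads of the same user."""
    by_user = defaultdict(list)
    for user, day in uploads:
        by_user[user].append(day)
    gaps = []
    for days in by_user.values():
        days.sort()
        gaps.extend((later - earlier).days for earlier, later in zip(days, days[1:]))
    return gaps


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of all values are less than or equal to it."""
    if not values:
        raise ValueError("no gaps to compute a percentile from")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def inactivity_threshold(uploads, pct=90):
    """Number of days of silence after which a contributor would be contacted."""
    return percentile(gaps_in_days(uploads), pct)
```

Users with only a single patchset contribute no gap at all, so this threshold only reflects people who came back at least once; whether that bias is acceptable would need to be decided separately.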

(And still wondering how to measure the outcome of this exercise. 🚬,💼,🚀.)

Very good ideas overall! I think we can start experimenting with the simple ones, and fine tune or cancel as we go.

As for receiving feedback from those leaving, I think it is an idea worth pursuing. An important aspect is to define who gets those requests for feedback. For instance, I would be very interested in answers from new developers who uploaded only one patch, or had a spark of activity and then left. Perhaps they were just scratching their own itch with a single bug they fixed (that could be one option in the survey, and that scenario would be totally fine) or perhaps something else happened that caused the departure, and that we could fix (patches ignored, frustrating reviewers, unclear documentation, messy code, more complex than initially imagined...)

Apart from the survey data itself, I think such request for feedback would have two other benefits:

  • those contributors would see that at least someone cared about them, and that in itself might be an encouragement to come back
  • those who reply would show us some interest in helping to improve the situation, and we could invite some of them to have a chat and learn more.

With some quantitative data coming from microsurveys that are easy to fill in and process, and some qualitative information coming from selected interviews, chances are that we could make big progress in identifying the problem we should focus on in order to improve newcomer retention.

By the way, do we or will we have a metric to measure retention of new contributors? Ultimately this would be the measurement to check whether these initiatives have an impact.

do we or will we have a metric to measure retention of new contributors? Ultimately this would be the measurement to check whether these initiatives have an impact.

We have one at http://korma.wmflabs.org/browser/demographics.html .
We do not have one (yet) on https://wikimedia.biterg.io/app/kibana#/dashboard/Git-Demographics (cf. T132421).

  1. T73357 - external. Need to nudge Mukunda a bit I guess. :P
  2. Weekly email to wikitech-l@ and/or engineering@: I'll start with that experiment.
  3. Surveys targeting people who have stopped providing technical contributions:

An important aspect is to define who gets those requests for feedback. For instance, I would be very interested in answers from new developers who uploaded only one patch, or had a spark of activity and then left.

I still wonder how to have that automated so it scales. Dragging data out of Gerrit or wikimedia.biterg.io/korma via some API and setting up some script somewhere, I guess.
@egalvezwmf: Can I inform myself on some wiki about available survey infrastructure?

Perhaps they were just scratching their own itch with a single bug they fixed (that could be one option in the survey, and that scenario would be totally fine) or perhaps something else happened that caused the departure, and that we could fix (patches ignored, frustrating reviewers, unclear documentation, messy code, more complex than initially imagined...)

There are general studies about reasons for departure, so we want to know which reasons commonly apply in case of Wikimedia.
This sounds like checkboxes in a survey, like

  • I was only interested in fixing the specific problems that I ran into myself, not in general contributions to Wikimedia projects.
  • I received no timely feedback.
  • The documentation on how to provide patches was not helpful.
    • Which pages or sections, if you remember? ______
  • The code base was complicated or unclear.
  • Other reasons:
    • Mention them here: _______

Result would be knowing where to apply manpower (if we had any available)?

  • those contributors would see that at least someone cared about them, and that in itself might be an encouragement to come back

Well, we need to fix the code review process in T129067. If they come back and their next patch in the same repo has the same review lag... Meh.

Qgil added a comment.Sep 12 2016, 8:43 AM
  3. Surveys targeting people who have stopped providing technical contributions:

An important aspect is to define who gets those requests for feedback. For instance, I would be very interested in answers from new developers who uploaded only one patch, or had a spark of activity and then left.

I still wonder how to have that automated so it scales. Dragging data out of Gerrit or wikimedia.biterg.io/korma via some API and setting up some script somewhere, I guess.

Well, yes, but I wouldn't mind starting with a manual process, even if that would not cover all the contributors. Something more qualitative than quantitative.

@egalvezwmf: Can I inform myself on some wiki about available survey infrastructure?

https://meta.wikimedia.org/wiki/Surveys/Learning

There are general studies about reasons for departure, so we want to know which reasons commonly apply in case of Wikimedia.
This sounds like checkboxes in a survey, like

  • I was only interested in fixing the specific problems that I ran into myself, not in general contributions to Wikimedia projects.
  • I received no timely feedback.
  • The documentation on how to provide patches was not helpful.
    • Which pages or sections, if you remember? ______
  • The code base was complicated or unclear.
  • Other reasons:
    • Mention them here: _______

Result would be knowing where to apply manpower (if we had any available)?

This looks like a very good start!

Aklapper raised the priority of this task from Normal to High.Sep 12 2016, 12:14 PM
  1. T73357 - external.

Legoktm offered his help.

  2. Weekly email to wikitech-l@ and/or engineering@: I'll start with that experiment.

First email sent to wikitech-l@, based on korma data.

Qgil added a comment.Sep 15 2016, 7:37 AM

Very good! What is missing to consider this individual goal completed?

Very good! What is missing to consider this individual goal completed?

  • Define survey content (questions; also see T137214#2623287)
  • Define the sample: who exactly to contact (exclude people who only added a single obvious Gerrit test commit? Only volunteer contributors?). Manually send such an email about every two weeks?
  • Define where to collect/publish anonymized answers (wiki page?)

(Having this stuff automatic can be a goal for a future quarter, as discussed.)

Contacting new and departed contributors would be exploratory and would not give us a representative sample.
We'd like to learn where our contributors come from, what motivates them, and how we could improve the learning curve and retention.

I see three groups:

  • new contributors
    • How did you find out about contributing to Wikimedia code?
    • Why did you contribute a patch, and how did you choose which issue to solve by writing a patch?
    • On your way to writing your patch and putting it for review into Wikimedia Gerrit, what was the hardest part? Do you have any recommendations how to make it easier?
    • Have you been in contact with other members of the Wikimedia community outside of Gerrit and Phabricator? If yes, where?
    • Do you plan to continue contributing?
  • left 'established' contributors who were active for a longer timeframe (6 months?) and have not been active for 6 months (or 3?)
    • I really don't know what else we'd like to find out except for "Why did you stop contributing?" See potential answers in T137214#2623287. Yes I've read T137214#2594021 :)
  • left 'drive-by' contributors who contributed fewer than 10 (?) patches within a short time frame (2 months?) and then became inactive again.
    • Questions are probably a mix between 'new contributors' and 'left established contributors'

General:

  • Email addresses of contributors are available via Gerrit, commit messages, Differential/Diffusion.
  • To sort out: Legal issues: Data use; where and how to publish results (anonymously?)
  • Every month, contact three random users per group via email?
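To make the three groups and the "three random users per group" idea a bit more concrete, here is a minimal sketch. All thresholds are the tentative values with question marks from above (plus the "first submission in last three months" definition from the task description), and the `Contributor` record with its fields is a made-up structure for this example, not something Gerrit or wikimedia.biterg.io provides in this form.

```python
import random
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Contributor:
    """Hypothetical per-contributor summary; field names are assumptions."""
    email: str
    first_patch: date
    last_patch: date
    patch_count: int


def classify(c, today):
    """Return 'new', 'left_established', 'left_drive_by', or None if no group applies."""
    active_span = c.last_patch - c.first_patch
    inactive_for = today - c.last_patch
    if today - c.first_patch <= timedelta(days=90):
        return "new"  # first submission within the last three months
    if inactive_for < timedelta(days=180):
        return None  # still active, or not inactive long enough to count as gone
    if active_span >= timedelta(days=180):
        return "left_established"
    if c.patch_count < 10 and active_span <= timedelta(days=60):
        return "left_drive_by"
    return None


def monthly_sample(contributors, today, per_group=3, rng=None):
    """Pick up to per_group random contributors from each group."""
    rng = rng or random.Random()
    groups = {"new": [], "left_established": [], "left_drive_by": []}
    for c in contributors:
        group = classify(c, today)
        if group is not None:
            groups[group].append(c)
    return {
        name: rng.sample(members, min(per_group, len(members)))
        for name, members in groups.items()
    }
```

Contributors who fall between the definitions (for example, 20 patches within one month and then silence) end up in no group here, which shows that the group boundaries above would still need to be sorted out.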

Note to myself: At GrimoireCon 2017, Ceph folks mentioned that they have a "Monthly Developer Survey" based on data (that we'd have in wikimedia.biterg.io ): You commit more than just drive-by stuff, so let's talk!

Aklapper updated the task description. (Show Details)Feb 13 2017, 1:51 PM
Aklapper changed the task status from Open to Stalled.EditedMar 21 2017, 12:21 PM

Since the move from korma.wmflabs.org to https://wikimedia.biterg.io, the "new contributors in Gerrit" part is effectively blocked on T151161.

(For completeness, "new contributors in Git" was covered in T151501 but has bugs like T157688).

Aklapper lowered the priority of this task from High to Low.Mar 27 2017, 11:01 AM
Aklapper changed the task status from Stalled to Open.Apr 18 2017, 4:52 PM
Aklapper raised the priority of this task from Low to Normal.
Aklapper moved this task from Backlog to May on the Developer-Advocacy (Apr-Jun 2017) board.
Aklapper changed the task status from Open to Stalled.Jun 1 2017, 1:25 PM
Aklapper moved this task from May to June on the Developer-Advocacy (Apr-Jun 2017) board.

Beta widget (see T151161) is not populated anymore, hence this is not actionable again; setting back to stalled status.

Aklapper updated the task description. (Show Details)Jun 9 2017, 11:33 AM

Just dropping a rough email draft here I had for contacting people via email, definitely needs improvement by someone who knows how to design proper surveys™:

Hi <username>!

<Who am I?>
I am contacting you as you have contributed code to Wikimedia projects in the past:
<Gerrit link to their contributions>

<List of questions we have to the contributor; to be defined>

<Explain how the data in the reply is used.>
Again thanks for your contributions and help!
<Link to more info?>
Nemo_bis updated the task description. (Show Details)Jun 11 2017, 7:09 AM
Zppix added a comment.Jun 11 2017, 8:15 PM

@Aklapper I think a more suitable template would be:

Hello, <Username or if provided First name>

I'm <your name/username>, and I'm contacting you today, since you have at one point contributed code to Wikimedia projects: <Link to contribs>

If you do not mind I would like to ask you some questions:

<questions go here>

<Explain data usage, with applicable policies regarding privacy>

Thank you for the contributions and help!

<More info if applicable>
Qgil added a comment.Sep 19 2017, 9:23 AM

Since @srishakatux is contacting new developers with surveys, I think it makes sense that she also contacts potentially lost developers for surveys. We are not going to work on this during this month (our first survey for new developers and our first quarterly report are more than enough), but I think we should commit to completing this task during Developer-Advocacy (Oct-Dec 2017).

Qgil added a comment.Oct 4 2017, 9:27 AM

We are working on a New Developers quarterly report (T167085). In this report we analyze metrics and surveys about new and probably gone developers. I think this task could be merged into T167085. If there is anything in the discussion of this task worth keeping separate, we should create a task for it.

What do you think?

Merging this task T137214 (which I never got into enough) into T176481, as T137214 is about both T176483 [left] and T167085 [joined], which are better covered by Srishti's work and plans.

I admit that T176483 [left] targets new developers rather than developers who left in general, but the topic is 'similar enough', apart from situations already covered in research anyway (left university, got a job, got a different job, got children, changed interests, etc).