Analyze retention rate of Junior Contributors using talk page features
Closed, ResolvedPublic

Description

We are curious to know what the 30-day retention rate of Junior Contributors using talk page features is. [1]

This information will be helpful for us in establishing a proper growth target for one of the key results for the project: Increase in the retention of junior contributors participating on talk pages by 5%. [2]

Done

  • A single graph that shows the 30-day Junior Contributor talk page retention rate for multiple cohorts [3]
  • A recommendation for the % increase in the 30-day Junior Contributor talk page retention rate we should target.

Notes
Precise definitions for terms like "Junior Contributors" and "retention" can be found here: https://www.mediawiki.org/wiki/Talk_pages_project/Glossary


  1. In T234046, we defined retention as: contributors who come back to make an edit in any one of Wikipedia's 16 talk page namespaces within the 30 days that follow the "cool down" period. We defined the "cool down" period as the 24 hours that follow a contributor's first edit during the study period, with that first edit as the starting point. This metric definition, as well as others, now lives here: Talk pages project glossary
  2. Our current target of a 5% increase over the lifespan of the project is a placeholder.
  3. "Multiple cohorts": to account for potential fluctuations in this metric, I thought it might be helpful to know the retention rate for multiple cohorts. There is likely a better approach to this, but I figured I'd enter it as a suggestion.
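
To make the definition in note 1 concrete, here is a minimal Python sketch of the retention check for a single contributor. It is not the query used for this task. The function name `is_retained`, the input shape (a list of talk page edit timestamps), and the exact boundary handling (window opens strictly after the cool-down ends and closes 30 days later) are assumptions made for illustration.

```python
from datetime import datetime, timedelta

COOL_DOWN = timedelta(hours=24)        # "cool down" period from the task description
RETENTION_WINDOW = timedelta(days=30)  # the 30 days that follow the cool-down


def is_retained(edit_times):
    """Return True if any talk page edit falls in the 30 days after the cool-down.

    edit_times: datetimes of one contributor's talk page edits during the
    study period (hypothetical input shape, assumed for this sketch).
    """
    if not edit_times:
        return False
    first_edit = min(edit_times)
    window_start = first_edit + COOL_DOWN
    window_end = window_start + RETENTION_WINDOW
    # Boundary handling is an assumption: edits made during the cool-down
    # (including the first edit itself) do not count as a return.
    return any(window_start < t <= window_end for t in edit_times)
```

A contributor whose only other edit lands two hours after the first one would not count as retained under this reading, because that edit falls inside the cool-down.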

Event Timeline

ppelberg renamed this task from Analyze retention rate of "junior contributors" using talk page features to Analyze retention rate of Junior Contributors using talk page features.Nov 14 2019, 1:14 AM
ppelberg reassigned this task from ppelberg to MNeisler.
ppelberg updated the task description.

Here's the current year over year breakdown. This shows the 30-day Junior Contributor talk page retention for each month over the past three years.

junior_contributor_yoy_retention.png (1×1 px, 121 KB)

The retention rate has been increasing over the past three years. There was a significant jump from 2017 to 2018 and only slight increases between 2018 and 2019.

It might also be interesting to look into breakdowns by different types of user experience and by wiki, as well as the difference between short-term and long-term retention rates. For example, we could compare how many junior contributors return after 1 week with how many return after 1 month.

@ppelberg

I found a bug in my query that was double-counting some of the talk page contributors for each month. Below is the updated chart (please disregard the one posted above).

talk_contributor_retention_30day_yoy_plot.png (1×1 px, 98 KB)

Key initial observations:

  • The 30-day retention rate for Junior Contributors over the past three years has been fairly consistent with retention rates ranging from 19% to 24% each month.
  • There was an overall increase in retention rate between 2017 and 2018, while the retention rate between 2018 and 2019 has been fairly consistent, with a few year-over-year increases and decreases. There was a year-over-year increase from May to August this year, with increases ranging from around 1% to 3%; in September 2019, there was a 4% decline.
  • As discussed today, this data conflates short-term retention and long-term retention levels because it includes both contributors who return within a couple days and those who return at day 30. It might be valuable to look at both short term and longer term retention to see how they differ.
LGoto triaged this task as Medium priority.Nov 26 2019, 5:52 PM

I updated the analysis to differentiate between short and long-term retention rates for junior talk page contributors. The below charts show the year over year trends from October 2016 through September 2019.

I broke down the 30-day period into a short-term retention period (% who made another talk page edit between 2 and 7 days after making their first edit during the study period) and long-term (% who made another talk page edit between 8 and 30 days after their first edit).
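
As an illustration of this split, the sketch below assigns a contributor's return edits to the short-term (days 2 to 7) and long-term (days 8 to 30) periods. It is a minimal sketch, not the analysis code. The helper names and the day-numbering convention (day 1 is the first 24 hours after the first edit) are assumptions.

```python
from datetime import datetime, timedelta


def day_number(first_edit, later_edit):
    """1-based day index; day 1 covers the first 24 hours after the first edit.

    This numbering convention is an assumption made for the sketch.
    """
    return (later_edit - first_edit).days + 1


def retention_buckets(first_edit, later_edits):
    """Return the set of periods ('short', 'long') in which the contributor returned."""
    buckets = set()
    for t in later_edits:
        d = day_number(first_edit, t)
        if 2 <= d <= 7:
            buckets.add("short")   # days 2-7 after the first edit
        elif 8 <= d <= 30:
            buckets.add("long")    # days 8-30 after the first edit
    return buckets
```

Under this reading a contributor can fall into both periods, so the two rates need not add up to the overall 30-day rate.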

About 11-12% of unique junior talk contributors each month return within 7 days after their first edit. There is very little fluctuation in the number of users that return within 7 days after making an edit to a talk page, indicating that the users who return quickly to edit are not as impacted by seasonal trends we see in the longer-term retention rates.

talk_contributor_first_week_retention_yoy.png (1×1 px, 78 KB)

In contrast, a slightly higher percentage (about 13% to 18%) of unique junior talk page contributors each month return to make an edit 8 to 30 days after their first edit. This indicates that more users are likely to wait a week or so after making a talk page edit. These retention rates show seasonal fluctuations similar to those seen in the 2 to 30 day retention period.

talk_contributor_8_30_day_retention_yoy.png (1×1 px, 96 KB)

Based on year over year trends seen for these retention rates (which have ranged from -4% to +3%), I would recommend targeting at least a 5-7% year over year increase in retention rate.

@ppelberg - Let me know if you have any questions.

Analysis Repo

I made some slight changes to the short and long-term retention charts to align more closely with the recommendations of the Product Analytics team's user retention framework. While the framework is still in development, I wanted to be consistent with the proposed retention definition and use cases. There are only minimal changes to the results and recommendations I posted above, in case you've already started to review.

The current framework recommends setting a retention period equal in length to the cohort period (when the action occurs). For example, second-week editor retention would be defined as an editor who returns between 8 and 14 days after their first edit.

Based on this recommendation, I revised the short-term period to second-week retention (Day 8 to Day 14) and the long-term period to the last two weeks of the month (Day 15 to Day 30).
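
A rough sketch of how a monthly cohort rate could be computed for either revised window follows. It is not the query from the analysis repo. The function name `monthly_retention`, the input shape (a mapping from user to talk page edit timestamps), assigning each user to the month of their first edit, and the 1-based day numbering are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def monthly_retention(edits_by_user, start_day, end_day):
    """Map 'YYYY-MM' of first edit -> share of that cohort returning in [start_day, end_day].

    edits_by_user: dict of user id -> list of talk page edit datetimes
    (hypothetical input shape). Days are 1-based, so day 1 is the first
    24 hours after the first edit (an assumption of this sketch).
    """
    cohort_size = defaultdict(int)
    retained = defaultdict(int)
    for edit_times in edits_by_user.values():
        if not edit_times:
            continue
        first = min(edit_times)
        month = first.strftime("%Y-%m")
        cohort_size[month] += 1
        if any(start_day <= (t - first).days + 1 <= end_day for t in edit_times):
            retained[month] += 1
    return {month: retained[month] / cohort_size[month] for month in cohort_size}


# Made-up example data, for illustration only.
sample = {
    "a": [datetime(2019, 9, 1), datetime(2019, 9, 1) + timedelta(days=9)],
    "b": [datetime(2019, 9, 5), datetime(2019, 9, 5) + timedelta(days=20)],
    "c": [datetime(2019, 9, 10)],
    "d": [datetime(2019, 9, 12), datetime(2019, 9, 12) + timedelta(days=1)],
}
second_week = monthly_retention(sample, 8, 14)    # Day 8 to Day 14
days_15_30 = monthly_retention(sample, 15, 30)    # Day 15 to Day 30
```

Passing the window as parameters keeps the same function usable for the earlier day 2 to 7 and day 8 to 30 definitions.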

Here are the updated charts based on these revised definitions:

Second Week Retention

About 9-14% of unique junior talk contributors each month return within the second week after their first edit, with an average retention rate of 11.1%. This is slightly lower than the retention rates identified for the first 7 days (days 2-7) and shows greater fluctuations.

talk_contributor_2nd_week_retention_yoy.png (1×1 px, 95 KB)

Days 15-30 Retention

The longer-term (day 15-30) retention rate is only slightly lower than the second-week retention rate. There is an average retention rate of 10.9% (ranging from 9% to 13% due to seasonal fluctuations).

talk_contributor_15_to_30_retention_yoy.png (1×1 px, 90 KB)

Overall, it looks like talk page retention rates fluctuate very little over the month after the first seven days. It would be interesting to see how many of these users are posting a reply vs. starting a new discussion, to confirm whether some of the higher rates we see in the first seven days are largely people coming back to post a reply.

It's great to be able to differentiate between contributors coming back to participate on talk pages in the "short" vs. "long term."

Documenting a few notes from our – @MNeisler and my – conversations

  • DECIDED: What retention period(s) will we use to measure retention?
    • For now, we are going to measure both "short-term" and "long-term" retention.
      • Where "short-term" retention means the percentage of contributors who make an edit to a talk page within 8 to 14 days of making their first edit on a talk page.
      • Where "long-term" retention means the percentage of contributors who make an edit to a talk page within 15 to 30 days of making their first edit on a talk page.
  • DECIDED: For both retention periods, "short" and "long term", we will target a 5% year-over-year increase in the retention of Junior Contributors participating on talk pages [1].

  1. Based on findings in T233889#5707351 and confirmed in T233889#5726190.
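
To show what a 5% increase would mean against the averages reported earlier in this task (11.1% for days 8-14 and 10.9% for days 15-30), here is a small worked sketch. The thread does not say whether the 5% is relative to the baseline or an absolute change in percentage points, so the sketch computes both. The function name `target_rate` and its parameters are hypothetical.

```python
def target_rate(baseline, increase=0.05, relative=True):
    """Return a target retention rate from a baseline share (e.g. 0.111 for 11.1%).

    relative=True treats the increase as a proportion of the baseline;
    relative=False treats it as percentage points. Which reading applies
    to this project is an open assumption.
    """
    if relative:
        return baseline * (1 + increase)
    return baseline + increase


# Average rates reported earlier in this task.
baselines = {"short-term (days 8-14)": 0.111, "long-term (days 15-30)": 0.109}

for label, rate in baselines.items():
    print(
        f"{label}: baseline {rate:.1%}, "
        f"relative target {target_rate(rate):.2%}, "
        f"percentage-point target {target_rate(rate, relative=False):.1%}"
    )
```

The two readings differ a lot at these baselines: a relative 5% moves an 11.1% rate by roughly half a percentage point, while an absolute 5 points would move it to about 16%.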

And here, capturing a few ideas for future exploration that surfaced during this task:

  • How does short and long term retention vary by wiki?
  • How does short and long term retention vary when Junior Contributors are bucketed into finer experience level segments (e.g. 1-4, 5-99 cumulative edits)?
  • To better understand what behaviors drive retention, we might experiment with grouping users by the specific actions they take (e.g. starting new discussions, participating in existing ones).