Measure newcomer retention after their first translation
Closed, Resolved, Public

Description

As part of the success metrics identified for Content Translation version 2, we want to measure the percentage of (new) editors that completed their second translation in a month.

Measurement

  • We want to have separate measurements for (a) "new editors", and (b) "existing editors" as a reference.
  • New editors are defined as users who created their account during the last 6 months (the initial criterion used for the new editor experiences research). A user who starts multiple translations should be counted only once.
  • 30 days is the period considered for retention. For example, February stats will count the editors who published their second translation if they had made their first translation in the previous 30 days (a period covering parts of January and February in this example).
  • Data should be filterable by the following dimensions: platform, wiki, and activity.
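The measurement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the query used for this task: the input shape, the function name `second_translation_rate`, and the 183-day approximation of "6 months" are hypothetical, and account age is measured at the time of the first published translation.

```python
from datetime import date, timedelta

RETENTION_WINDOW = timedelta(days=30)    # retention period from the description
NEW_EDITOR_WINDOW = timedelta(days=183)  # assumption: "6 months" approximated as 183 days


def second_translation_rate(users):
    """Share of translators who published a second translation within 30 days.

    `users` maps a user id to (registration_date, [publication dates]).
    Returns {"new": rate, "existing": rate}; a rate is None for an empty group.
    """
    counts = {"new": [0, 0], "existing": [0, 0]}  # [retained, total] per group
    for registered, published in users.values():
        if not published:
            continue  # never published a translation, so not part of either group
        dates = sorted(published)
        first = dates[0]
        # Assumption: account age is evaluated when the first translation is published.
        group = "new" if first - registered <= NEW_EDITOR_WINDOW else "existing"
        counts[group][1] += 1  # each user is counted once, however many translations
        if len(dates) > 1 and dates[1] - first <= RETENTION_WINDOW:
            counts[group][0] += 1
    return {g: (r / t if t else None) for g, (r, t) in counts.items()}


# Hypothetical sample data, for illustration only.
example_users = {
    "a": (date(2023, 1, 1), [date(2023, 1, 10), date(2023, 1, 25)]),
    "b": (date(2023, 1, 1), [date(2023, 1, 10)]),
    "c": (date(2020, 1, 1), [date(2023, 1, 10), date(2023, 3, 1)]),
    "d": (date(2020, 1, 1), [date(2023, 1, 5), date(2023, 1, 6), date(2023, 1, 7)]),
}
rates = second_translation_rate(example_users)
```

Filtering by platform, wiki, or activity would amount to restricting `users` to the relevant subset before calling the function.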

As part of this task, once the query is complete, we will look into adding the resulting retention data to the Content Translation key metrics dashboard so that it can be reviewed and filtered there using the current filters (wiki, platform, etc.).

Representation

An example representation is shown below, where the percentage of newcomers who complete their second translation each month increases, showing that user retention improves over time (fewer users abandon the tool after their first contribution).

newcomer-retention-small.png (311×504 px, 111 KB)

Event Timeline

kzimmerman triaged this task as Medium priority. Jun 28 2019, 6:48 PM
kzimmerman subscribed.

Assignment is pending finalization of analyst point people to teams

Pginer-WMF raised the priority of this task from Medium to High. Aug 29 2019, 10:19 AM
nshahquinn-wmf lowered the priority of this task from High to Medium. Edited Oct 28 2019, 4:47 PM

I'll be working on this during the current quarter, but it doesn't need to be done immediately, so setting the priority as normal.

Based on discussions with @Pginer-WMF, this is a lower priority than T231316 and T250378, so we probably won't have a chance to do it until the next quarter.

MNeisler renamed this task from "Measure newcomer retention after their first translation for selected small wikis" to "Measure newcomer retention after their first translation". Mar 9 2023, 3:55 PM
MNeisler updated the task description.

The current scope of this task overlaps with the scope identified in T195949, so I've merged the tasks and updated the description to reflect the updated scope.

Next steps:

  • Determine if the proposed retention definition still aligns with current retention definitions used by other product teams and by Product Analytics.
  • Complete the query to pull retention data and create an aggregate dataset.
  • Share initial findings and key results.
  • Make the data viewable and filterable within the Content Translation Dashboard. Note: To reduce timeout errors, we may need to create an ETL job that will periodically calculate these metrics and save them to the Data Lake. This will be done in T287306.

@Pginer-WMF
I've provided an initial summary of my results in this document for review and discussion.

Note: The current analysis looks at a couple of different retention definitions so we can compare them and determine what would be valuable to track in a dashboard. I plan to update this as we refine the retention definition and identify any other exploratory analyses that would be worthwhile.

@Pginer-WMF
I've updated the Content Translation key metrics dashboard to include a snapshot of new translator retention data viewable by various filters (platform, wiki, activity). This is shown under the new "Translators" tab of the dashboard. Please review the new charts and let me know if you have questions or suggested changes.

A few notes:

  • The data in the dashboard currently reflects the following retention definition, which aligns with the product core metrics definition.

Of the editors who made at least one published translation in their first 30 days after creating an account, the proportion who returned to publish a translation 30 days to 60 days after their first translation. This does not exclude translations that may have been reverted or deleted.

Note: I think this definition is useful because it allows us to compare new translator retention rates to the retention rates for all new editors shown in the editors Superset dashboard. In addition, the exploratory analysis indicates that the majority of new translators do complete their first translation within 30 days of registering. However, if we find this definition too restrictive, the time from registration to first published translation can be extended.
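To make the definition above concrete, here is a minimal Python sketch. It is not the query behind the dashboard: the input shape and the name `new_translator_retention` are hypothetical, and the window boundaries (first translation at most 30 days after registration, return at least 30 and fewer than 60 days after the first translation) are one possible reading of the definition.

```python
from datetime import date, timedelta

FIRST_WINDOW = timedelta(days=30)  # registration to first published translation
RETURN_START = timedelta(days=30)  # start of the return window (assumed inclusive)
RETURN_END = timedelta(days=60)    # end of the return window (assumed exclusive)


def new_translator_retention(translators, first_window=FIRST_WINDOW):
    """Return (retained, cohort_size) for an iterable of
    (registration_date, [publication dates]) pairs."""
    retained = cohort = 0
    for registered, published in translators:
        if not published:
            continue
        dates = sorted(published)
        first = dates[0]
        if first - registered > first_window:
            continue  # first translation came too late to join the cohort
        cohort += 1
        # Retained: any later translation falls in the second 30-day period.
        if any(RETURN_START <= d - first < RETURN_END for d in dates[1:]):
            retained += 1
    return retained, cohort


# Hypothetical sample data, for illustration only.
sample = [
    (date(2022, 1, 1), [date(2022, 1, 5), date(2022, 2, 10)]),  # returns after 36 days
    (date(2022, 1, 1), [date(2022, 1, 5), date(2022, 1, 20)]),  # returns too early
    (date(2022, 1, 1), [date(2022, 3, 1)]),                     # first translation after day 30
    (date(2022, 1, 1), [date(2022, 1, 5), date(2022, 4, 1)]),   # returns too late
]
retained, cohort = new_translator_retention(sample)
```

Passing a longer `first_window` (for example `timedelta(days=183)`) corresponds to the less restrictive variant mentioned in the note, where the time from registration to first published translation is extended.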

  • Filters can be applied to review retention rates by the user's first published translation activity, platform, and language. Note: These filters currently describe the user's first published translation after registering. A user is considered retained if they publish any type of translation on any platform in their second 30 days. However, this can be adjusted if we are interested in understanding how many users return to complete the same translation activity.
  • Data currently reflects a snapshot recorded from January 2019 through January 2023 and will require manual updates to review more recent months. A job to automatically update the dataset monthly is planned as part of T287306.
  • Only a limited number of new users made a first published translation that expanded an article on mobile web, compared with those who created a new article on desktop. As a result, monthly and per-wiki retention rates fluctuate significantly, and the data is noisy, when you apply filters to limit the data to either the "expand an article" translation activity or the "mobile web" platform. As section translation deployment is expanded, we should have more data to understand retention rates for this activity. Otherwise, we may want to revisit extending the definition of new translator retention to users who publish their first translation within 6 months of registration, instead of within their first 30 days, to see if that increases our sample size.
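The effect of small cohorts on the filtered rates can be illustrated with a confidence interval. The sketch below uses the Wilson score interval, which is a standard choice for proportions but is an addition here and not part of the analysis above; the function name and the cohort sizes are hypothetical.

```python
import math


def wilson_interval(retained, cohort, z=1.96):
    """Wilson score interval for a retention proportion (z=1.96 for ~95%).

    Returns (low, high), or None when the cohort is empty.
    """
    if cohort == 0:
        return None
    p = retained / cohort
    denom = 1 + z ** 2 / cohort
    centre = (p + z ** 2 / (2 * cohort)) / denom
    half = z * math.sqrt(p * (1 - p) / cohort + z ** 2 / (4 * cohort ** 2)) / denom
    return centre - half, centre + half


# Same observed rate (10%), very different cohort sizes (hypothetical numbers).
small = wilson_interval(2, 20)
large = wilson_interval(200, 2000)
```

With the same observed rate, the interval for the 20-user cohort is much wider than for the 2,000-user cohort, which is consistent with the month-to-month swings seen when filtering down to low-volume platforms or activities.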

For an overview of the results, please also see the new translator retention results doc.

Thanks a lot, @MNeisler. The report is super useful already and has helped to identify some questions to observe and research about our users.