
Measure newcomer retention after their first translation
Closed, Duplicate · Public

Description

As part of the success metrics identified for Content Translation version 2, we want newcomers to use the tool more regularly.

Measurement

We want to measure the percentage of (new) editors that completed their second translation in a month.

  • We want to have separate measurements for (a) "new editors", and (b) "existing editors" as a reference.
  • New editors are defined as users that created their account during the last 6 months (the initial criteria used for new editor experiences research). A user starting multiple translations should be counted only once.
  • 30 days is the period considered for retention. For example, February stats will count the editors that published their second translation if they had made their first translation in the previous 30 days (a period covering part of January and February for the example).
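A minimal sketch of this measurement, assuming publication events are available as (user, publication date) pairs and account creation dates as a mapping. The function name, the input shapes, the 183-day approximation of "6 months", and the choice of denominator (all users with at least one published translation) are illustrative assumptions, not part of any existing Content Translation tooling.

```python
from collections import defaultdict
from datetime import date, timedelta

RETENTION_WINDOW = timedelta(days=30)
NEW_EDITOR_MAX_AGE = timedelta(days=183)  # assumption: "6 months" approximated as 183 days


def second_translation_rates(publications, account_created):
    """Return {cohort: share of translators whose second published
    translation came within 30 days of their first}.

    publications: iterable of (user, publication_date) pairs (hypothetical shape).
    account_created: dict mapping user -> account creation date.
    """
    dates_by_user = defaultdict(list)
    for user, published_on in publications:
        dates_by_user[user].append(published_on)

    totals = defaultdict(int)
    retained = defaultdict(int)
    for user, dates in dates_by_user.items():  # each user is counted only once
        dates.sort()
        first = dates[0]
        account_age = first - account_created[user]
        cohort = "new" if account_age <= NEW_EDITOR_MAX_AGE else "existing"
        totals[cohort] += 1
        if len(dates) > 1 and dates[1] - first <= RETENTION_WINDOW:
            retained[cohort] += 1

    return {cohort: retained[cohort] / totals[cohort] for cohort in totals}


# Made-up data for illustration only.
publications = [
    ("alice", date(2018, 1, 20)), ("alice", date(2018, 2, 10)),  # second after 21 days
    ("bob", date(2018, 1, 5)),                                   # no second translation
    ("carol", date(2018, 1, 2)), ("carol", date(2018, 2, 15)),   # second after 44 days
]
account_created = {
    "alice": date(2017, 12, 1),
    "bob": date(2017, 11, 1),
    "carol": date(2010, 1, 1),
}
rates = second_translation_rates(publications, account_created)
```

Keeping the "new" and "existing" cohorts in a single pass makes the reference comparison between the two groups straightforward.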

Representation

An example representation is shown below, where the percentage of newcomers that complete their second translation each month increases, showing that the retention of users improves over time (fewer users abandon the tool after their first contribution).

newcomer-retention-small.png (311×504 px, 111 KB)

Event Timeline

Restricted Application added a subscriber: Aklapper.
Pginer-WMF renamed this task from Measure newcomer retention to Measure newcomer retention after their first translation. May 30 2018, 9:49 AM
Pginer-WMF moved this task from Needs Triage to CX2 on the ContentTranslation board.
Vvjjkkii renamed this task from Measure newcomer retention after their first translation to a0baaaaaaa. Jul 1 2018, 1:07 AM
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
JJMC89 renamed this task from a0baaaaaaa to Measure newcomer retention after their first translation. Jul 1 2018, 2:08 AM
JJMC89 lowered the priority of this task from High to Medium.
JJMC89 updated the task description.
JJMC89 added a subscriber: Aklapper.

@Pginer-WMF it sounds like the definition of retention used here is "of the newcomers who completed a first translation in the previous 30 days, how many made a second translation in the same timespan?"

This is actually different from the definition we've been using in other places. New editor retention is: "of the users who complete a first edit in the 30 days after registration, how many make any edits in the second 30 days after registration?" We also picked something similar to define mobile retention, shifting the start from registration to the first mobile contribution: "of the users who complete a first mobile edit, how many make any mobile edits in the second 30 days after that first edit?".

If we made this metric analogous, it would be "of the newcomers/experienced editors who make a first translation, how many make any translations in the second 30 days after the first edit?". Using this would make things more consistent and understandable; do you have any objections? The one I can think of is that this means we would have a 60 day lag rather than a 30 day lag before seeing how an intervention impacted this metric, but I think even 30 days is too long for a feedback loop.
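As a rough sketch of the analogous definition, assuming each user's translation dates are available as a list: the function name, the input shape, and the exact window boundaries (days 30 through 59 after the first translation) are assumptions made for illustration.

```python
from datetime import date, timedelta


def retained_in_second_30_days(translation_dates):
    """True if any translation falls in the second 30 days after the first one.

    Assumed boundaries: the window starts 30 days after the first translation
    (inclusive) and ends 60 days after it (exclusive).
    """
    ordered = sorted(translation_dates)
    first = ordered[0]
    window_start = first + timedelta(days=30)
    window_end = first + timedelta(days=60)
    return any(window_start <= d < window_end for d in ordered[1:])


# A second translation 14 days after the first does not count under this definition.
early_only = retained_in_second_30_days([date(2019, 6, 1), date(2019, 6, 15)])
# A translation 39 days after the first does count.
in_window = retained_in_second_30_days([date(2019, 6, 1), date(2019, 7, 10)])
```

Because the window closes 60 days after the first translation, a user's status under this definition is only final once those 60 days have passed, which is the lag mentioned above.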

We've discussed this in the Product Analytics team, and it seems clear that we need a shorter metric to go along with two-month retention—maybe one week or two week retention. Would you be okay with tying this to the other retention metrics?

In T195949#4417646, @Neil_P._Quinn_WMF wrote:

@Pginer-WMF it sounds like the definition of retention used here is "of the newcomers who completed a first translation in the previous 30 days, how many made a second translation in the same timespan?"

...

If we made this metric analogous, it would be "of the newcomers/experienced editors who make a first translation, how many make any translations in the second 30 days after the first edit?". Using this would make things more consistent and understandable; do you have any objections? The one I can think of is that this means we would have a 60 day lag rather than a 30 day lag before seeing how an intervention impacted this metric, but I think even 30 days is too long for a feedback loop.

Your proposal seems good to me. My only question is about the last part ("how many make any translations in the second 30 days after the first edit?"). Does "first edit" refer to the first translation, or may it also refer to other kinds of edits? I assume it refers to the translation, meaning that we look at the people who translated again after their initial translation experience.

We've discussed this in the Product Analytics team, and it seems clear that we need a shorter metric to go along with two-month retention—maybe one week or two week retention. Would you be okay with tying this to the other retention metrics?

It makes sense to have a metric for a shorter period. It is definitely practical to have a shorter cycle for getting the results. My only concern is the expected frequency of translation activity. I expect even an ideal "retained" editor to create new articles by translating less often than they make regular edits. I don't know if that could produce some kind of noise in the data. Is that taken into account in metrics specific to article creation (which would be a more similar activity)?

In any case, I'd be happy to see these numbers (I'm just more inclined toward a 2-week period than a 1-week one, for the reason above).

Hi @Neil_P._Quinn_WMF and @Pginer-WMF, I want to make sure I understand the definition of the retention here correctly:

Let's say we want to compute the retention rate among all users who started a translation in June 2019. That is, among all users who started a translation in June (regardless of the translation's current status: draft, published, or deleted), we compute the proportion who start another translation in their second 30 days (a period covering part of July and August).
Suppose user A starts 3 translations, on 2019-06-01, 2019-06-15 and 2019-07-10. We first check A's account creation date: if 2019-06-01 - (account creation date) > 6 months, we call A an experienced user, otherwise a new user. We then take A's first translation date in June (2019-06-01) as the "birth date". Since 2019-07-10 falls in the second 30 days after A's birth date, we identify A as a retained user.
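The example of user A can be checked with a short sketch. The account creation date used here (2019-03-01) is hypothetical, since the example does not give one; the function name, the 183-day approximation of "6 months", and the window boundaries are likewise assumptions.

```python
from datetime import date, timedelta

SIX_MONTHS = timedelta(days=183)  # assumption: "6 months" approximated as 183 days


def classify_and_check(account_created, translation_starts, cohort_year, cohort_month):
    """Return (birth_date, experience, retained) for one user and one monthly
    cohort, or None if the user started no translation in that month."""
    in_month = sorted(
        d for d in translation_starts
        if (d.year, d.month) == (cohort_year, cohort_month)
    )
    if not in_month:
        return None
    birth = in_month[0]  # first translation started in the cohort month
    experience = "experienced" if birth - account_created > SIX_MONTHS else "new"
    # Second 30 days after the birth date: start inclusive, end exclusive.
    window_start = birth + timedelta(days=30)
    window_end = birth + timedelta(days=60)
    retained = any(window_start <= d < window_end for d in translation_starts)
    return birth, experience, retained


# User A from the example above, with a hypothetical account creation date.
a_starts = [date(2019, 6, 1), date(2019, 6, 15), date(2019, 7, 10)]
result = classify_and_check(date(2019, 3, 1), a_starts, 2019, 6)
```

With these inputs, A's birth date is 2019-06-01, A is classified as a new user (the account is about three months old), and A counts as retained because 2019-07-10 falls inside the second 30-day window.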

Do we care about published translation only? Or do we want to count all started translations like the example above?

Do we care about published translation only? Or do we want to count all started translations like the example above?

We were considering published translations only. Publishing a translation is what we consider to produce a valuable contribution. Thus, here we proposed to measure the people that create two new articles in 30 days using Content Translation.

If you think that measuring started translations makes more sense for some reason, we can consider it. In any case, other metrics such as T194647 help to clarify how many started translations become published translations.

kzimmerman added subscribers: pau, chelsyx, kzimmerman.

Unassigning @chelsyx because she's moving on from the Foundation.

@pau, we should touch base on what you'll need for our Product annual plan retrospective.

LGoto lowered the priority of this task from Medium to Low. May 18 2020, 4:17 PM