
Calculate adoption of Replying tool (Beta Features)
Closed, Resolved · Public

Description

Prior to deploying the Replying tool to all volunteers at our four partner wikis (T249394), we would like to know how the tool is being used/adopted as a Beta Feature.

Reason being: we would like to decide whether the jump in usage we can anticipate between the feature being available as an opt-in Beta Feature and it being available as an opt-out user Preference [1] is large enough to warrant taking an intermediate deployment step. [2]

To help evaluate the "usage/adoption," we would like to understand the following "Adoption metrics."

Timing

Q4/2019-2020

Adoption metrics

For each of our four partner wikis – Arabic, Dutch, French and Hungarian – we would like to understand the metrics described below. These metrics are sorted in order of priority: highest priority = 1; lowest priority = 6.

1. From 31-March-2020 onward, how many people have used the Reply tool?

  • How many people have made 1 edit w/ DiscussionTools?
  • How many people have made 2-5 edits w/ DiscussionTools?
  • How many people have made 5-10 edits w/ DiscussionTools?
  • How many people have made 10+ edits w/ DiscussionTools?

2. From 31-March-2020 onward, how often are people using the Reply tool to make talk page edits?

  • How many people have made ≥1 edit w/ DiscussionTools on 1 day?
  • How many people have made ≥1 edit w/ DiscussionTools on 2-5 different days?
  • How many people have made ≥1 edit w/ DiscussionTools on 5-10 different days?
  • How many people have made ≥1 edit w/ DiscussionTools on 10+ different days?

3. From 31-March-2020 onward, how many people have had access to the Reply tool?
Graphed over time, segmented by wiki

  • How many people have explicitly [3] turned on the DiscussionTools Beta Feature?
  • How many people had the DiscussionTools Beta Feature turned on for them? [4]
  • How many people have turned off the DiscussionTools Beta Feature? [5]

4. How many people should we expect to try the Replying feature when it is turned on as an opt-out user preference for all users, at our four partner wikis?

  • Upper bound: number of people who have made at least 1 edit, in any namespace, in the previous 30-day period
  • Lower bound: number of people who have made at least 1 edit in a talk namespace in the previous 30-day period

5. From 31-March-2020 onward, what proportion of their talk page edits are people making with the Reply tool?
Of the people who have made at least one edit with the Reply tool, how many of these people have made >5%, >10%, >25% and >50% of their total talk page edits using the tool?

6. ⚠️ON HOLD PER T249386#6233308: From 31-March-2020 onward, how has people's usage of the Reply tool changed over time?

  • Of the people who tried the Reply tool in Week 0, what percentage of these people made at least 1 edit with the tool at any point during Week 1, Week 2, Week 3, Week 4, Week 5, etc.
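To make the bucketing above concrete, here is a minimal sketch of how metric 1 might be computed (a sketch only: it assumes a Spark session `spark`, the wmf.mediawiki_history schema in the Data Lake, and that Reply-tool edits carry the "discussiontools" change tag – not the exact query behind the eventual report):

```
# Sketch of metric 1: bucket users by number of Reply-tool edits.
# Hypothetical field names; adjust to the actual mediawiki_history schema.
EDIT_BUCKETS = """
SELECT
  wiki_db,
  CASE
    WHEN dt_edits = 1              THEN '1 edit'
    WHEN dt_edits BETWEEN 2 AND 5  THEN '2-5 edits'
    WHEN dt_edits BETWEEN 6 AND 10 THEN '5-10 edits'
    ELSE '10+ edits'
  END AS bucket,
  COUNT(*) AS n_users
FROM (
  SELECT wiki_db, event_user_id, COUNT(*) AS dt_edits
  FROM wmf.mediawiki_history
  WHERE snapshot = '2020-06'               -- monthly snapshot
    AND event_entity = 'revision'
    AND event_type = 'create'
    AND event_timestamp >= '2020-03-31'
    AND ARRAY_CONTAINS(revision_tags, 'discussiontools')
    AND wiki_db IN ('arwiki', 'nlwiki', 'frwiki', 'huwiki')
  GROUP BY wiki_db, event_user_id
) per_user
GROUP BY 1, 2
"""
edit_buckets = spark.sql(EDIT_BUCKETS).toPandas()
```

Metric 2 has the same shape, with COUNT(DISTINCT TO_DATE(event_timestamp)) in place of the raw edit count.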

Open questions

  • Retention: how – if at all – should the way we calculate retention account for the fact there may be some weeks where people do not make any talk page edits?

Done

  • The "Adoption metrics" above have been calculated
  • Once the initial report is created, ideally, the Editing Team will be able to re-run these queries independently of Product-Analytics.

  1. https://www.mediawiki.org/wiki/Talk_pages_project/replying#Step_3:_User_Preference_(opt-out)
  2. Intermediate deployment step: e.g. deploying the feature as an opt-out user Preference to 50% of contributors on target wikis.
  3. "Explicitly": meaning they did not have the following preference checked: Automatically enable all new beta features
  4. "Turned on for them": meaning they did have the following preference checked: Automatically enable all new beta features
  5. This should include everyone, regardless of how the feature became available to them (e.g. whether they turned it on explicitly or whether it was automatically enabled for them)

Related Objects

Event Timeline

LGoto triaged this task as Medium priority. Apr 13 2020, 4:45 PM

Update: 15-April
Jotting down some notes from the meeting @Mayakp.wiki and I had today...

As this task is currently written, we will evaluate how "...intensely are people using version 1.0 of the Replying feature" by looking at how many people have made a certain number of edits with the Reply tool.

Looking at actual usage of the tool over the past two weeks, we've noticed that some people use the tool to make 10+ edits on a single day on a test page to try it out and then never use it again. Other people use the tool a couple of times on consecutive days – in some cases, nearly every day after first trying it.

The observation above has led us to think a better measure of how "intensely" the feature is being used would be the following:

  • How many people have made ≥1 edit w/ DiscussionTools on 1 day?
  • How many people have made ≥1 edit w/ DiscussionTools on 2-5 different days?
  • How many people have made ≥1 edit w/ DiscussionTools on 5-10 different days?
  • How many people have made ≥1 edit w/ DiscussionTools on 10+ different days?

The only people who would be included in this analysis are people who have made their first edit with DiscussionTools at least 10 days before the analysis is run.
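For illustration, here is a rough sketch of how the distinct-days buckets and the 10-day eligibility rule could be computed (hypothetical: it assumes a dataframe `dt_edits_df` of tagged Reply-tool revisions with columns wiki_db, user_id and event_timestamp):

```
# Sketch of the distinct-days metric with the 10-day eligibility rule.
import pandas as pd

dt = dt_edits_df.copy()
dt["day"] = pd.to_datetime(dt["event_timestamp"]).dt.normalize()

per_user = dt.groupby(["wiki_db", "user_id"]).agg(
    first_edit=("day", "min"),
    active_days=("day", "nunique"),
).reset_index()

# Only count people whose first DiscussionTools edit is at least 10 days
# old, so everyone has had a chance to reach the 10+ days bucket.
analysis_date = pd.Timestamp("2020-07-01")  # illustrative run date
eligible = per_user[(analysis_date - per_user["first_edit"]).dt.days >= 10]

buckets = pd.cut(
    eligible["active_days"],
    bins=[0, 1, 5, 10, float("inf")],
    labels=["1 day", "2-5 days", "5-10 days", "10+ days"],
)
print(eligible.groupby(["wiki_db", buckets]).size())
```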

Open questions

  • @Mayakp.wiki: is it possible for us to calculate the metrics above? If so, how does the level of effort to compute the metrics above compare to the level of effort required to compute the metrics currently listed in the task description?
  • @Mayakp.wiki: assuming we can calculate the metrics above, how soon would we be able to do so, considering that MediaWiki History is updated 1x/month?

Per my discussion with Peter today, due to changes in the deployment plans for the Replying feature, this task will be retitled and will be completed after Replying V2.0 is deployed as an Opt-in Beta Feature on the target wikis.

Moving back to Current Quarter - Product Analytics workboard

ppelberg renamed this task from Calculate Replying v1.0 Beta Feature adoption to Calculate adoption of Replying tool (Beta Features).May 2 2020, 12:46 AM
ppelberg updated the task description.

This task will begin after the deployment of Replying V2.0 as an Opt-in Beta Feature (T251654) and its data-QA.

Next step

  • @ppelberg to post comment and update task description with outcomes from the meeting @Mayakp.wiki, @MNeisler and he [i] had on 6-May RE measuring stickiness (read: retention) instead of intensity (T249386#6060130), as is currently specified.

i. It feels silly to refer to myself in the third person, but hey, who says Phabricator can't be phun.

> Next step
>   • @ppelberg to post comment and update task description with outcomes from the meeting @Mayakp.wiki, @MNeisler and he [i] had on 6-May RE measuring stickiness (read: retention) instead of intensity (T249386#6060130), as is currently specified.

I've updated the task description to include measuring retention as a proxy for the tool's stickiness.

Considering this task is intended to help us decide how broadly the Reply tool should subsequently be deployed, I think it's important for us to know, in absolute terms, how many people have tried the tool at the moment we are making this decision. As such, this task is still asking for:

  • How many people have made __ edit(s) w/ DiscussionTools?

Calculating retention
Notes from conversation with @Mayakp.wiki + @MNeisler:
In order to calculate retention, we will need to use the MediaWikiHistory table. This means:

  • We are only able to look at previous months' data. Said another way: we will not be able to look at data from the month during which the analysis is being done.

Task description update
Updating the task description to show the priority of the metrics we are seeking.

Assigning to Megan.
This is (tentatively) planned to be completed in early July 2020 per weekly 1:1 with Peter and Megan.

> Assigning to Megan.
> This is (tentatively) planned to be completed in early July 2020 per weekly 1:1 with Peter and Megan.

Perhaps this is implicit in the above, but to be sure: work on this can, ideally, start as soon as possible. AFAIK: nothing is blocking work starting on this.

Open questions

  • Retention: how – if at all – should the way we calculate retention account for the fact there may be some weeks where people do not make any talk page edits?

We have not yet developed an approach for how we can calculate the retention of the tool while also accounting for the fact there may be some weeks where people do not make any talk page edits.

As such, instead of trying to understand how sticky the tool is by looking at retention [i], we're going to try to understand the tool's stickiness by calculating the following metric: "Of the people who have made at least one edit with the Reply tool, how many of these people have made >10%, >25% and >50% of their total talk page edits using it after turning on the feature?"

Perhaps starting with this approach will help an answer to the "Open question" emerge.

The above is now represented in the task description.


i. Retention is written as "2. From 31-March-2020 onward, how has peoples' usage of the Reply tool changed over time?" in the task description.
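For concreteness, a minimal sketch of the stickiness-share metric (hypothetical: it assumes a per-user frame `talk_df` with columns talk_edits – all talk page edits made after enabling the feature – and dt_edits, the Reply-tool edits among them):

```
# Sketch of the stickiness-share buckets described above.
share = talk_df["dt_edits"] / talk_df["talk_edits"]
for threshold in (0.10, 0.25, 0.50):
    n_users = int((share > threshold).sum())
    print(f">{threshold:.0%} of talk page edits made with the Reply tool: {n_users} users")
```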

> Open questions
>   • Retention: how – if at all – should the way we calculate retention account for the fact there may be some weeks where people do not make any talk page edits?
>
> We have not yet developed an approach for how we can calculate the retention of the tool while also accounting for the fact there may be some weeks where people do not make any talk page edits.
>
> As such, instead of trying to understand how sticky the tool is by looking at retention [i], we're going to try to understand the tool's stickiness by calculating the following metric: "Of the people who have made at least one edit with the Reply tool, how many of these people have made >10%, >25% and >50% of their total talk page edits using it after turning on the feature?"

Update to approach
Documenting the outcomes of today's conversation with @Mayakp.wiki and @MNeisler:

  • To measure the tool's stickiness [i] we're going to calculate what's described in T249386#6060130 rather than calculating what's currently written in the task description: "Of the people who have made at least one edit with the Reply tool, how many of these people have made >10%, >25% and >50% of their total talk page edits using it after turning on the feature?"
    • Reason: the % metric (described above) is likely to be a noisier measure, as people with very different behavior can end up looking the same in the data. For example, both of the following people would end up in the >50% bucket: Person A, who made a total of two talk page edits, one of which was with the Reply tool; Person B, who made a total of 150 talk page edits, 75 of which were with the Reply tool.

Notes

  • To measure usage on distinct days (as is being asked for above), we will depend on MediaWiki history data, which is updated 1x/month. Meaning: we will not be able to include the current month's data in any analysis we do.
  • All of the above is now reflected in the task description.

i. Stickiness: "Do people value the tool enough to use it on subsequent days after first trying it?"

@ppelberg Here is the current report and repo on the adoption metrics.

A couple of key takeaways/notes:

  • I identified several data issues with the PrefUpdate data, which was needed to calculate the number of users that enabled or disabled the beta feature, including missing data in May (T253151) and duplicate events (T218835). I'm looking into these further, but in the meantime those numbers are likely skewed.
  • From the deployment of the reply tool through the end of May, a total of 258 users have successfully made an edit with the reply tool. The majority of reply tool users (71.2%) have made more than 1 edit using the tool, with the largest group (35.3%) making between 2 and 5 edits.
  • On the Arabic and Hungarian Wikipedias, a large portion of reply tool users (36.2% on Arabic and 45.5% on Hungarian) made over 10 edits using the reply tool. These two wikis also had the highest proportion of users who made edits on distinct days.

Let me know if you have any questions or suggested revisions.

> @ppelberg Here is the current report and repo on the adoption metrics.

Thank you for creating this, @MNeisler.

Below are the refinements you, @Mayakp.wiki and I talked about making yesterday:

1. Are we able to add additional buckets to the following metrics so we can develop a more granular understanding of how people are using the tool?

  • A) "Reply users distinct days of activity overall" | link
    • ADD "10-20 days," "30-40 days," and "40+ days" buckets to the existing, "1 day," "2-5 days," and "5-10 days" buckets?
  • B) "Overall proportion of user talk page edits made with reply tool" | link
    • ADD "over 50%," "over 75%" and "over 90%" buckets to the existing "under 5%," "over 5%," "over 10%," etc. buckets.
  • C) Number of users that made reply edits by edit count group | link
    • ADD "10-20 edits," "30-40 edits," and "40+ edits" buckets to the existing, "1 edit," "2-5 edits," and "5-10 edits" buckets.

2. Are we able to know how many people turned off the tool after trying it?

  • A) "Total Number of Users that Explicitly turned on or turned off the Beta Feature" | link
    • ADD a metric that helps us know how many distinct people explicitly turned off the DiscussionTools Beta Feature at any point after making at least one edit with the Reply tool.

3. Are we able to know how much editing experience the people who have used the Reply tool have? [i]

  • A) ADD a metric that helps us know: of the people who have made ≥1 edit with the Reply tool how many total talk page edits have they made since creating their account? Buckets: <10 total talk page edits, 10-100 talk page edits, 100-500 talk page edits, >500 talk page edits.

4. Are we able to know how many distinct people explicitly turned the Reply tool on/off?

  • A) MODIFY the metrics that depend on PrefUpdate sequence such that they help us understand the following:
    • How many people explicitly turned the tool on once?
    • How many people explicitly turned the tool off once?

Note: the refinements above are ordered by priority. "1A" = highest priority; "4" = lowest priority.


i. Two things: 1) We didn't talk about this metric yesterday (I'm stating this explicitly so as not to imply we did!) and 2) Please tell me if you think this metric is deserving of its own ticket considering we did not scope it as part of this task.

> i. Two things: 1) We didn't talk about this metric yesterday (I'm stating this explicitly so as not to imply we did!) and 2) Please tell me if you think this metric is deserving of its own ticket considering we did not scope it as part of this task.

@ppelberg Yes, can we break out this metric (#3) into its own ticket? It's different enough from the scope of this current ticket that it would be good to track separately.

> @ppelberg Yes, can we break out this metric (#3) into its own ticket? It's different enough from the scope of this current ticket that it would be good to track separately.

For sure. That ticket is here: T257252

@MMiller_WMF raised a good point: dividing the data into buckets that have different "widths" [i] makes these segments difficult to compare. [ii]

In response to the above, @MNeisler shared an alternative approach which we're going to take: revise the charts to show equal bucket widths and pull out the 1-day and 1-edit numbers as separate pieces of information.

This "1-day" and "1-edit" granularity is important for they are helpful for detecting churn, a key consideration in determining the extent to which people value the tool (the question this task is intended to help us answer).

Adjustments

With all of the above in mind, we are going to adjust the metrics as follows:

  • A) Number of users that made reply edits by edit count group | link
    • CREATE new data point that shows the number and percentage of people (overall and by wiki) who made 1 edit with the Reply tool
    • ADJUST the existing buckets to be in even increments of 10 edits [iii]. E.g. "1-10 edits," "11-20 edits," "21-30 edits," "31-40 edits," "41-50 edits," "50+ edits."
  • B) "Reply users distinct days of activity overall" | link
    • CREATE new data point that shows the number and percentage of people (overall and by wiki) who used the Reply tool on 1 day.
    • ADJUST the existing buckets to be in even increments of 10 days [iii]. E.g. "1-10 days," "11-20 days," "21-30 days," "31-40 days," "41-50 days," "51-60 days."
  • C) "Overall proportion of user talk page edits made with reply tool" | link
    • CREATE new data point that shows the number and percentage of people (overall and by wiki) who used the Reply tool for under 5% of their total talk page edits during the period.
    • ADJUST the existing buckets to be in even increments of 10 percent [iii]. E.g. "1-10 percent," "11-20 percent," "21-30 percent," "31-40 percent," "41-50 percent," "51-60 percent," "61-70 percent," "71-80 percent," "81-90 percent" and "91-100 percent."

i. E.g. right now, the "Reply users distinct days of activity overall" metric is bucketed as follows: 1 day / 2-5 days / 5-10 days / 10+ days
ii. Dividing the data into uneven buckets would make sense if we had a clear reason for doing so, as with the editor experience cutoffs that have become a convention followed across teams.
iii. @MNeisler: if you think the chart will still be clear and meaningful with a more granular bucket width, like 5 edits/days, let's go with that.
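For illustration, a sketch of the revised, even-width bucketing with the 1-edit count pulled out first (hypothetical: it reuses a per-user frame `per_user` with a dt_edits column, as produced by the inner query in the earlier sketch; the bucket edges are illustrative and, per footnote iii, may end up narrower):

```
# Sketch of the even-width buckets, with "exactly 1 edit" reported separately.
import numpy as np
import pandas as pd

one_edit = per_user["dt_edits"] == 1
print(f"Exactly 1 edit: {one_edit.sum()} users ({one_edit.mean():.1%})")

buckets = pd.cut(
    per_user.loc[~one_edit, "dt_edits"],
    bins=[1, 10, 20, 30, 40, 50, np.inf],
    labels=["2-10", "11-20", "21-30", "31-40", "41-50", "50+"],
)
print(buckets.value_counts().sort_index())
```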

> Adjustments
> With all of the above in mind, we are going to adjust the metrics as follows: […]

@ppelberg
I've made the adjustments to the buckets as specified above and also updated data through the end of June as that data is now available in mediawiki_history. Please see the new notebook.

Remaining items still to complete prior to closing out this task:

  1. "Total Number of Users that Explicitly turned on or turned off the Beta Feature"
    • ADD a metric that helps us know how many distinct people explicitly turned off the DiscussionTools Beta Feature at any point after making at least one edit with the Reply tool.
  2. MODIFY the metrics that depend on PrefUpdate sequence such that they help us understand the following:
    • How many people explicitly turned the tool on once?
    • How many people explicitly turned the tool off once?

Per discussion above, I will look at the editing experience of the reply tool users as part of T257252

> Per discussion above, I will look at the editing experience of the reply tool users as part of T257252

A quick update here: today, @MNeisler and I decided to de-prioritize work on T257252 and we have not yet set a date for when we will revisit it.

Reason: while the information T257252 is asking for would help us determine whose experiences (read: Senior Contributors' and Junior Contributors'? Just Senior Contributors'? etc.) are represented in the metrics reported in this ticket, that information is not going to inform a decision we currently need to make. As such, we are not going to work on it at this time.

@ppelberg

I finished making the following adjustments/additions to the adoption metrics report:

"Total Number of Users that Explicitly turned on or turned off the Beta Feature"
ADD a metric that helps us know how many distinct people explicitly turned off the DiscussionTools Beta Feature at any point after making at least one edit with the Reply tool?

Results Summary: A total of 67 users, or 20.4% of all reply tool users, explicitly turned off the feature after making at least 1 edit with the reply tool and did not turn it back on again. The majority of users (79.6%) who made at least 1 edit with the reply tool did not turn it off.
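(For reference, a minimal sketch of how this "turned it off after trying it" count can be derived – hypothetical frames: `pref_df` holds the PrefUpdate rows for the DiscussionTools beta preference with columns user_id, value ('on'/'off') and timestamp; `first_dt_edit` maps user_id to the timestamp of their first Reply-tool edit.)

```
# Sketch: users whose "off" event came after their first Reply-tool edit.
turn_offs = pref_df[pref_df["value"] == "off"]
merged = turn_offs.merge(first_dt_edit, on="user_id", suffixes=("_pref", "_edit"))
n = merged.loc[merged["timestamp_pref"] > merged["timestamp_edit"], "user_id"].nunique()
print(f"{n} users turned the feature off after making at least one Reply-tool edit")
```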

> MODIFY the metrics that depend on PrefUpdate sequence such that they help us understand the following:
> How many people explicitly turned the tool on once?
> How many people explicitly turned the tool off once?

Results Summary: [1, 2]

| | Number of distinct users [3] | Number of distinct users that turned the tool on or off once [4] |
| --- | --- | --- |
| turned on | 2,132 | 1,895 |
| turned off | 4,525 | 4,427 |

[1] Time Period: 31 March through 30 June 2020.
[2] PrefUpdate data quality issues: note that we are missing preference update data from 2020-05-11 through 2020-06-05 (T253151) and that some duplicate events (T218835) are being recorded.
[3] Includes users that turned the feature on and off multiple times
[4] Excludes users that turned the feature on and off multiple times

Please see the final report and repo for further details and per wiki breakdown.
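For reference, a rough sketch of the on/off-once counts (hypothetical: the event.prefupdate table, its column names, the 'discussiontools-betaenable' property name and the value encoding are all assumptions, the known PrefUpdate issues above would still need handling, and `turned_once` here only approximates footnote [4]'s exclusion of multi-togglers):

```
# Sketch of the on/off-once counts from PrefUpdate events.
ONOFF = """
SELECT
  direction,
  COUNT(*)                       AS distinct_users,
  SUM(CAST(n_events = 1 AS INT)) AS turned_once
FROM (
  SELECT
    event.userid                         AS user_id,
    IF(event.value = '1', 'on', 'off')   AS direction,  -- value encoding assumed
    COUNT(*)                             AS n_events
  FROM event.prefupdate
  WHERE event.property = 'discussiontools-betaenable'
    AND year = 2020
  GROUP BY event.userid, IF(event.value = '1', 'on', 'off')
) per_user
GROUP BY direction
"""
onoff = spark.sql(ONOFF).toPandas()
```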

> @ppelberg
> I finished making the following adjustments/additions to the adoption metrics report:

Great – thank you, Megan.

RE: "A total of 67 or 20.4% of all reply tool users explicitly turned off the feature after making at least 1 edit with the reply tool and did not turn it back on..." [1]

  • @MNeisler do we know what proportion of those "20.4% of people" turned on the tool themselves or had it turned on for them by way of having the Automatically enable most beta features preference enabled?

I ask the above because the stories below have significantly different meanings:

  • Story A: The majority of the people who made one edit with the Reply Tool and subsequently turned it off had the Reply Tool turned on for them by way of having the Automatically enable most beta features preference enabled.
  • Story B: The majority of the people who made an edit with the Reply Tool and subsequently turned it off explicitly turned on the feature in Beta Features.

  1. https://nbviewer.jupyter.org/github/wikimedia-research/Discussion-tools-analysis-2020/blob/master/Replying-Tool-Adoption-Metrics.ipynb#How-many...

> @MNeisler do we know what proportion of those "20.4% of people" turned on the tool themselves or had it turned on for them by way of having the Automatically enable most beta features preference enabled?

I don't have the data available at the moment, but this can be determined by finding the user's first preference change for that feature recorded in PrefUpdate.
If the user's first recorded action in PrefUpdate is that they turned off the feature, then we can assume that it was turned on for them by the Automatically enable most beta features preference. This would require an adjustment to the query, but it shouldn't be too complicated if we'd like that additional detail.
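A sketch of that first-event classification (hypothetical: it assumes the same `pref_df` frame of PrefUpdate rows with columns user_id, value ('on'/'off') and timestamp):

```
# Sketch: split users by whether their first recorded PrefUpdate event
# is an explicit "on" or an "off" (implying auto-enrollment).
first_events = pref_df.sort_values("timestamp").groupby("user_id").first()

# A first recorded event of 'off' implies the feature was turned on for
# the user via "Automatically enable most beta features".
auto_enrolled = set(first_events.index[first_events["value"] == "off"])
explicit = set(first_events.index[first_events["value"] == "on"])
```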

>> @MNeisler do we know what proportion of those "20.4% of people" turned on the tool themselves or had it turned on for them by way of having the Automatically enable most beta features preference enabled?

> I don't have the data available at the moment, but this can be determined by finding the user's first preference change for that feature recorded in PrefUpdate. If the user's first recorded action in PrefUpdate is that they turned off the feature, then we can assume that it was turned on for them by the Automatically enable most beta features preference.

Understood.

> This would require an adjustment to the query, but it shouldn't be too complicated if we'd like that additional detail.

Yes, can we please? After doing so, this task will be finished.

Here is a breakdown of the reply tool users who turned off the feature after making at least 1 edit. I also modified the query to remove users who turned it on and off multiple times, which decreased the overall percentage of reply tool users who turned off the feature after making at least 1 edit to 17.38%.

Overall
Auto Enrolled Reply Tool Users: 3.66%
Explicitly Enrolled Reply Tool Users: 13.72%

By Partner Wiki

| wiki | Auto Enrolled Reply Tool Users | Explicitly Enrolled Reply Tool Users |
| --- | --- | --- |
| arwiki | 3.62% | 18.07% |
| frwiki | 4.58% | 11.76% |
| huwiki | 2.33% | 6.98% |
| nlwiki | 2.04% | 18.37% |

On a per-wiki basis, the highest percentage of users who turned the feature off after turning it on was on Arabic Wikipedia (21.69%) and the lowest was on Hungarian Wikipedia (9.3%).

According to the data above, the majority of users who turned off the feature after making at least 1 edit were those who explicitly turned on the feature in Beta Features. Let me know if you have any questions or if any further detail is needed.

Updated notebook

> Here is a breakdown of the reply tool users who turned off the feature after making at least 1 edit. I also modified the query to remove users who turned it on and off multiple times, which decreased the overall percentage of reply tool users who turned off the feature after making at least 1 edit to 17.38%.

Thank you, Megan, and good call to remove people who turned the feature on/off multiple times.

> ...Let me know if you have any questions or if any further detail is needed.

This looks great. I'm going to resolve this.