Analyze use of mute preferences
Closed, Resolved (Public)

Description

The Anti-Harassment Tools Team is interested in statistics on usage of mute preferences in preparation for determining their Q4 plans. Per discussions with the team, we're interested in monthly counts of the following metrics:

  • Users turning off email
  • Users turning off email from brand new users
  • Users who add their first user to the email block list
  • Users who add their first user to the Echo notification block list
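
The two "first user added" metrics can be sketched as follows. This is a minimal illustration only: it assumes an event log of (user_id, timestamp) pairs is available, which the task does not confirm, and `monthly_first_adds` and the sample events are hypothetical.

```python
from collections import Counter
from datetime import datetime


def monthly_first_adds(events):
    """Count users per calendar month by the date of their FIRST list addition.

    `events` is an iterable of (user_id, ISO-8601 timestamp) pairs.
    This input shape is an assumption, not the actual data source.
    """
    first_seen = {}
    for user_id, ts in events:
        when = datetime.fromisoformat(ts)
        # Keep only the earliest addition for each user.
        if user_id not in first_seen or when < first_seen[user_id]:
            first_seen[user_id] = when
    return Counter(when.strftime("%Y-%m") for when in first_seen.values())


# Made-up sample events, for illustration only.
events = [
    ("u1", "2018-01-15T10:00:00"),
    ("u1", "2018-02-01T09:00:00"),  # not u1's first addition, so ignored
    ("u2", "2018-02-20T12:30:00"),
    ("u3", "2018-02-03T08:00:00"),
]
counts = monthly_first_adds(events)
```

Counting only each user's earliest addition keeps later additions by the same user from inflating the monthly figures.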

We also want to get snapshot statistics of the following measurements:

  • The extent to which users have the same names on both block lists.
  • How many names are on these lists (either as an average, a graph of the distribution, or some description of the distribution).

We'd like to know this for the following Wikipedias: English, French, German, Spanish, Russian, Italian, Dutch, Japanese, Chinese, and Portuguese.
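
For the list-size snapshot, one possible way to describe the distribution is sketched below. It assumes the per-user list lengths have already been extracted into a non-empty list of integers; `describe_list_sizes` and the sample sizes are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def describe_list_sizes(sizes):
    """Summarise how many names users keep on a block list.

    `sizes` is a non-empty list with one integer per user (an assumed input).
    """
    ordered = sorted(sizes)
    return {
        "n_users": len(ordered),
        "mean": mean(ordered),
        "median": median(ordered),
        "max": ordered[-1],
        # Maps each list size to the number of users with a list of that size.
        "histogram": dict(Counter(ordered)),
    }


# Made-up sizes, for illustration only.
summary = describe_list_sizes([1, 1, 1, 2, 3, 10])
```

Reporting the median and a histogram next to the mean is useful here because a few very long lists can pull the mean well above what a typical user has.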

Event Timeline

During the analysis of usage of the email and Echo notification block lists, an issue with spikes in usage has come up. One key question about analyzing historical data of features like these is when these features were available, and in what capacity. For example, the email block list might've been available as a beta feature for a while. Secondly, there might've been announcements of the feature being available that would affect usage. Having that kind of information available aids analysis as it removes guesswork around why certain patterns emerge, so I'm documenting that here for future reference in other analysis tasks.

@TBolliger : Thanks again for providing those tasks and dates, much appreciated, and I'll make sure to incorporate those as I continue this work.

I'm about to update the task description to reflect recent changes in needs for this work.

I have completed the snapshot analyses, as well as some data cleanup to remove the July 2017 spike in email usage from the graphs. I will do a little more work on the Echo notification usage spike in October 2017, but I suspect that could be initial adoption, in which case we might prefer to omit the first few months and start the graphs in January 2018.

Below is a table overview of the extent to which users block the same users on both lists. N set is the number of users who have at least one other user on either of the lists. N identical is the number of users who have the same set of users on both block lists, and % is N identical as a percentage of N set.

Wiki          N set    N identical    %
English       1,111    15             1.3%
French        164      1              0.6%
German        500      3              0.6%
Spanish       209      2              1.0%
Russian       142      4              2.8%
Italian       71       0              0.0%
Dutch         54       1              1.9%
Japanese      151      4              2.6%
Chinese       174      3              1.7%
Portuguese    99       0              0.0%
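
The three columns can be computed along these lines. This is a sketch under assumptions, not the notebook's actual code: it assumes each block list is available as a mapping from user to the set of users they mute, and `overlap_stats` and the sample data are hypothetical.

```python
def overlap_stats(email_lists, echo_lists):
    """Return (n_set, n_identical, pct) for two block-list mappings.

    Each argument maps a user to the set of users they have muted
    (an assumed representation of the preference data).
    """
    # N set: users with at least one entry on either list.
    users = {u for u, muted in email_lists.items() if muted}
    users |= {u for u, muted in echo_lists.items() if muted}
    n_set = len(users)
    # N identical: users whose two lists contain exactly the same names.
    n_identical = sum(
        1
        for u in users
        if set(email_lists.get(u, ())) == set(echo_lists.get(u, ()))
    )
    pct = 100.0 * n_identical / n_set if n_set else 0.0
    return n_set, n_identical, pct


# Made-up data, for illustration only.
email = {"a": {"x", "y"}, "b": {"x"}, "c": set()}
echo = {"a": {"y", "x"}, "b": {"z"}, "d": {"x"}}
n_set, n_identical, pct = overlap_stats(email, echo)
```

Under this definition a user who fills in only one list counts toward N set but never toward N identical, which is worth keeping in mind when reading the low percentages.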

Whelp, it looks like we won't want to merge the lists! (Unless people are only aware of one list, which would require some research from Claudia. We don't want to conflate usage with expectation. Regardless, for now we can delay merging the lists.)

I'm not sure why that would change anything. Just because a user fills out one list, doesn't mean they did not want the user to also be on the other.

I think merging the lists is a potential next step for these products, but it would need more research to understand why the lists do not have the same content: do users intentionally want the lists to be separate? are they aware of both lists? are they aware but find one list useless?

From my user experience perspective, the interaction is wrong, so the reason why is irrelevant since the implementation was wrong in the first place.

My assumption would be that if you are being bothered by a user on one medium, you are going to add them to that mute list. There's no need to add them to the other list unless they start bothering you there as well. Also, if you have emails disabled, there's no need to add anyone at all to that list.

I would agree with this, especially about having email disabled entirely: there's no need to have a mute list if email is completely disabled.

Please allow me to rephrase my earlier statement: Based on that snapshot data alone, we should not merge the lists, but we should gather more information because our intuition tells us that merging the lists would be a usability improvement for the target audience of these protection features.

I went back and dug into the data around the spike in Echo blacklist usage in October 2017 a bit more. From what I can tell, there doesn't seem to be a reason to suspect the data is invalid. However, the spike makes the rest of the data very difficult to interpret, and in this case we're mainly interested in development over time, with more weight on recent developments. I therefore changed the start date of the Echo blacklist usage graphs to 2018-01-01, which removes the spike.
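
The start-date change amounts to dropping every data point before the cutoff before plotting. A minimal sketch, assuming the series is a list of (date, count) pairs; `from_start_date` and the sample values are hypothetical and are not figures from the analysis:

```python
from datetime import date


def from_start_date(series, start):
    """Keep only the (day, value) points that fall on or after `start`."""
    return [(day, value) for day, value in series if day >= start]


# Made-up monthly counts, for illustration only.
series = [
    (date(2017, 10, 1), 120),  # early spike, excluded by the cutoff
    (date(2017, 12, 1), 9),
    (date(2018, 1, 1), 7),
    (date(2018, 2, 1), 8),
]
trimmed = from_start_date(series, date(2018, 1, 1))
```

Trimming at plot time leaves the underlying data untouched, so the earlier months remain available if they are needed later.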

I've uploaded the analysis notebook and the graphs to this GitHub repo.