Newcomer tasks: variant group and event alignment issues
Closed, Resolved (Public)

Description

In looking at some of our EventLogging data in conjunction with data from our variant tests (T238888), I see users who have event histories that do not match with their experiment group, or are otherwise inconsistent.

Two examples will be sent separately. Both of these users have se-task-impression events but no se-activation events, which makes it seem like they would be in the experiment group that has the module pre-initiated, but they are not (according to @nettrom_WMF's add_variant_status Python function).

So the question is how often these sorts of things happen, and whether they are common enough to worry about and fix.

Event Timeline

LGoto triaged this task as High priority. Feb 6 2020, 9:10 PM

I dug into the data we have for the two users that Marshall sent over information about. For one of the users, I could identify when they activated the Newcomer Tasks module (by comparing consecutive entries in the HomepageVisit schema). For those two visits, we had no data at all logged in HomepageModule. I noticed that the user was on mobile at the time, but don't yet know if that's a pattern.

For the second user, I identified that their first visit to the Homepage had the Newcomer Tasks module activated, meaning that they should have a corresponding entry in their user preferences to identify them as being a member of the pre-activated user group. However, when looking at their user preferences I found no Growth Experiment preferences at all.

I chose to investigate the second type of issue first. For that, I grabbed data on all first visits to the Homepage using the HomepageVisit schema, only counting those where the state of the Newcomer Tasks module was set to "activated", and excluding known test accounts. Then, I grabbed the user preferences of those users from the user_properties table on the corresponding wiki. We can then aggregate by wiki. In the table below, a "Variant" of "None" means the user preferences don't contain the expected growthexperiments-homepage-suggestededits-preactivated preference:

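| Wiki | Variant | N users | Percent |
| --- | --- | --- | --- |
| Arabic | None | 26 | 0.5 |
| Arabic | Pre-activated | 4794 | 99.5 |
| Czech | None | 0 | 0.0 |
| Czech | Pre-activated | 765 | 100.0 |
| Korean | None | 1 | 0.1 |
| Korean | Pre-activated | 735 | 99.9 |
| Vietnamese | None | 4 | 0.4 |
| Vietnamese | Pre-activated | 992 | 99.6 |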
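
For anyone wanting to reproduce this kind of breakdown, here is a minimal sketch of the aggregation step in plain Python. It is not the query actually used for the table above. The input shapes (pairs of wiki and user ID, and a dict of preference-name sets) and the function name are assumptions made for illustration; only the preference name comes from this task.

```python
from collections import Counter

# Preference name as quoted in this task; everything else is illustrative.
PREACTIVATED_PREF = "growthexperiments-homepage-suggestededits-preactivated"


def variant_breakdown(first_visits, user_properties):
    """Count users per (wiki, variant) and each group's share within its wiki.

    first_visits: iterable of (wiki, user_id) pairs, one per user whose first
        Homepage visit had the module activated (hypothetical shape).
    user_properties: dict mapping (wiki, user_id) to a set of preference names
        (hypothetical stand-in for rows of the user_properties table).
    """
    counts = Counter()
    for wiki, user_id in first_visits:
        prefs = user_properties.get((wiki, user_id), set())
        variant = "Pre-activated" if PREACTIVATED_PREF in prefs else "None"
        counts[(wiki, variant)] += 1

    # Per-wiki totals, so percentages are relative to each wiki.
    totals = Counter()
    for (wiki, _variant), n in counts.items():
        totals[wiki] += n

    return {
        key: (n, round(100 * n / totals[key[0]], 1))
        for key, n in counts.items()
    }
```

Applied to the Arabic counts above (26 of 4,820 users without the preference), the same rounding gives 0.5 percent.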

While the number of affected users is low, it's also non-zero. I went through the user_properties table for all the affected users, and many of them have no Growth preferences set at all. Some have a few, e.g. the homepage enabled and a mentor ID.

I've not yet figured out an easy way to check the converse of this: whether users in the other variant group saw the Homepage with the Newcomer Tasks module deactivated and have preferences that correspond to that. One conclusion to draw from this is that, regardless of what kind of experiment we're running, we want to store explicit group assignments somewhere so we can identify faulty instrumentation at any point.

Next, I'll go dig into the first case listed above: identify the session where a user initialized the Newcomer Tasks module and correlate that with the HomepageModule schema.

@nettrom_WMF -- thanks for the update here. In general, I think that if we are looking at 98% or 99% accuracy on our measurement, we should let the rest of the cases slide. Do you think that is the right policy, especially given the client-side logging that we're using? And that we therefore shouldn't worry about the issue that you have investigated on this task thus far?

@MMiller_WMF : my TL;DR of this is that ~0.5% of users who visit the Homepage do not have their user preferences set correctly to reflect the state of their group assignment in Homepage experiments. I find that to be a concern for two reasons: 1) we've assumed that these assignments are reliable; and 2) it's unclear at this point how this will work when we move to larger wikis.

When it comes to our use of these measurements for general reporting, I don't find this to be a concern. It's when we want to start drawing inferences based on them that I start to get worried.

I've looked into the patterns in the first case, where a user activates the Newcomer Tasks module but we don't have an action = "se-activate" event recorded in HomepageModule. To do that, I found all users (excluding known test accounts) who have two consecutive visits in HomepageVisit that showed the Newcomer Tasks module going from "inactive" to "active". Then I joined the first of those sessions with HomepageModule in order to learn if we had any client-side data at all about that visit (which is where they would activate the module). Split by wiki and desktop/mobile, here's whether we have data for those:

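| Wiki | On mobile? | Module data? | N activations | Percent |
| --- | --- | --- | --- | --- |
| Arabic | False | False | 44 | 13.2 |
| Arabic | False | True | 290 | 86.8 |
| Arabic | True | False | 36 | 6.3 |
| Arabic | True | True | 539 | 93.7 |
| Czech | False | False | 9 | 8.3 |
| Czech | False | True | 99 | 91.7 |
| Czech | True | False | 1 | 1.4 |
| Czech | True | True | 69 | 98.6 |
| Korean | False | False | 6 | 11.1 |
| Korean | False | True | 48 | 88.9 |
| Korean | True | False | 1 | 1.6 |
| Korean | True | True | 60 | 98.4 |
| Vietnamese | False | False | 7 | 6.3 |
| Vietnamese | False | True | 104 | 93.7 |
| Vietnamese | True | False | 10 | 14.5 |
| Vietnamese | True | True | 59 | 85.5 |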

For Arabic, Korean, and Vietnamese, it looks like about 10% of these transitions haven't been recorded in HomepageModule. Czech is lower (6%). There's some variation between desktop and mobile, with desktop users being more likely to have no client-side EventLogging recorded. Vietnamese, however, shows the opposite pattern.
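
The transition detection described above can be sketched roughly as follows. This is an illustration under assumptions, not the analysis code itself: the dict keys (user_id, session_id, timestamp, module_state), the idea of joining on a shared session identifier, and both function names are hypothetical, since the task does not say which fields were used for the join.

```python
from collections import defaultdict


def find_activation_visits(visits):
    """Return the visits during which a user must have activated the module.

    visits: list of dicts with hypothetical keys user_id, session_id,
        timestamp, and module_state ("inactive" or "active").
    A visit qualifies when it shows "inactive" and the same user's next
    visit shows "active"; the earlier visit of the pair is returned.
    """
    by_user = defaultdict(list)
    for visit in visits:
        by_user[visit["user_id"]].append(visit)

    activation_visits = []
    for user_visits in by_user.values():
        user_visits.sort(key=lambda v: v["timestamp"])
        # Walk over consecutive pairs of visits for this user.
        for earlier, later in zip(user_visits, user_visits[1:]):
            if (earlier["module_state"] == "inactive"
                    and later["module_state"] == "active"):
                activation_visits.append(earlier)
    return activation_visits


def share_without_module_data(activation_visits, module_session_ids):
    """Count activation visits with no HomepageModule data at all.

    module_session_ids: set of session identifiers that appear in
        HomepageModule (assumed join key).
    Returns (n_missing, percent_missing).
    """
    if not activation_visits:
        return 0, 0.0
    missing = sum(
        1 for v in activation_visits
        if v["session_id"] not in module_session_ids
    )
    return missing, round(100 * missing / len(activation_visits), 1)
```

Splitting by wiki and platform would then be a matter of grouping the activation visits before computing the share.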

One unknown factor in the above analysis is the overall blocking of EventLogging. This also came up in T241871, so I found this to be a good opportunity to dig into that a little bit. I approached it from two angles: 1) on a session basis, what proportion of Homepage visits do not have data in HomepageModule?; and 2) on a per-user basis, what percentage of visits do we have data for?

For the first of these, we can easily split between mobile and desktop because each session occurs on a specific platform. For the second, splitting becomes much more difficult, so there I've instead chosen to focus on three categories of users: those for whom we have data for none, some, or all of their sessions. As before, known test accounts are excluded from the analysis.

On a session basis, the proportions of visits for which we have no data in HomepageModule look as follows:

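| Wiki | On mobile? | Module data? | N sessions | Percent |
| --- | --- | --- | --- | --- |
| Arabic | False | False | 4,697 | 20.7 |
| Arabic | False | True | 18,003 | 79.3 |
| Arabic | True | False | 5,485 | 16.0 |
| Arabic | True | True | 28,858 | 84.0 |
| Czech | False | False | 1,991 | 18.5 |
| Czech | False | True | 8,768 | 81.5 |
| Czech | True | False | 336 | 9.3 |
| Czech | True | True | 3,277 | 90.7 |
| Korean | False | False | 2,245 | 32.0 |
| Korean | False | True | 4,767 | 68.0 |
| Korean | True | False | 1,919 | 17.7 |
| Korean | True | True | 8,909 | 82.3 |
| Vietnamese | False | False | 1,317 | 14.5 |
| Vietnamese | False | True | 7,748 | 85.5 |
| Vietnamese | True | False | 932 | 24.5 |
| Vietnamese | True | True | 2,871 | 75.5 |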

We can see that across all wikis, a fairly large proportion of sessions have no data in HomepageModule. Typically, this is more common on desktop than on mobile, but on Vietnamese this is different with mobile having a larger proportion.
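
As a rough sketch of the session-level calculation, assuming each Homepage session can be reduced to a (wiki, is_mobile, session_id) tuple and that HomepageModule can be reduced to a set of session identifiers (both shapes, and the function name, are assumptions for illustration):

```python
from collections import Counter


def missing_share_by_platform(sessions, module_session_ids):
    """Percent of sessions with no HomepageModule data, per (wiki, is_mobile).

    sessions: iterable of (wiki, is_mobile, session_id) tuples (hypothetical).
    module_session_ids: set of session identifiers seen in HomepageModule.
    """
    totals = Counter()
    missing = Counter()
    for wiki, is_mobile, session_id in sessions:
        key = (wiki, is_mobile)
        totals[key] += 1
        if session_id not in module_session_ids:
            missing[key] += 1
    return {
        key: round(100 * missing[key] / totals[key], 1)
        for key in totals
    }
```

Because every session belongs to exactly one platform, the grouping key is unambiguous here, which is not the case once the data is rolled up per user.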

When we compare the first and second table, we see a lower proportion of activations not being recorded in HomepageModule relative to what we'd expect from sessions overall. One caveat with using sessions is that they'll be biased towards highly active users. We might expect users who are very active to also be more privacy conscious, meaning they'll be overrepresented in Table 2.

I therefore also looked at all visits on a per-user basis and recorded what percentage of their visits we could match up with any data from HomepageModule. Then, I calculated percentiles on a per-wiki basis and noticed that they fall into three categories: those with data for none of their visits, some of their visits, and all of their visits. These look as follows across our four wikis:

| Wiki | % with no matches | % with all matches | % with some matches |
| --- | --- | --- | --- |
| Arabic | 10 | 75 | 15 |
| Czech | 12 | 78 | 10 |
| Korean | 12 | 73 | 15 |
| Vietnamese | 11 | 79 | 10 |
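
The per-user grouping into none, some, and all can be sketched like this. It is a simplified illustration for a single wiki, assuming visits arrive as (user_id, session_id) pairs and that a visit counts as matched when its session identifier appears in HomepageModule; those shapes and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def match_categories(visits, module_session_ids):
    """Percent of users with data for none, some, or all of their visits.

    visits: iterable of (user_id, session_id) pairs (hypothetical shape).
    module_session_ids: set of session identifiers seen in HomepageModule.
    """
    # user_id -> [matched visits, total visits]
    per_user = defaultdict(lambda: [0, 0])
    for user_id, session_id in visits:
        per_user[user_id][1] += 1
        if session_id in module_session_ids:
            per_user[user_id][0] += 1

    categories = Counter()
    for matched, total in per_user.values():
        if matched == 0:
            categories["none"] += 1
        elif matched == total:
            categories["all"] += 1
        else:
            categories["some"] += 1

    n_users = len(per_user)
    if n_users == 0:
        return {"none": 0.0, "some": 0.0, "all": 0.0}
    return {
        name: round(100 * categories[name] / n_users, 1)
        for name in ("none", "some", "all")
    }
```

Each user counts once regardless of how many visits they made, which is what removes the bias towards highly active users that the session-level numbers have.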

When we compare Table 3 and Table 1, we see that the proportion of activations without any data in HomepageModule is roughly what we'd expect, except for Czech where it's lower.

Summary: We can find visits in HomepageVisit that indicate a user has activated the Newcomer Tasks module. For ~10% of these (6% in Czech) we don't have activation data in HomepageModule. This is roughly what we'd expect based on the proportion of users who have no client-side EventLogging data captured. Homepage visits with no matching client-side data are fairly common, and appear slightly biased towards users who are more active.

T223931: Switch mw.user.sessionId back to session-cookie persistence doesn't affect this, right? That seems like a good candidate for unexpected logging behavior, but I don't think we have used plain mw.user.sessionId other than in some help panel log events.

Thanks for this work, @nettrom_WMF. Taking this all together, I don't think we have any particular changes to make, except to remember that the results we get via client-side EventLogging are a little biased toward the kind of users who permit client-side EventLogging.