
Investigate TwoColConflict opt-out metrics, explain whether there's a trend
Open, Needs Triage · Public · 3 Estimated Story Points

Assigned To
None
Authored By
awight
Jul 9 2020, 2:08 PM
Referenced Files
F32054361: image.png
Aug 11 2020, 5:02 PM
F32053571: image.png
Aug 11 2020, 5:02 PM
F32054007: image.png
Aug 11 2020, 5:02 PM
F32054216: image.png
Aug 11 2020, 5:02 PM
F32053536: image.png
Aug 11 2020, 5:02 PM
F31923381: image.png
Jul 10 2020, 12:47 PM
F31922117: image.png
Jul 9 2020, 2:08 PM

Description

At first glance, it looks like this graph shows user rejection of our feature, immediately after first use:

https://grafana.wikimedia.org/d/000000346/mediawiki-twocolconflict?orgId=1&from=1578578404972&to=1594303204973

image.png (794×1 px, 106 KB)

However, the migration we engineered converts each beta-preference opt-out into a new-preference opt-out by design, and this might be causing most of the effect seen here.

Review the daily aggregation script and modify it to calculate another field: the total of both beta and new opt-outs. This should make the real trend more obvious.

Acceptance criteria:

  • New metrics will include the sum of beta- and new-preference opt-out.
    • Can stop recording the old metric, we'll never migrate back to it.
  • Grafana shows new metric.
    • Can include the old metric when plotting historical metrics.
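
A minimal sketch of the combined metric described above, assuming the aggregation already yields per-wiki counts of users with the beta preference disabled and users with the new preference disabled. The function and argument names are hypothetical, and the real script in analytics/wmde/scripts may be structured quite differently.

```python
from collections import Counter


def combined_opt_outs(beta_opt_outs, new_opt_outs):
    """Sum beta-preference and new-preference opt-out counts per wiki.

    Both arguments are hypothetical dicts mapping wiki name -> user count.
    A user migrated from the beta preference to the new one moves from one
    dict to the other, so the per-wiki sum is unchanged by the migration.
    """
    total = Counter(beta_opt_outs)
    total.update(new_opt_outs)  # Counter.update adds counts instead of replacing
    return dict(total)


# Made-up numbers, for illustration only.
day_1 = combined_opt_outs({"dewiki": 10, "enwiki": 40}, {"dewiki": 0, "enwiki": 5})
day_2 = combined_opt_outs({"dewiki": 7, "enwiki": 30}, {"dewiki": 3, "enwiki": 16})
print(day_1)
print(day_2)
```

In the made-up numbers, the dewiki total stays flat while three users are migrated, and the enwiki total rises by one, which would correspond to a single genuine opt-out.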

In a follow-up task, find the number of unique users of the feature per day and compare it to the disablement count. It's unclear how to interpret this ratio, but hopefully it approximates user satisfaction.

Event Timeline

awight set the point value for this task to 2. Jul 9 2020, 2:08 PM

Change 611279 had a related patch set uploaded (by Awight; owner: Awight):
[analytics/wmde/scripts@master] Change metric for TwoColConflict disables

https://gerrit.wikimedia.org/r/611279

I still don't know how to analyze the historical data made using the bad metric. A really crude approach is to compare the derivative after changing metrics to the old derivative, and treat anything above zero as a real opt-out.
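
One possible reading of this crude approach, as a sketch: take the day-over-day change of each series and, for the combined metric, count only increases as real opt-outs. The series values and function names are invented for illustration, and this is not what the aggregation script does.

```python
def daily_deltas(series):
    """First difference of a list of daily totals (a discrete derivative)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def real_opt_outs(combined_series):
    """Keep only the increases in the combined (beta + new) total.

    Assumption: migration does not change the combined total, so any rise
    is a genuine opt-out. Decreases (users opting back in) are clipped to 0.
    """
    return [max(delta, 0) for delta in daily_deltas(combined_series)]


old_metric = [100, 400, 650, 850]  # hypothetical, inflated by migrations
combined = [900, 902, 901, 905]    # hypothetical combined totals

print(daily_deltas(old_metric))  # [300, 250, 200]
print(real_opt_outs(combined))   # [2, 0, 4]
```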

The indicator which looked concerning was the share of successful resolutions where TwoColConflict was used. However, in the 8 months before small default deployment (2020-03-25), the average share was 18% and since deployment, 17%. A small increase would have been nicer but I think we're well within a margin of random fluctuations. There seem to be some short-term trends in the data, like an increase in adoption early this year ending at roughly the time of the small default deployment and Kurier coverage, but neither seems significant in the long view.

image.png (780×1 px, 110 KB)
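
For reference, a before/after comparison like the one above can be sketched as follows. The input shape (a dict of date to share) and the function name are assumptions, and the sample values are invented. A plain comparison of means says nothing about significance.

```python
from datetime import date
from statistics import mean

DEPLOYMENT = date(2020, 3, 25)  # small default deployment, from the comment above


def mean_share_before_after(daily_share, cutoff=DEPLOYMENT):
    """Return (mean before cutoff, mean on or after cutoff) of a daily share.

    daily_share is a hypothetical dict mapping date -> fraction of successful
    conflict resolutions that used TwoColConflict.
    """
    before = [share for day, share in daily_share.items() if day < cutoff]
    after = [share for day, share in daily_share.items() if day >= cutoff]
    return mean(before), mean(after)


# Invented values, chosen only to exercise the function.
sample = {
    date(2020, 3, 23): 0.25,
    date(2020, 3, 24): 0.75,
    date(2020, 3, 25): 0.5,
    date(2020, 3, 26): 0.25,
}
print(mean_share_before_after(sample))  # (0.5, 0.375)
```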

awight changed the point value for this task from 2 to 3.

Increasing the estimate because I'd overlooked updating the graph.

I just learned that PrefUpdate has not been under a 90-day purge regime (it will be soon, or will at least be sanitized further). I took advantage of this situation to make a dump of TwoColConflict preferences:

-- Temporary snapshot of TwoColConflict preference updates; drop after aggregating.
CREATE TABLE awight.twocol_prefupdates AS
SELECT *
FROM event_sanitized.prefupdate
WHERE event.property IN ('twocolconflict', 'twocolconflict-enabled')
  AND year = 2020;

We need to drop this data after aggregating to get the info we need.

Another twist: we fool the WikimediaEvents PrefUpdateInstrumentation by hooking into UserLoadOptions. Because of this, the PrefUpdate hook is unable to tell when we've changed a value, and nothing in the history seems to be helpful for us. Deleting my temporary table now.

Change 611279 merged by jenkins-bot:
[analytics/wmde/scripts@master] Change metric for TwoColConflict disables

https://gerrit.wikimedia.org/r/611279

Remaining work is to check on the new metric after the nightly job runs, and wire it into Grafana.

Lena_WMDE changed the point value for this task from 3 to 1. Jul 23 2020, 10:13 AM

The new metric isn't wired into Grafana yet.

This comment was removed by awight.

I didn't realize, but we need to merge the patch to the production branch in order to deploy.

@Addshore @Ladsgroup When either of you has time, please cherry-pick https://gerrit.wikimedia.org/r/c/analytics/wmde/scripts/+/611279 to production, or add me to the analytics-wmde gerrit group. Thanks in advance!

awight updated the task description. (Show Details)
awight changed the point value for this task from 1 to 3.
awight moved this task from Sprint Backlog to Doing on the WMDE-QWERTY-Sprint-2020-07-22 board.

Digging around in the existing data heap while we're blocked by lack of metrics...

There has been a pretty steady rate of new beta opt-ins since the feature was launched, averaging about +150 per day, but don't believe it yet (see below):

image.png (252×864 px, 31 KB)

image.png (257×863 px, 17 KB)

We have a reason to be suspicious of this metric because it includes some number of people who have "Automatically enable all new beta features" enabled. For these people, their first preferences page visit to save *any* preference will result in a new opt-in. A count of only the intentional opt-ins would show logarithmic decay to zero as the pool of interested editors who have not enabled the feature yet grows smaller. Instead, there's a strong logarithmic component for the first couple of months, plus a constant component which is well above zero and steady over the long term:

image.png (256×844 px, 24 KB)
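
To make the "decaying component plus constant component" picture concrete, here is a rough sketch that estimates the constant floor from the tail of a daily series and subtracts it. It rests on the assumption that intentional opt-ins have decayed to roughly zero by the end of the series. The data is synthetic, the names are hypothetical, and this does not separate intentional from automatic opt-ins in real data.

```python
from statistics import median


def split_baseline(daily_opt_ins, tail_days=30):
    """Split daily opt-ins into a constant baseline and the remainder.

    Assumption: by the last `tail_days` days the intentional opt-ins have
    decayed to roughly zero, so the median of the tail approximates the
    constant (possibly automatic) component.
    """
    baseline = median(daily_opt_ins[-tail_days:])
    remainder = [max(count - baseline, 0) for count in daily_opt_ins]
    return baseline, remainder


# Synthetic series: a decaying component on top of a constant 150 per day.
synthetic = [150 + 800 // (day + 1) ** 2 for day in range(120)]
baseline, remainder = split_baseline(synthetic)
print(baseline)
print(remainder[:3])
```

On the synthetic series the recovered baseline equals the constant that was put in. On real data the tail may still contain intentional opt-ins, so the estimate would be an upper bound at best.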

Comparing all beta feature counts shows a mix of similar artifacts:

image.png (287×1 px, 99 KB)

IMO this strongly confirms the automatic opt-in theory, and the middle cluster is probably the set of currently available beta features on most wikis. Filtering to just this cluster:

image.png (301×921 px, 50 KB)

The dashed line is the TwoColConflict opt-ins. What I think I'm seeing is that our feature is getting the least intentional opt-in, possibly zero, compared to the other features. This sorta makes sense since our feature is so niche.

tl;dr. I haven't found a way to tease apart the intentional opt-ins from the automatic opt-ins, but I can already say that the metric we have is not useful because the signal has been drowned out by the noise, and I can guess that no metrics will be useful for counting opt-*in* because of the small trickle relative to other factors.

Change 619378 had a related patch set uploaded (by Ladsgroup; owner: Awight):
[analytics/wmde/scripts@production] Change metric for TwoColConflict disables

https://gerrit.wikimedia.org/r/619378

Change 619378 merged by jenkins-bot:
[analytics/wmde/scripts@production] Change metric for TwoColConflict disables

https://gerrit.wikimedia.org/r/619378

There could be a serious problem with our preference migration. I tried to validate the strange numbers coming from the new code (600k users opted out), and they seem correct. On enwiki alone:

SELECT COUNT(user_name)
FROM user_properties
JOIN user ON up_user = user_id
WHERE up_property = 'twocolconflict'
  AND up_value = 0;

-> 257946

Totals from other wikis:
dewiki -> 25356
fawiki -> 6714
trwiki -> 5465

While commenting on https://gerrit.wikimedia.org/r/c/mediawiki/extensions/TwoColConflict/+/620014 I realized what the large, steadily rising green curve in the first screenshot means: it is the number of users[1] who log in to their Wikimedia project for the first time after we took TwoColConflict out of Beta. Every time such a user[1] logs in for the first time, the number for that day increases by one. The curve slows down after a while and will ultimately flatten, because more and more of the active users have already logged in at least once in the meantime.

[1] It's not all users, but only users who have changed something in their settings since TwoColConflict entered Beta.
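
A toy sketch of the mechanism described above: counting each user only on the day of their first login after the change gives a curve that rises quickly and then flattens. The event format and function names are assumptions, and the input is taken to be already restricted to the users covered by footnote [1].

```python
from collections import Counter


def first_logins_per_day(login_events):
    """Count, per day, the users whose first login in the data falls on that day.

    login_events is a hypothetical iterable of (user, day) pairs covering only
    logins after the feature left Beta; days are integers for simplicity.
    """
    first_day = {}
    for user, day in login_events:
        if user not in first_day or day < first_day[user]:
            first_day[user] = day
    return Counter(first_day.values())


def cumulative(per_day, days):
    """Running total of per-day counts over the given ordered days."""
    total, curve = 0, []
    for day in days:
        total += per_day.get(day, 0)
        curve.append(total)
    return curve


events = [("a", 1), ("b", 1), ("a", 2), ("c", 2), ("b", 3), ("a", 3)]
per_day = first_logins_per_day(events)
print(cumulative(per_day, [1, 2, 3]))  # [2, 3, 3]
```

Repeat logins by users "a" and "b" add nothing after day 1, which is why the running total stops growing once most active users have logged in at least once.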

This is the number of users[1] who log in to their Wikimedia project for the first time after we took TwoColConflict out of Beta. Every time such a user[1] logs in for the first time, the number for that day increases by one.

I think that's correct. It's not a graph of actual opt-outs.

Removing task assignee due to inactivity as this open task has been assigned for more than two years. See the email sent to the task assignee on August 22nd, 2022.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome!
If this task has been resolved in the meantime, or should not be worked on ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator. Thanks!