This task covers making sure we are logging data in a way that lets us analyze the impact of the VE as default A/B test.
= Test metrics =
- **Edit completion rate**
-- Do contributors in one test group complete edits at a higher rate than contributors in the other test group?
- **Total number of completed edits**
-- Do contributors in one test group complete more edits than contributors in the other test group?
- **Time to save an edit**
-- Do contributors in one test group complete their edits more quickly than contributors in the other test group? This is a metric we would need to look at alongside other measures, like the size of the edits being made.
- **Edit size**
-- Do contributors in one bucket make larger edits than contributors in the other bucket?
- **Editor retention**
-- Are contributors in one test group more likely to come back to edit again than contributors in the other test group?
- **Edit quality**
-- Are contributors’ edits in one test group more likely to be reverted than contributors’ edits in another test group?
- **Editing interface switching**
-- Are contributors in one test group switching between editing interfaces more often than contributors in another test group?
-- See T221191
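
As a rough illustration of the first metric, the sketch below computes an edit completion rate per bucket from the same `event.editattemptstep` fields used in the checks further down. It is only a sketch under assumptions that are not stated in this task: "completion" is taken to mean an editing session with at least one `init` event that also logged a `saveSuccess`, and sampling is assumed to keep or drop whole editing sessions, so that a sampled session's `init` and `saveSuccess` events are both present.

```lang=sql,lines=5
-- Sketch only: the completion definition and per-session sampling are assumptions
select
    bucket,
    sum(cast(saves > 0 as int)) / count(*) as edit_completion_rate
from (
    select
        event.bucket as bucket,
        event.editing_session_id as attempt_id,
        sum(cast(event.action = "init" as int)) as inits,
        sum(cast(event.action = "saveSuccess" as int)) as saves
    from event.editattemptstep
    where
        event.platform = "phone" and
        event.bucket in ("default-source", "default-visual") and
        event.user_editcount < 100 and
        -- Drop oversampled events so both buckets are sampled at the same rate
        not event.is_oversample and
        wiki in (
            'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
            'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
            'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
            'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
        ) and
        year = 2019 and (
            (month = 6 and day >= 28) or
            (month = 7)
        )
    group by event.bucket, event.editing_session_id
) attempts
-- Only count sessions where we actually saw the editor being opened
where inits > 0
group by bucket
```

If sampling turns out to be per event rather than per session, this ratio would be biased and the query would need to be rethought.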
= Checks =
This section contains the //latest// results of each data check (so as problems are solved, the results will change to reflect that).
== Events by platform and registration status ==
Since 3 July, when we applied the big fix (T221197#5305239), the number of phone events has been recovering to pre-bug levels (note that the 4 July data is partial).
**Verdict: all good!**
```lang=sql,lines=5
select
date_format(dt, "yyyy-MM-dd") as date,
sum(cast(event.platform = "desktop" and event.user_id != 0 as int)) as registered_desktop,
sum(cast(event.platform = "desktop" and event.user_id = 0 as int)) as anonymous_desktop,
sum(cast(event.platform = "phone" and event.user_id != 0 as int)) as registered_phone,
sum(cast(event.platform = "phone" and event.user_id = 0 as int)) as anonymous_phone
from event.editattemptstep
where
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
group by date_format(dt, "yyyy-MM-dd")
order by date
limit 100
```
| date | registered_desktop | anonymous_desktop | registered_phone | anonymous_phone
| ----- | ----- | ----- | ----- | -----
| 2019-06-24 | 108487 | 234554 | 33555 | 299146
| 2019-06-25 | 111791 | 222482 | 26893 | 227859
| 2019-06-26 | 107107 | 232324 | 55 | 1410
| 2019-06-27 | 101278 | 212925 | 11 | 480
| 2019-06-28 | 100611 | 211608 | 0 | 7600
| 2019-06-29 | 96483 | 179354 | 0 | 62548
| 2019-06-30 | 104681 | 189258 | 0 | 91495
| 2019-07-01 | 114615 | 227053 | 0 | 97624
| 2019-07-02 | 109774 | 227518 | 0 | 106471
| 2019-07-03 | 108142 | 226114 | 1268 | 115234
| 2019-07-04 | 68504 | 151804 | 27042 | 246107
== Inits per bucket in test population ==
We are logging a reasonable number of inits in each bucket among the expected experiment population (i.e. users on the selected wikis with fewer than 100 edits). Now that data from registered users has returned, we are seeing a small number of unbucketed inits, which makes sense: some users in the population stay out of the buckets because they already had the sticky preference set.
With oversampling, we see a lot more events in the visual editing bucket because only mobile VE data is oversampled. Without oversampling, the number of inits in each bucket is pretty close, as expected.
**Verdict: we should oversample mobile wikitext events at our test wikis.**
**With oversampling**:
```lang=sql,lines=5
select
date_format(dt, "yyyy-MM-dd") as date,
event.bucket as bucket,
count(*) as events
from event.editattemptstep
where
event.platform = "phone" and
event.action = "init" and
event.user_editcount < 100 and
wiki in (
'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
) and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
group by date_format(dt, "yyyy-MM-dd"), event.bucket
```
| date | no bucket | default-source | default-visual
| ----- | ----- | ----- | -----
| 2019-06-24 | 6422 | 0 | 0
| 2019-06-25 | 5107 | 0 | 0
| 2019-06-26 | 23 | 0 | 0
| 2019-06-27 | 10 | 0 | 0
| 2019-06-28 | 2 | 220 | 2374
| 2019-06-29 | 1 | 1785 | 20300
| 2019-06-30 | 1 | 2598 | 29779
| 2019-07-01 | 0 | 2788 | 32164
| 2019-07-02 | 0 | 2932 | 35256
| 2019-07-03 | 0 | 3034 | 35754
| 2019-07-04 | 14 | 2092 | 22830
**Without oversampling**:
```lang=sql,lines=5
select
date_format(dt, "yyyy-MM-dd") as date,
event.bucket as bucket,
count(*) as events
from event.editattemptstep
where
event.platform = "phone" and
event.action = "init" and
event.user_editcount < 100 and
not event.is_oversample and
wiki in (
'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
) and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
group by date_format(dt, "yyyy-MM-dd"), event.bucket
```
| date | no bucket | default-source | default-visual
| ----- | ----- | ----- | -----
| 2019-06-24 | 5252 | 0 | 0
| 2019-06-25 | 4111 | 0 | 0
| 2019-06-26 | 21 | 0 | 0
| 2019-06-27 | 6 | 0 | 0
| 2019-06-28 | 2 | 167 | 152
| 2019-06-29 | 0 | 1431 | 1309
| 2019-06-30 | 1 | 2134 | 1858
| 2019-07-01 | 0 | 2303 | 2132
| 2019-07-02 | 0 | 2489 | 2200
| 2019-07-03 | 0 | 2527 | 2259
| 2019-07-04 | 15 | 1763 | 1579
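
To check the explanation above more directly, one could break the inits down by bucket, editor and oversampling flag in a single query. This is a minimal sketch that reuses only fields already used in the queries on this task. If the explanation holds, the rows with `is_oversample = true` should be almost entirely `visualeditor`.

```lang=sql,lines=5
-- Sketch: where do the oversampled inits come from?
select
    event.bucket as bucket,
    event.editor_interface as editor,
    event.is_oversample as is_oversample,
    count(*) as events
from event.editattemptstep
where
    event.platform = "phone" and
    event.action = "init" and
    event.user_editcount < 100 and
    wiki in (
        'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
        'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
        'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
        'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
    ) and
    year = 2019 and (
        (month = 6 and day >= 28) or
        (month = 7)
    )
group by event.bucket, event.editor_interface, event.is_oversample
order by bucket, editor, is_oversample
```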
== Inits per bucket among experienced editors ==
Editors at test wikis with 100 edits or more should //not// be bucketed. Since the big fix, this seems to be happening (if the numbers seem low, consider that it's a small number of wikis and that these users are probably using wikitext, which is 1/16 sampled).
The single such editor on 4 July who fell into the `default-source` bucket also seems right, since editors might occasionally move into this edit count range //after// being bucketed.
**Verdict: all good!**
```lang=sql,lines=5
select
date_format(dt, "yyyy-MM-dd") as date,
event.bucket as bucket,
count(*) as events
from event.editattemptstep
where
event.platform = "phone" and
event.action = "init" and
event.user_editcount >= 100 and
wiki in (
'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
) and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
group by date_format(dt, "yyyy-MM-dd"), event.bucket
```
| date | default-source | no bucket
| ----- | ----- | -----
| 2019-06-24 | 0 | 129
| 2019-06-25 | 0 | 121
| 2019-06-26 | 0 | 1
| 2019-06-27 | 0 | 0
| 2019-06-28 | 0 | 0
| 2019-06-29 | 0 | 0
| 2019-06-30 | 0 | 0
| 2019-07-01 | 0 | 0
| 2019-07-02 | 0 | 0
| 2019-07-03 | 0 | 5
| 2019-07-04 | 1 | 113
== Saves per bucket ==
In our test population, without oversampling, all our saves have a bucket and the numbers are relatively even.
**Verdict: all good!**
```lang=sql, lines=5
select
date_format(dt, "yyyy-MM-dd") as date,
event.bucket as bucket,
count(*) as events
from event.editattemptstep
where
event.platform = "phone" and
event.action = "saveSuccess" and
event.user_editcount < 100 and
not event.is_oversample and
wiki in (
'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
) and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
group by date_format(dt, "yyyy-MM-dd"), event.bucket
```
| date | no bucket | default-source | default-visual
| ----- | ----- | ----- | -----
| 2019-06-24 | 143 | 0 | 0
| 2019-06-25 | 107 | 0 | 0
| 2019-06-28 | 0 | 4 | 3
| 2019-06-29 | 0 | 56 | 41
| 2019-06-30 | 0 | 58 | 39
| 2019-07-01 | 0 | 50 | 43
| 2019-07-02 | 0 | 72 | 44
| 2019-07-03 | 0 | 69 | 55
| 2019-07-04 | 0 | 49 | 41
== Buckets vs. actual editors used ==
Within each bucket, the vast majority of inits are coming from the default editor. This suggests the default is working correctly, but users still have the ability to switch.
**Verdict: all good!**
```lang=sql, lines=5
select
date_format(dt, "yyyy-MM-dd") as date,
sum(cast(
array(event.bucket, event.editor_interface) in (
array("default-source", "wikitext"),
array("default-visual", "visualeditor")
)
as int)) / count(*) as default_editor_inits
from event.editattemptstep
where
event.platform = "phone" and
event.action = "init" and
event.bucket in ("default-source", "default-visual") and
wiki in (
'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
) and
year = 2019 and (
(month = 6 and day >= 28) or
(month = 7)
)
group by date_format(dt, "yyyy-MM-dd")
```
| date | default_editor_inits
| ----- | -----
| 2019-06-28 | 97.6%
| 2019-06-29 | 98.1%
| 2019-06-30 | 98.4%
| 2019-07-01 | 98.4%
| 2019-07-02 | 98.6%
| 2019-07-03 | 98.5%
| 2019-07-04 | 97.9%
== Buckets per user ==
So far during the test, we haven't seen any users switching to a different bucket.
**Verdict: all good!**
```lang=sql, lines=5
select
user_type,
sum(cast(distinct_buckets > 1 as int)) / count(*) as multiple_buckets
from (
select
if(event.user_id = 0, "anonymous", "registered") as user_type,
count(distinct event.bucket) as distinct_buckets
from event.editattemptstep
where
event.platform = "phone" and
wiki in (
'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
) and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
group by
wiki,
if(event.user_id = 0, event.anonymous_user_token, event.user_id),
if(event.user_id = 0, "anonymous", "registered")
) buckets_per_user
group by user_type
```
| user_type | multiple_buckets
| ----- | -----
| anonymous | 0.0%
| registered | 0.0%
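
If the share above ever rises above zero, a follow-up query along these lines could list the affected users for inspection. It is a sketch, not something that has been run for this task. It casts `event.user_id` to a string so both branches of the `if` have the same type, and it skips anonymous events without a token so they are not lumped together into one pseudo-user. Note that `collect_set` skips NULL values, so if "no bucket" is stored as NULL (an assumption), a user who moved from no bucket into a bucket would not show up here.

```lang=sql,lines=5
-- Sketch: list users who logged events in more than one bucket
select *
from (
    select
        wiki,
        if(
            event.user_id = 0,
            event.anonymous_user_token,
            cast(event.user_id as string)
        ) as user_key,
        collect_set(event.bucket) as buckets
    from event.editattemptstep
    where
        event.platform = "phone" and
        -- Anonymous events without a token cannot be tied to a single user
        (event.user_id != 0 or event.anonymous_user_token is not null) and
        wiki in (
            'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
            'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
            'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
            'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
        ) and
        year = 2019 and (
            (month = 6 and day >= 28) or
            (month = 7)
        )
    group by
        wiki,
        if(
            event.user_id = 0,
            event.anonymous_user_token,
            cast(event.user_id as string)
        )
) buckets_per_user
where size(buckets) > 1
limit 100
```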
== Anonymous user tokens ==
Since 28 June, essentially all events from our anonymous users carry a token. On a couple of days, 0.1% are missing the token, but that's too small to worry about. It may be something to do with users who don't allow cookies at all.
**Verdict: all good!**
```lang=sql, lines=5
select
date_format(dt, "yyyy-MM-dd") as date,
sum(cast(event.anonymous_user_token is not null as int)) / count(*) as anonymous_users_with_tokens
from event.editattemptstep
where
event.platform = "phone" and
event.user_id = 0 and
wiki in (
'azwiki', 'bgwiki', 'zh_yuewiki', 'cawiki', 'hrwiki',
'dawiki', 'etwiki', 'fiwiki', 'elwiki', 'huwiki',
'mswiki', 'mlwiki', 'nowiki', 'ptwiki', 'rowiki',
'srwiki', 'svwiki', 'tawiki', 'thwiki', 'urwiki'
) and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
group by date_format(dt, "yyyy-MM-dd")
```
| date | anonymous_users_with_tokens
| ----- | -----
| 2019-06-24 | 0.0%
| 2019-06-25 | 0.0%
| 2019-06-26 | 0.0%
| 2019-06-27 | 0.0%
| 2019-06-28 | 99.9%
| 2019-06-29 | 100.0%
| 2019-06-30 | 100.0%
| 2019-07-01 | 100.0%
| 2019-07-02 | 100.0%
| 2019-07-03 | 100.0%
| 2019-07-04 | 99.9%
== New revision IDs in saveSuccess events ==
As of 4 July, we are still not logging the new revision ID outside the desktop wikitext editor, since the patch hasn't ridden the train yet.
**Verdict: re-run query after next week's train to make sure the patch has the intended effect.**
```lang=sql, lines=5
with saves as (
select
dt,
event.editing_session_id as attempt_id,
event.revision_id as revision_id,
event.platform as platform,
event.editor_interface as editor
from event.editattemptstep
where
event.action = "saveSuccess" and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
) and
-- Remove Flow and other non-standard edits
event.integration = "page"
),
pre_saves as (
select
event.editing_session_id as attempt_id,
max(event.revision_id) as max_revision_id
from event.editattemptstep
where
event.action != "saveSuccess" and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
) and
-- Remove Flow and other non-standard edits
event.integration = "page"
group by event.editing_session_id
)
select
date_format(dt, "yyyy-MM-dd") as date,
platform,
editor,
sum(cast(saves.revision_id > pre_saves.max_revision_id as int)) / count(*) as save_has_greater_revision_id
from saves
left join pre_saves
on saves.attempt_id = pre_saves.attempt_id
group by
date_format(dt, "yyyy-MM-dd"),
platform,
editor
```
| date | desktop visualeditor | desktop wikitext | desktop wikitext-2017 | phone visualeditor | phone wikitext
| ----- | ----- | ----- | ----- | ----- | -----
| 2019-06-24 | 0.0% | 96.4% | 0.0% | 0.0% | 0.0%
| 2019-06-25 | 0.0% | 95.6% | 0.0% | 0.0% | 0.0%
| 2019-06-26 | 0.0% | 95.9% | 0.0% | 0.0% | 0.0%
| 2019-06-27 | 0.0% | 95.9% | 0.0% | 0.0% | 0.0%
| 2019-06-28 | 0.0% | 96.1% | 0.0% | 0.0% | 0.0%
| 2019-06-29 | 0.0% | 96.7% | 0.0% | 0.0% | 0.0%
| 2019-06-30 | 0.0% | 96.8% | 0.0% | 0.0% | 0.0%
| 2019-07-01 | 0.0% | 96.2% | 0.0% | 0.0% | 0.0%
| 2019-07-02 | 0.0% | 96.1% | 0.0% | 0.0% | 0.0%
| 2019-07-03 | 0.0% | 96.4% | 0.0% | 0.0% | 0.0%
| 2019-07-04 | 0.0% | 96.3% | 0.0% | 0.0% | 0.0%
== Editor switching ==
We are now seeing editor switch data coming in for the mobile editors. The tracking patch for desktop has been merged and should ride the July 9 train.
**Verdict: re-run query after desktop patch rides the train.**
```lang=sql, lines=5
select
date,
editor,
sum(cast(switches >= 1 as int)) / count(*) as sessions_with_switches
from (
select
date_format(dt, "yyyy-MM-dd") as date,
event.editing_session_id as editingsessionid,
concat(event.platform, " ", event.editor_interface) as editor
from event.editattemptstep
where
event.action = "ready" and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
) readies
left join (
select
event.editingsessionid as editingsessionid,
count(*) as switches
from event.visualeditorfeatureuse
where
event.feature = "editor-switch" and
year = 2019 and (
(month = 6 and day > 23) or
(month = 7)
)
group by event.editingsessionid
) switches
on readies.editingsessionid = switches.editingsessionid
group by date, editor
```
| date | desktop visualeditor | desktop wikitext | desktop wikitext-2017 | phone visualeditor | phone wikitext
| ----- | ----- | ----- | ----- | ----- | -----
| 2019-06-24 | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
| 2019-06-25 | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
| 2019-06-26 | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
| 2019-06-27 | 0.0% | 0.0% | 0.0% | 0.0% | 0.0%
| 2019-06-28 | 0.0% | 0.0% | 0.0% | 1.5% | 0.0%
| 2019-06-29 | 0.0% | 0.0% | 0.0% | 1.1% | 0.9%
| 2019-06-30 | 0.0% | 0.0% | 0.0% | 0.9% | 0.6%
| 2019-07-01 | 0.0% | 0.0% | 0.0% | 1.1% | 0.8%
| 2019-07-02 | 0.0% | 0.0% | 0.0% | 1.2% | 0.8%
| 2019-07-03 | 0.0% | 0.0% | 0.0% | 1.1% | 0.7%
| 2019-07-04 | 0.0% | 0.0% | 0.0% | 4.5% | 0.8%
| 2019-07-05 | 0.0% | 0.0% | 0.0% | 4.5% | 0.8%