
Turn on page issues A/B test for Latvian Wikipedia, and conduct data checks
Closed, Resolved | Public | 2 Estimated Story Points

Description

Background

We'd like to turn on page issues on Latvian Wikipedia to ensure that our instrumentation works as expected and to uncover any unknown bugs.

Acceptance criteria

  • Configure "MinervaABSamplingRate" (enable the page issues AB test) at the sample rate of 100%

Sample config:

'wgMinervaABSamplingRate' => [
  'default' => 0,
  'lvwiki' => 1,
],

  • Inspect the recorded data and do various plausibility checks
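As a rough illustration of how the sample config is meant to resolve, here is a minimal Python sketch. It assumes the usual per-wiki override pattern, where a wiki-specific key wins over 'default'. The function name resolve_sampling_rate is hypothetical and is not part of MediaWiki.

```python
# Hypothetical helper: mimics a per-wiki setting lookup with a 'default' fallback.
# The values mirror the sample config in the task description.
SAMPLING_RATE = {
    "default": 0,  # A/B test off on every other wiki
    "lvwiki": 1,   # 100% of lvwiki page views enter the test
}


def resolve_sampling_rate(settings, wiki):
    """Return the wiki-specific value if present, otherwise the default."""
    return settings.get(wiki, settings["default"])
```

With this, resolve_sampling_rate(SAMPLING_RATE, "lvwiki") yields the 100% rate, and any wiki without its own key falls back to 0.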

During deploy

  • Make sure the sampling rate is acceptable by monitoring events coming in. Stop deploy if too high.

Developer notes

Fixing bugs is not within the scope of this task. If there are any production-critical bugs, the feature should be turned off and a new deploy/task will be needed.

Event Timeline

ovasileva created this task.
ovasileva removed subscribers: Aklapper, phuedx, Dereckson and 3 others.
Jdlrobson updated the task description. (Show Details)
Jdlrobson updated the task description. (Show Details)

x is to be defined by @Tbayer ?

Jdlrobson removed the point value for this task. Sep 17 2018, 10:25 PM

Hasn't been estimated yet.

x is to be defined by @Tbayer ?

x had been set to 20% (i.e. 10:10:80) a couple of weeks ago in T200792, but @ovasileva and I think 100% couldn't hurt for this initial smaller wiki. I'll update T200792 for consistency.
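To make the "20% (i.e. 10:10:80)" shorthand concrete, here is a small Python sketch. It assumes that sampled users are split evenly between the control and treatment buckets and that everyone else is excluded. The function name bucket_shares is made up for illustration.

```python
def bucket_shares(sampling_rate):
    """Split a sampling rate into (control, treatment, excluded) shares.

    Assumption: sampled users are divided evenly between the two buckets.
    """
    if not 0 <= sampling_rate <= 1:
        raise ValueError("sampling rate must be between 0 and 1")
    in_each_bucket = sampling_rate / 2
    excluded = 1 - sampling_rate
    # Round to hide floating-point noise in the printed shares.
    return (round(in_each_bucket, 4), round(in_each_bucket, 4), round(excluded, 4))
```

Under that assumption a rate of 0.2 gives shares of 10%, 10% and 80%, while the rate of 1 used for lvwiki gives 50%, 50% and nobody excluded.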

Sounds good. Ready for estimation!

Note to SWATer: When doing this, could you also do T203589 ?

Change 461420 had a related patch set uploaded (by Pmiazga; owner: Pmiazga):
[operations/mediawiki-config@master] Enable Page issus A/B test for Latvian wiki

https://gerrit.wikimedia.org/r/461420

Change 461420 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable Page issuses A/B test for Latvian wiki

https://gerrit.wikimedia.org/r/461420

Mentioned in SAL (#wikimedia-operations) [2018-09-19T23:14:57Z] <catrope@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Enable Page issues A/B test on lvwiki (T204609) (duration: 00m 58s)

Latvian is now running the A/B test.
I'm seeing 0.015 events per second. Little impact on ReadingDepth.
@Tbayer can you look at the data coming in and see if there's anything that's not expected?

OK, the pageissues table just materialized in Hadoop with the data from the first hour - 19 events, 14 of which seem to be your test views. Let's wait a bit for the caches...

We have over half a day's worth of data in the table now, including from daytime hours in Latvia, and the caches should have caught up. But the event rate remains surprisingly low (see Grafana and query below): about 1-2 events per minute, whereas lv.m.wikipedia.org currently receives 70-80k views/day (https://stats.wikimedia.org/v2/#/lv.wikipedia.org/reading/total-page-views/normal|bar|1-Month|access~mobile-web ), or around 40-60 views/minute. Maybe content quality is very high on this wiki... (Will do further checks.)

SELECT year, month, day, hour, COUNT(*) AS events FROM event.pageissues WHERE year >0 GROUP BY year, month, day, hour ORDER BY year, month, day, hour LIMIT 10000;

year	month	day	hour	events
2018	9	19	23	19
2018	9	20	0	8
2018	9	20	1	8
2018	9	20	2	11
2018	9	20	3	8
2018	9	20	4	37
2018	9	20	5	50
2018	9	20	6	91
2018	9	20	7	67
2018	9	20	8	66
2018	9	20	9	70
2018	9	20	10	68
2018	9	20	11	67
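The comparison of event rate and view rate can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it uses the hourly counts for hours 6-11 UTC from the table above and the 70-80k daily views quoted earlier. It compares daytime events against a whole-day average of views, and it counts all event types rather than only page loads.

```python
MINUTES_PER_DAY = 24 * 60


def per_minute(daily_total):
    """Convert a daily total into an average per-minute rate."""
    return daily_total / MINUTES_PER_DAY


# Hourly PageIssues event counts for 2018-09-20, hours 6-11 UTC (from the table above).
daytime_events = [91, 67, 66, 70, 68, 67]

events_per_minute = sum(daytime_events) / len(daytime_events) / 60
views_per_minute_low = per_minute(70_000)
views_per_minute_high = per_minute(80_000)

print(f"events/minute: {events_per_minute:.2f}")
print(f"views/minute: {views_per_minute_low:.1f} to {views_per_minute_high:.1f}")
print(
    "share of views with an event: "
    f"{events_per_minute / views_per_minute_high:.1%} to "
    f"{events_per_minute / views_per_minute_low:.1%}"
)
```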

OK, let's look at the article https://lv.m.wikipedia.org/wiki/Filozofija , which currently is tagged as lacking references and receives between 20-80 pageviews per day.

However, it hasn't generated any PageIssues events so far:

SELECT COUNT(*) FROM event.pageissues 
WHERE year >0 
AND event.pageTitle = 'Filozofija';

_c0
0
1 row selected (58.149 seconds)

On the other hand, here is a list of the 100 pages that have generated the most events so far.

Interestingly, the top entry (on Pepin the Short) doesn't contain any visible page issues - rather, it seems there is an ambox class coming from this hidden infobox template: https://lv.wikipedia.org/wiki/Veidne:Infokaste%2B

page	PageIssues events
Pipins Īsais	20
Ropažu sporta centrs	13
2018. gada laikapstākļi Latvijā	12
2018. gads Latvijā	10
Baldones novads	8
Latvijas hokejisti NHL	8
Krīze	7
Muskuļi	7
Tutanhamons	7
Ludvigs van Bēthovens	6
Konflikts	6
Franču buldogs	6
Rihards Vāgners	6
Anglijas pilsoņu karš	5
Okeāniskā Zemes garoza	5
Līderis	5
Brocēnu novads	4
Tegusigalpa	4
Atmiņa	4
Nīflheima	4
Kapitālsabiedrība	4
Dekšāres pagasts	4
Mostara	4
Rihards Kols	4
Hammurapi	4
Obsesīvi kompulsīvie traucējumi	3
Žeimele	3
Darbaspēka migrācija	3
Korāns	3
Hiromantija	3
Aivars Zahārovs	3
Nekromantika	3
Aleksandrs fon Humbolts	3
Smadzeņu satricinājums	3
Viestards	3
Mūspella	3
2017. gada laikapstākļi Latvijā	3
Bipolāri afektīvi traucējumi	3
Bezvadu tīkla piekļuves punkts	3
Kristaps Janičenoks	3
Kalifāts	3
Luijs Filips	3
Autortiesības	3
Kaka (futbolists)	3
Ludzas novads	3
Lietu tiesības	3
Mikroviļņi	3
Miega magone	3
Laimas Muzykanti	3
Ungārijas Karaliste	3
Sultāne Hatidža	3
Transdzimums	3
Sarkanā jūra	3
Makita	3
Anks	3
Opera	3
Skolioze	2
Referendums	2
Serbija un Melnkalne	2
Pūmipons Adunjadēts	2
Aleksandrs Lielais	2
Aksonometriskā projekcija	2
Ilze Pētersone-Godmane	2
Pārkinsona slimība	2
Abrahams Abulafija	2
Hūrons	2
Uldevene	2
Līdzskaņu mija	2
Gēnu inženierija	2
1812. gada karš	2
Google	2
Spānijas inkvizīcija	2
Lietuviešu mitoloģija	2
Gersicania	2
ZDF	2
Poēma	2
Poseidons	2
Lauris Liberts	2
Floids Meivezers	2
Faraons	2
Tiešie nodokļi	2
Epoksīdu polimēri	2
Latvijas Jūras spēki	2
Slāpekļa (III) oksīds	2
Eltons Brends	2
Viena bērna politika	2
Terminoloģija	2
Sabiedrība	2
Krustziežu dzimta	2
Ostapenko	2
Skillet	2
Oktaviāns	2
Dejo ar zvaigzni! 4	2
Krasnodara	2
Skaitītājs	2
Rubas pagasts	2
Obligācija	2
Sindroms	2
Kauguru pagasts	2
Branas pils	2

Data via

SELECT '|',
CONCAT('[[https://lv.m.wikipedia.org/wiki/',event.pageTitle,'|',event.pageTitle,']]') AS page, '|',
COUNT(*) AS events
FROM event.pageissues 
WHERE year >0 
GROUP BY event.pageTitle
ORDER BY events DESC LIMIT 100;

Not sure if this is relevant or not, but ~half the time I load the page I do see the page issue appear, with the treatment that is currently on production. Whereas other times it doesn't present itself at all:

lv.m.wikipedia.org_wiki_Pipins_%C4%AAsais(Pixel 2).png (1×1 px, 863 KB)
lv.m.wikipedia.org_wiki_Pipins_%C4%AAsais(Pixel 2) (1).png (1×1 px, 872 KB)

whereas with other pages on the list, e.g. Ropažu sporta centrs, it alternates between the new and current version as expected:

lv.m.wikipedia.org_wiki_Ropa%C5%BEu_sporta_centrs(Pixel 2) (1).png (1×1 px, 478 KB)
lv.m.wikipedia.org_wiki_Ropa%C5%BEu_sporta_centrs(Pixel 2).png (1×1 px, 490 KB)

In T204609#4602883, @alexhollender wrote:

Not sure if this is relevant or not, but ~half the time I load the page I do see the page issue appear, with the treatment that is currently on production. Whereas other times it doesn't present itself at all:

lv.m.wikipedia.org_wiki_Pipins_%C4%AAsais(Pixel 2).png (1×1 px, 863 KB)
lv.m.wikipedia.org_wiki_Pipins_%C4%AAsais(Pixel 2) (1).png (1×1 px, 872 KB)

whereas with other pages on the list, e.g. Ropažu sporta centrs, it alternates between the new and current version as expected:

lv.m.wikipedia.org_wiki_Ropa%C5%BEu_sporta_centrs(Pixel 2) (1).png (1×1 px, 478 KB)
lv.m.wikipedia.org_wiki_Ropa%C5%BEu_sporta_centrs(Pixel 2).png (1×1 px, 490 KB)

Correct. It's an A/B test so 50% of the time you should be bucketed into the old treatment and 50% into the new treatment.

@Jdlrobson compare the first set of screenshots with the second here: T204609#4602883 — for the first set, 50% of the time I'm getting the old treatment, 50% of the time I'm not getting any page issue showing up at all...

Interestingly, the top entry (on Pepin the Short) doesn't contain any visible page issues - rather, it seems there is an ambox class coming from this hidden infobox template: https://lv.wikipedia.org/wiki/Veidne:Infokaste%2B

Yes, this is interesting. In the old treatment, we create a link unconditionally as the entry point to issues. In the new treatment, because the ambox is hidden (but present in the page), it's impossible to click it, so there is no way to view it. However, we'll log events in both cases as the ambox is present.

We could reveal it in the new treatment by using an !important rule, but this would go against the wishes of editors, so I'm not sure it's a good idea.

.issues-group-B .ambox { display: block !important; }

Not clear what to do in this situation...

OK, let's look at the article https://lv.m.wikipedia.org/wiki/Filozofija , which currently is tagged as lacking references and receives between 20-80 pageviews per day.

However, it hasn't generated any PageIssues events so far:

SELECT COUNT(*) FROM event.pageissues 
WHERE year >0 
AND event.pageTitle = 'Filozofija';

_c0
0
1 row selected (58.149 seconds)

Do we know what user agents those views are? Just want to rule out this being a problem with grade C browsers... (e.g. browsers where we don't run JS)

Interestingly, the top entry (on Pepin the Short) doesn't contain any visible page issues - rather, it seems there is an ambox class coming from this hidden infobox template: https://lv.wikipedia.org/wiki/Veidne:Infokaste%2B

Yes, this is interesting. In the old treatment, we create a link unconditionally as the entry point to issues. In the new treatment, because the ambox is hidden (but present in the page), it's impossible to click it, so there is no way to view it. However, we'll log events in both cases as the ambox is present.

We could reveal it in the new treatment by using an !important rule, but this would go against the wishes of editors, so I'm not sure it's a good idea.

.issues-group-B .ambox { display: block !important; }

Not clear what to do in this situation...

The template itself is hidden. I read this as editors don't want to show it. It's also hidden on desktop: https://lv.wikipedia.org/w/index.php?title=Pipins_%C4%AAsais&mobileaction=toggle_view_desktop

OK, let's look at the article https://lv.m.wikipedia.org/wiki/Filozofija , which currently is tagged as lacking references and receives between 20-80 pageviews per day.

However, it hasn't generated any PageIssues events so far:

SELECT COUNT(*) FROM event.pageissues 
WHERE year >0 
AND event.pageTitle = 'Filozofija';

_c0
0
1 row selected (58.149 seconds)

Do we know what user agents those views are? Just want to rule out this being a problem with grade C browsers... (e.g. browsers where we don't run JS)

Here's a list. BTW, some events have since come in for that page,[1] but there were at least 24 pageviews (mobile web, non-bot) that should have generated events and didn't:

browser	views
Chrome Mobile	11
Mobile Safari	7
Samsung Internet	4
Chrome Mobile iOS	1
Chromium	1

Data via:

SELECT user_agent_map['browser_family'] AS browser, SUM(view_count) AS views
FROM wmf.pageview_hourly
WHERE year = 2018 AND month = 9 AND day = 20 AND hour <= 15
AND page_title = 'Filozofija'
AND project = 'lv.wikipedia'
AND access_method = 'mobile web'
AND agent_type = 'user'
GROUP BY user_agent_map['browser_family']
ORDER BY views DESC LIMIT 100;

[1]
SELECT year, month, day, hour, COUNT(*) FROM event.pageissues 
WHERE year > 0 
AND event.pageTitle = 'Filozofija'
GROUP BY year, month, day, hour
ORDER BY year, month, day, hour LIMIT 10000;

year	month	day	hour	_c4
2018	9	20	16	1
2018	9	20	17	2
2018	9	20	18	2
3 rows selected (60.163 seconds)

Here is the distribution of actions so far. This does not look impossible a priori (although it would mean quite a low issue clickthrough ratio of <=2% in both control and test). So the missing events are not caused by an entire category of actions missing.

action	events
pageLoaded	1341
issueClicked	14
editClicked	9
modalClose	7
modalInternalClicked	2
modalEditClicked	2
modalEditClicked2

Data via

SELECT event.action AS action, COUNT(*) AS events
FROM event.pageissues 
WHERE year >0 
GROUP BY event.action
ORDER BY events DESC LIMIT 10000;
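For reference, the clickthrough bound mentioned above follows directly from the action counts. A minimal Python sketch, with the counts copied from the distribution in this comment (the helper name clickthrough is only illustrative):

```python
# Action counts as reported in the distribution above.
action_counts = {
    "pageLoaded": 1341,
    "issueClicked": 14,
    "editClicked": 9,
    "modalClose": 7,
    "modalInternalClicked": 2,
    "modalEditClicked": 2,
}


def clickthrough(counts, action="issueClicked", base="pageLoaded"):
    """Events of one action type per pageLoaded event."""
    return counts[action] / counts[base]


print(f"issue clickthrough: {clickthrough(action_counts):.2%}")
```

This pools both test groups, so it is an overall figure rather than a per-bucket one.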

Interesting... can we rule out the following?

  • doNotTrack header was present
  • events triggered errors due to uri length and were not processed
  • events haven't made it to Hive yet

I'll think about other reasons in the meantime...

And the distribution of values of the sectionNumbers and issuesSeverity fields looks plausible too - at least there are a lot of different kinds of combinations represented.

events	sectionnumbers
848	[0]
199	[0,0]
46	[0,0,0]
32	[2]
26	[0,0,0,0]
22	[9,10]
21	[1]
19	[3]
18	[10,11]
14	[4]
10	[0,0,0,0,0]
8	[1,2,3,4,7,8,9,10,11,12,13,14,15]
8	[6]
7	[8]
7	[8,9]
6	[5]
4	[11,12]
4	[15]
4	[1,2,3,4,7,8,9,10,11,12,14,15]
4	[6,10,11,12,13,14,16,17,18]
4	[8,10]
3	[1,2,4,7,9,15,16,17,18,19,20]
3	[0,5]
3	[7,9]
3	[7]
3	[2,3,6,9,10,11,12,13,14,15,16]
2	[0,0,0,0,0,0]
2	[11]
2	[0,0,0,5]
2	[10]
2	[13]
2	[0,1,6]
2	[18]
2	[16]
2	[0,2]
2	[1,2]
2	[9]
2	[1,3,4,5,8,9,10,11,12,13,14,15]
2	[1,2,7,8,9,10,11,12,13,14,15]
2	[5,9,10,11,12,13,14,16,17]
1	[23]
1	[15,17]
1	[9,20,21]
1	[8,10,11]
1	[8,9,10,11,12,13,14,16]
1	[7,9,10,11,12,13,14,15,16,18,19]
1	[7,9,10]
1	[3,8,9]
1	[3,4,7,9,10]
1	[3,4,7,8,10,11,12,13,14,15,16]
1	[2,2]
1	[1,3,4,7,8,9,10,11,12,14,15]
1	[1,3,4,7,8,9,10,11,12,13,14,15]
1	[1,2,3,4,7,8,9,10,11,14,15]
1	[1,1]
1	[0,5,6]
1	[0,3]
1	[0,1]
1	[0,0,3]
1	[0,0,2]
1	[3,4]

events	issuesSeverity
535	["LOW"]
326	["MEDIUM"]
113	["DEFAULT"]
83	["LOW","LOW"]
65	["MEDIUM","MEDIUM"]
42	["LOW","MEDIUM"]
39	["DEFAULT","MEDIUM"]
26	["MEDIUM","LOW"]
24	["LOW","LOW","LOW"]
12	["MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM"]
12	["LOW","MEDIUM","LOW","MEDIUM"]
8	["MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM"]
7	["DEFAULT","LOW"]
7	["MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM"]
7	["LOW","LOW","MEDIUM"]
6	["LOW","MEDIUM","LOW"]
6	["MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM"]
5	["MEDIUM","MEDIUM","MEDIUM"]
5	["LOW","DEFAULT"]
4	["MEDIUM","LOW","LOW","LOW"]
4	["LOW","LOW","LOW","LOW"]
4	["LOW","LOW","MEDIUM","MEDIUM","LOW"]
3	["LOW","MEDIUM","LOW","LOW"]
3	["MEDIUM","DEFAULT"]
3	["MEDIUM","LOW","LOW"]
2	["LOW","MEDIUM","LOW","LOW","MEDIUM","LOW"]
2	["LOW","LOW","LOW","MEDIUM","LOW"]
2	["LOW","LOW","LOW","MEDIUM"]
2	["LOW","LOW","DEFAULT"]
2	["LOW","MEDIUM","LOW","MEDIUM","LOW"]
2	["MEDIUM","LOW","MEDIUM"]
2	["DEFAULT","LOW","MEDIUM"]
2	["MEDIUM","DEFAULT","MEDIUM","LOW"]
1	["LOW","DEFAULT","LOW"]
1	["MEDIUM","MEDIUM","LOW","LOW"]
1	["LOW","MEDIUM","MEDIUM","LOW","LOW"]
1	["DEFAULT","DEFAULT","DEFAULT"]
1	["DEFAULT","DEFAULT"]
1	["MEDIUM","MEDIUM","LOW"]
1	["MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM"]
1	["DEFAULT","LOW","LOW"]
1	["MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM","MEDIUM"]
1	["LOW","DEFAULT","LOW","LOW","LOW"]

Data via

SELECT COUNT(*) AS events, event.sectionNumbers AS sectionNumbers
FROM event.pageissues 
WHERE year >0 
GROUP BY event.sectionNumbers
ORDER BY events DESC LIMIT 100;

SELECT COUNT(*) AS events, event.issuesSeverity AS issuesSeverity
FROM event.pageissues 
WHERE year >0 
GROUP BY event.issuesSeverity
ORDER BY events DESC LIMIT 100;

No red flags in issuesVersion, isAnon, and namespaceId either.

SELECT event.issuesVersion AS issuesVersion, COUNT(*) AS events
FROM event.pageissues 
WHERE year >0 
GROUP BY event.issuesVersion;

issuesversion	events
new2018	791
old	591
2 rows selected (20.221 seconds)


SELECT event.isAnon AS isAnon, COUNT(*) AS events
FROM event.pageissues 
WHERE year >0 
GROUP BY event.isAnon;

isanon	events
false	27
true	1355
2 rows selected (20.227 seconds)


SELECT event.namespaceId AS namespaceId, COUNT(*) AS events
FROM event.pageissues 
WHERE year >0 
GROUP BY event.namespaceId;

namespaceid	events
0	1382
1 row selected (34.209 seconds)
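One simple extra plausibility check on these three breakdowns is that each should account for the same total number of events, since every event carries all three fields. A short Python sketch with the counts copied from the query results above (the helper names are made up for illustration):

```python
# Counts copied from the three query results above.
breakdowns = {
    "issuesVersion": {"new2018": 791, "old": 591},
    "isAnon": {"false": 27, "true": 1355},
    "namespaceId": {"0": 1382},
}


def totals(tables):
    """Sum the event counts of each breakdown."""
    return {field: sum(counts.values()) for field, counts in tables.items()}


def consistent(tables):
    """True if every breakdown adds up to the same number of events."""
    return len(set(totals(tables).values())) == 1


print(totals(breakdowns), consistent(breakdowns))
```

A mismatch would hint at NULL values in one of the fields or at the queries having run against different snapshots of the table.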

Could you provide some example pages which have DEFAULT priority? It would be useful to verify they are behaving as expected as they were difficult to test...

Could you provide some example pages which have DEFAULT priority? It would be useful to verify they are behaving as expected as they were difficult to test...

Here is a list for all events where the first element of issuesSeverity was 'DEFAULT' (also split out by action):

page	action	events
13. Saeimas vēlēšanas	pageLoaded	1
2018. gada laikapstākļi Latvijā	editClicked	4
2018. gada laikapstākļi Latvijā	issueClicked	2
2018. gada laikapstākļi Latvijā	modalClose	2
2018. gada laikapstākļi Latvijā	pageLoaded	33
2018. gads Latvijā	pageLoaded	18
2018. gads	pageLoaded	4
2020. gada Pasaules čempionāts hokejā	pageLoaded	2
2021. gada Pasaules čempionāts hokejā	pageLoaded	6
2022. gada Pasaules čempionāts hokejā	pageLoaded	5
2022. gada ziemas olimpiskās spēles	pageLoaded	2
2023. gada Pasaules čempionāts hokejā	editClicked	1
2023. gada Pasaules čempionāts hokejā	pageLoaded	3
Albīnisms	pageLoaded	3
Alkoholiskais hepatīts	pageLoaded	1
Anoreksija	pageLoaded	2
Antifosfolipīdu sindroms	pageLoaded	1
Apendicīts	pageLoaded	1
Ateroskleroze	pageLoaded	1
Bipolāri afektīvi traucējumi	pageLoaded	3
Dejas simbolika	pageLoaded	1
Depresija	pageLoaded	2
Epidēmiskais parotīts	pageLoaded	2
Etilēnglikols	pageLoaded	1
Fosfors	pageLoaded	1
Furnjē gangrēna	pageLoaded	2
Gijona kanāla sindroms	pageLoaded	1
Hepatīts	pageLoaded	1
Hipoglikēmija	pageLoaded	1
Hipoksija	pageLoaded	1
Hodžkina limfoma	pageLoaded	3
Hroma (VI) oksīds	pageLoaded	1
Iedzimtās sirds slimības	pageLoaded	1
Karpālā kanāla sindroms	pageLoaded	3
Klepus	pageLoaded	2
Kontūzija	pageLoaded	1
Latvijas armija	pageLoaded	1
Latvijas upju uzskaitījums	pageLoaded	1
Masaliņas	pageLoaded	3
Nistagms	pageLoaded	1
Obsesīvi kompulsīvie traucējumi	pageLoaded	7
Osteoporoze	pageLoaded	2
Plutonijs	pageLoaded	1
Poliomielīts	pageLoaded	1
Putnu gripa	pageLoaded	1
Pārkinsona slimība	pageLoaded	3
Sirds slimības	pageLoaded	1
Skolioze	pageLoaded	3
Slāpekļa (III) oksīds	pageLoaded	2
Slāpekļskābe	pageLoaded	1
Slāpekļūdeņražskābe	pageLoaded	1
Smadzeņu satricinājums	pageLoaded	4
Sociālā fobija	pageLoaded	1
Spēka moments	pageLoaded	1
Starptautiskā kosmosa stacija	pageLoaded	1
Sīrijas pilsoņu karš	pageLoaded	1
Tuberkuloze	pageLoaded	1
Urāna heksafluorīds	pageLoaded	1
Verdzība Romas impērijā	pageLoaded	2
Visu svēto diena	pageLoaded	1
Zobu puve	pageLoaded	1
Ērču encefalīts	pageLoaded	1
Šigeloze	pageLoaded	1

Data via

SELECT '|',
CONCAT('[[https://lv.m.wikipedia.org/wiki/',event.pageTitle,'|',event.pageTitle,']]') AS page, '|',
event.action AS action, '|',COUNT(*) AS events, '|'
FROM event.pageissues 
WHERE year >0 
AND event.issuesSeverity[0] = 'DEFAULT'
GROUP BY event.pagetitle, event.action 
ORDER BY page, action LIMIT 10000;

Interesting... can we rule out the following?

  • doNotTrack header was present

Yes, that can be ruled out. Compare the PageIssues event rate from T204609#4601701 (or [2] below) with e.g. the print button event rate of the Print schema (lvwiki, Minerva, sampled at 10%).[1]

  • events triggered errors due to uri length and were not processed

Can be ruled out. That was a very rare occurrence even in T196904 (where the event query string contained a page title / URL twice, and we only have one page title field here). Besides, it wouldn't explain the inconsistent logging for the same page in the https://lv.m.wikipedia.org/wiki/Filozofija example.

  • events haven't made it to Hive yet

Super unlikely. (Other schemas, e.g. Print [1], don't seem to be seeing such a delay. And repeating the query from T204609#4601701 >13h later doesn't show any retroactive increases in the events logged.[2])

I'll think about other reasons in the meantime...

[1]
SELECT year, month, day, hour, 10 * COUNT(*) AS print_buttons_shown 
FROM event.print 
WHERE year = 2018 AND month = 9 AND day = 20 
AND wiki ='lvwiki'
AND event.skin = 'minerva'
AND event.action = 'shownPrintButton'
GROUP BY year, month, day, hour 
ORDER BY year, month, day, hour LIMIT 10000;

year	month	day	hour	print_buttons_shown
2018	9	20	0	240
2018	9	20	1	20
2018	9	20	2	110
2018	9	20	3	240
2018	9	20	4	690
2018	9	20	5	1770
2018	9	20	6	2790
2018	9	20	7	2180
2018	9	20	8	2270
2018	9	20	9	2120
2018	9	20	10	2970
2018	9	20	11	2040
2018	9	20	12	1720
2018	9	20	13	1470
2018	9	20	14	1610
2018	9	20	15	2510
2018	9	20	16	2860
2018	9	20	17	3010
2018	9	20	18	3050
2018	9	20	19	2040
2018	9	20	20	1380
2018	9	20	21	760
2018	9	20	22	130
2018	9	20	23	110

[2]
SELECT year, month, day, hour, COUNT(*) AS events FROM event.pageissues WHERE year >0 GROUP BY year, month, day, hour ORDER BY year, month, day, hour LIMIT 10000

year	month	day	hour	events
2018	9	19	23	19
2018	9	20	0	8
2018	9	20	1	8
2018	9	20	2	11
2018	9	20	3	8
2018	9	20	4	37
2018	9	20	5	50
2018	9	20	6	91
2018	9	20	7	67
2018	9	20	8	66
2018	9	20	9	70
2018	9	20	10	68
2018	9	20	11	67
2018	9	20	12	55
2018	9	20	13	43
2018	9	20	14	72
2018	9	20	15	72
2018	9	20	16	77
2018	9	20	17	119
2018	9	20	18	119
2018	9	20	19	114
2018	9	20	20	64
2018	9	20	21	37
2018	9	20	22	18
2018	9	20	23	15
2018	9	21	0	7
2018	9	21	1	4
27 rows selected (46.251 seconds)

I'll think about other reasons in the meantime...

@ovasileva asked if caching might be the cause. I think it's unlikely, but it is theoretically possible that the devices cache more aggressively than I realised. The rate of events does look to be increasing today (according to grafana there are more today than yesterday), so let's keep an eye on this in case we need to consider this.

Here is a look at the ratio of pageLoaded events from the PageIssues schema to all applicable views on lvwiki (more precisely, pageviews on the mobile web domain to mainspace pages, excluding spider views).

For an (imperfect) comparison, per T201123#4607488 around 10% of articles on lvwiki have Ambox issues.

I would say that a ratio of around 1-2%, as observed for much of yesterday (Sep 20), is implausibly low, even if one assumes that pages with issues are less popular. It does seem to have increased today though, so perhaps we need to check our assumptions about caching.

The next step might be to limit this to a known set of pages with issues (like Filozofija above), for a more direct comparison.

date & hour (UTC)	pageLoaded events	all mobile mainspace views	ratio
2018-09-20 00h	8	218	0.0367
2018-09-20 01h	8	216	0.037
2018-09-20 02h	11	214	0.0514
2018-09-20 03h	8	524	0.0153
2018-09-20 04h	34	1442	0.0236
2018-09-20 05h	50	3248	0.0154
2018-09-20 06h	87	4521	0.0192
2018-09-20 07h	67	4789	0.014
2018-09-20 08h	65	4787	0.0136
2018-09-20 09h	70	4411	0.0159
2018-09-20 10h	68	3933	0.0173
2018-09-20 11h	66	3867	0.0171
2018-09-20 12h	55	3052	0.018
2018-09-20 13h	38	2909	0.0131
2018-09-20 14h	72	3389	0.0212
2018-09-20 15h	72	3514	0.0205
2018-09-20 16h	77	4071	0.0189
2018-09-20 17h	118	4918	0.024
2018-09-20 18h	113	4849	0.0233
2018-09-20 19h	111	3754	0.0296
2018-09-20 20h	63	2205	0.0286
2018-09-20 21h	37	1233	0.03
2018-09-20 22h	18	541	0.0333
2018-09-20 23h	13	370	0.0351
2018-09-21 00h	7	259	0.027
2018-09-21 01h	4	265	0.0151
2018-09-21 02h	6	229	0.0262
2018-09-21 03h	18	525	0.0343
2018-09-21 04h	54	1443	0.0374
2018-09-21 05h	108	3252	0.0332
2018-09-21 06h	131	4627	0.0283
2018-09-21 07h	116	4430	0.0262
2018-09-21 08h	106	4247	0.025
2018-09-21 09h	118	3617	0.0326
2018-09-21 10h	112	3186	0.0352
2018-09-21 11h	87	2712	0.0321
2018-09-21 12h	114	2617	0.0436
2018-09-21 13h	94	2175	0.0432
2018-09-21 14h	96	2179	0.0441
2018-09-21 15h	77	2283	0.0337
2018-09-21 16h	104	2488	0.0418
2018-09-21 17h	167	2861	0.0584
2018-09-21 18h	223	3180	0.0701
2018-09-21 19h	202	3186	0.0634
2018-09-21 20h	126	2103	0.0599

Data via


SELECT CONCAT(views_list.year,'-',
  LPAD(views_list.month,2,'0'),'-',
  LPAD(views_list.day,2,'0'),' ',
  LPAD(views_list.hour,2,'0'),'h') AS datehour, 
pageloaded_count_list.pageloaded_events AS pageloaded_events, 
views_list.views AS all_pageviews,
ROUND(pageloaded_count_list.pageloaded_events/views_list.views,4) AS ratio 
FROM (
  SELECT year, month, day, hour, COUNT(*) AS pageloaded_events 
  FROM event.pageissues 
  WHERE year = 2018 AND month = 9 AND day >= 20 
  AND event.action = 'pageLoaded'
  GROUP BY year, month, day, hour) AS pageloaded_count_list
JOIN (
  SELECT year, month, day, hour, SUM(view_count) AS views
  FROM wmf.pageview_hourly
  WHERE year = 2018 AND month = 9 AND day >= 20
  AND project = 'lv.wikipedia'
  AND access_method = 'mobile web'
  AND agent_type = 'user'
  AND namespace_id = 0
  GROUP BY year, month, day, hour) AS views_list
ON 
  pageloaded_count_list.hour = views_list.hour AND 
  pageloaded_count_list.day = views_list.day AND 
  pageloaded_count_list.month = views_list.month AND 
  pageloaded_count_list.year = views_list.year 
ORDER BY datehour LIMIT 10000;
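For readers without Hive access, the logic of the query above (an inner join of two hourly aggregates, followed by a rounded ratio) can be sketched in plain Python. The three data rows are copied from the results table in this comment; the function name hourly_ratios is hypothetical.

```python
# (year, month, day, hour) -> count; rows copied from the results table above.
pageloaded_events = {
    (2018, 9, 20, 0): 8,
    (2018, 9, 20, 17): 118,
    (2018, 9, 21, 18): 223,
}
views = {
    (2018, 9, 20, 0): 218,
    (2018, 9, 20, 17): 4918,
    (2018, 9, 21, 18): 3180,
}


def hourly_ratios(events, views):
    """Inner-join two hourly aggregates and compute events per view."""
    rows = []
    # Only hours present in both inputs survive, as with the SQL JOIN.
    for key in sorted(events.keys() & views.keys()):
        year, month, day, hour = key
        label = f"{year}-{month:02d}-{day:02d} {hour:02d}h"
        ratio = round(events[key] / views[key], 4)
        rows.append((label, events[key], views[key], ratio))
    return rows


for row in hourly_ratios(pageloaded_events, views):
    print(*row, sep="\t")
```

Because it is an inner join, an hour with views but no logged events drops out of the result entirely instead of showing up with a ratio of 0, which is worth keeping in mind when reading the table.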

For the record: @Jdlrobson has found the likely reason for the initially low event rate ("Minerva A/B tests are not subject to HTML caching time. Config added inside SkinMinerva is subject to the rules of HTML caching and can take several days ..."). The fix is being worked on at T205355: A/B config flag should be subject to ResourceLoader caching rules not HTML caching rules

Indeed, the rates since the weekend look more plausible (if still below the roughly 10% estimated above):

date	pagetokens	all_pageviews	ratio
2018-09-20	1329	66975	0.0198
2018-09-21	2299	54692	0.042
2018-09-22	3422	47903	0.0714
2018-09-23	4633	55673	0.0832
2018-09-24	5545	75712	0.0732
2018-09-25	49	769	0.0637

Data via

SELECT CONCAT(views_list.year,'-',
  LPAD(views_list.month,2,'0'),'-',
  LPAD(views_list.day,2,'0'),' ') AS date, 
pagetoken_count_list.pagetoken_counts  AS pagetokens, 
views_list.views AS all_pageviews,
ROUND(pagetoken_count_list.pagetoken_counts/views_list.views,4) AS ratio 
FROM (
  SELECT year, month, day, COUNT(DISTINCT event.pagetoken) AS pagetoken_counts 
  FROM event.pageissues 
  WHERE year = 2018 AND month = 9 AND day >= 20 
  GROUP BY year, month, day) AS pagetoken_count_list
JOIN (
  SELECT year, month, day, SUM(view_count) AS views
  FROM wmf.pageview_hourly
  WHERE year = 2018 AND month = 9 AND day >= 20
  AND project = 'lv.wikipedia'
  AND access_method = 'mobile web'
  AND agent_type = 'user'
  AND namespace_id = 0
  GROUP BY year, month, day) AS views_list
ON 
  pagetoken_count_list.day = views_list.day AND 
  pagetoken_count_list.month = views_list.month AND 
  pagetoken_count_list.year = views_list.year 
ORDER BY date LIMIT 10000;

Tbayer renamed this task from Turn on page issues A/B test for Latvian wikipedia to Turn on page issues A/B test for Latvian Wikipedia, and conduct data checks. Sep 28 2018, 12:12 AM
Tbayer updated the task description. (Show Details)

Re-checking the ratio of pageloaded events per pageview after the fix for T205355 has been deployed:
This looks much more plausible now than earlier (T204609#4607546), with rates around the estimated 10%.

datehour	pageloaded_events	all_pageviews	ratio
2018-09-25 00h	24	559	0.0429
2018-09-25 01h	25	210	0.119
2018-09-25 02h	13	329	0.0395
2018-09-25 03h	44	538	0.0818
2018-09-25 04h	142	1444	0.0983
2018-09-25 05h	249	3118	0.0799
2018-09-25 06h	407	4734	0.086
2018-09-25 07h	414	4933	0.0839
2018-09-25 08h	561	4819	0.1164
2018-09-25 09h	470	5065	0.0928
2018-09-25 10h	487	5572	0.0874
2018-09-25 11h	398	4464	0.0892
2018-09-25 12h	384	3823	0.1004
2018-09-25 13h	236	3425	0.0689
2018-09-25 14h	285	3847	0.0741
2018-09-25 15h	348	4571	0.0761
2018-09-25 16h	354	5536	0.0639
2018-09-25 17h	498	6669	0.0747
2018-09-25 18h	454	6165	0.0736
2018-09-25 19h	375	4611	0.0813
2018-09-25 20h	201	2818	0.0713
2018-09-25 21h	109	1459	0.0747
2018-09-25 22h	58	596	0.0973
2018-09-25 23h	30	401	0.0748
2018-09-26 00h	18	207	0.087
2018-09-26 01h	26	234	0.1111
2018-09-26 02h	21	282	0.0745
2018-09-26 03h	44	639	0.0689
2018-09-26 04h	127	1484	0.0856
2018-09-26 05h	328	3417	0.096
2018-09-26 06h	372	4504	0.0826
2018-09-26 07h	431	5507	0.0783
2018-09-26 08h	454	4962	0.0915
2018-09-26 09h	445	4555	0.0977
2018-09-26 10h	497	5359	0.0927
2018-09-26 11h	484	4684	0.1033
2018-09-26 12h	340	3897	0.0872
2018-09-26 13h	351	3663	0.0958
2018-09-26 14h	317	3712	0.0854
2018-09-26 15h	469	5040	0.0931
2018-09-26 16h	571	5724	0.0998
2018-09-26 17h	627	6305	0.0994
2018-09-26 18h	577	5898	0.0978
2018-09-26 19h	447	4463	0.1002
2018-09-26 20h	298	2720	0.1096
2018-09-26 21h	130	1227	0.1059
2018-09-26 22h	48	626	0.0767
2018-09-26 23h	41	348	0.1178
2018-09-27 00h	23	263	0.0875
2018-09-27 01h	16	231	0.0693
2018-09-27 02h	23	262	0.0878
2018-09-27 03h	33	507	0.0651
2018-09-27 04h	161	1397	0.1152
2018-09-27 05h	258	3145	0.082
2018-09-27 06h	429	4827	0.0889
2018-09-27 07h	521	5005	0.1041
2018-09-27 08h	431	4866	0.0886
2018-09-27 09h	418	5008	0.0835
2018-09-27 10h	550	5146	0.1069
2018-09-27 11h	459	4128	0.1112
2018-09-27 12h	311	3142	0.099
2018-09-27 13h	304	3388	0.0897
2018-09-27 14h	311	3261	0.0954
2018-09-27 15h	380	3765	0.1009
2018-09-27 16h	422	4523	0.0933
2018-09-27 17h	470	5482	0.0857
2018-09-27 18h	557	5273	0.1056
2018-09-27 19h	384	4050	0.0948
2018-09-27 20h	213	2391	0.0891
2018-09-27 21h	179	1303	0.1374
2018-09-27 22h	59	509	0.1159

Here is the distribution of actions so far. This does not look impossible a priori (although it would mean quite a low issue clickthrough ratio of <=2% in both control and test). [...]

After some other checks that looked fine (will post the detailed results here), I happened to look at the frequency of action types again.[1]

It turns out that the issue clickthrough rate is now even lower than in the above initial check - even below the rate of (generic) edit button clicks, which is quite surprising. Concretely, e.g. 0.40% in test and 0.17% in control for top (page-level) issues.[2] (Perhaps one reason why the ratio was higher than normal in T204609#4604633 was that that initial check only covered a relatively small number of events where our own testing with deliberate issue-clicking was still impacting the clickthrough ratio.)

This rate is similarly low across browsers[3] (i.e. it's not a bug where one major browser fails to send issueClicked events). And on the other hand we hand-tested earlier in QA that issueClicked events were being sent correctly for several browser/OS combinations. So it seems that this is real and that people may in fact be tapping on the issues notice even less often than on the generic edit button, even though we might have expected otherwise, based on the received wisdom that consumption actions (reading) are generally orders of magnitude more frequent than contribution actions (editing).

I.e. the above observation should not hold up the launch of the full A/B test planned for today (T200792), although it may affect some of the sample sizes needed (in the earlier sample size estimate at T200792#4489268 I had assumed that editClicked would be less frequent than issueClicked).

[1]
SELECT event.action AS action, COUNT(*) AS events
FROM event.pageissues 
WHERE year = 2018 AND month = 9 AND day >= 27
GROUP BY event.action
ORDER BY events DESC LIMIT 10000;

action	events
pageLoaded	25710
editClicked	109
issueClicked	75
modalClose	18
modalEditClicked	4
modalInternalClicked	2
6 rows selected (696.26 seconds)

[2]
SELECT event.issuesVersion AS version, 
SUM(IF(event.action = 'issueClicked', 1, 0)) / SUM(IF(event.action = 'pageLoaded', 1, 0)) AS issuesclickratio,
SUM(IF(event.action = 'pageLoaded', 1, 0)) AS pageloaded_events
FROM event.pageissues 
WHERE year = 2018 AND month = 9 AND day >= 27
AND event.sectionnumbers[0] = 0
GROUP BY event.issuesVersion;

version	issuesclickratio	pageloaded_events
new2018	0.003957528957528957	10360
old	0.001713796058269066	10503
2 rows selected (24.093 seconds)

[3]
SELECT useragent.browser_family AS browser, 
SUM(IF(event.action = 'issueClicked', 1, 0)) / SUM(IF(event.action = 'pageLoaded', 1, 0)) AS issuesclickratio,
SUM(IF(event.action = 'pageLoaded', 1, 0)) AS pageloaded_events
FROM event.pageissues 
WHERE year = 2018 AND month = 9 AND day >= 27
GROUP BY useragent.browser_family
ORDER BY pageloaded_events DESC LIMIT 50;

browser	issuesclickratio	pageloaded_events
Chrome Mobile	0.0025455796353629173	14535
Mobile Safari	0.002388263391334016	5862
Samsung Internet	0.00358719646799117	3624
Chrome	0.008893280632411068	1012
Android	0.0	116
Chrome Mobile iOS	0.010416666666666666	96
BingPreview	0.0	91
Firefox Mobile	0.0	90
Chrome Mobile WebView	0.0	67
Facebook	0.0	59
Opera Mobile	0.0	51
[...]
24 rows selected (44.168 seconds)
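The per-version ratios in [2] are easier to judge next to the absolute click counts behind them. A small Python sketch that recovers those counts from the reported ratio and pageLoaded totals (values copied from the query output in [2]; the helper names are only illustrative):

```python
# version -> (issuesclickratio, pageloaded_events), copied from the output of [2].
per_version = {
    "new2018": (0.003957528957528957, 10360),
    "old": (0.001713796058269066, 10503),
}


def implied_clicks(ratio, pageloaded):
    """Recover the issueClicked count from the ratio and its denominator."""
    return round(ratio * pageloaded)


def as_percent(ratio):
    """Format a ratio as a percentage with two decimals."""
    return f"{ratio:.2%}"


for version, (ratio, pageloaded) in per_version.items():
    print(version, as_percent(ratio), implied_clicks(ratio, pageloaded))
```

The absolute counts are a few dozen clicks per bucket, which is part of why the sample sizes for the full test may need revisiting.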

@Tbayer is going to run one more data check and then this will be resolved.

So about the rest of the result I had mentioned above that looked fine:

At the beginning of a pageview, the instrumentation is supposed to send a pageLoaded event to both the PageIssues and ReadingDepth schemas. Because ReadingDepth is limited to (in particular) browsers that support the Page Visibility API, we expect the ReadingDepth event to be missing in some cases. But on the other hand, a ReadingDepth pageLoaded event should always be accompanied by a PageIssues pageLoaded event in the logged data.

This was confirmed in this check: More than 99% of pageviews recorded (via either schema) had the PageIssues event sent. 24% had the ReadingDepth event missing, but that looked like it could plausibly come from browsers not supporting Page Visibility. That's why we declared this as checked last week and went ahead with T200792: [EPIC] Run A/B test on page issues (Farsi, Japanese, Russian, English).

However, later last week I got an uneasy feeling and took a closer look at this data per browser. Unfortunately, the missing ReadingDepth events do not seem to be confined to those browsers. This means that the problem we had observed for Safari in QA seems more widespread than assumed. I have reopened the corresponding task and posted the new per-browser data there: T204143: ReadingDepth events are not being sent in browsers where navigator.sendBeacon should be supported but in practice isn't

SET hive.mapred.mode=nonstrict;
SELECT
ROUND(100*SUM(IF((pi.pageToken IS NOT NULL) AND (rd.pageToken IS NOT NULL),1,0))/SUM(1),2) AS both , 
ROUND(100*SUM(IF((pi.pageToken IS NOT NULL) AND (rd.pageToken IS NULL),1,0))/SUM(1),2) AS only_pi, 
ROUND(100*SUM(IF((pi.pageToken IS NULL) AND (rd.pageToken IS NOT NULL),1,0))/SUM(1),2) AS only_rd, 
SUM(1) AS all
FROM (
  SELECT event.pageToken AS pageToken
  FROM event.pageissues 
  WHERE year = 2018 AND month = 9 AND day >= 25
  AND event.action = 'pageLoaded') AS pi
FULL OUTER JOIN (
  SELECT event.pageToken AS pageToken
  FROM event.readingdepth
  WHERE year = 2018 AND month = 9 AND day >= 25
  AND event.action = 'pageLoaded'
  AND ( event.page_issues_a_sample OR event.page_issues_b_sample )) AS rd
ON pi.pageToken = rd.PageToken;

both	only_pi	only_rd	all
75.32	24.1	0.58	39978