
Interleaved results A/B test: check that data is flowing the way we expect
Closed, ResolvedPublic

Description

Let's verify that the data from this two-way test is coming in as we expect it to.

Event Timeline

debt claimed this task.
debt moved this task from Backlog to Done on the Discovery-Analysis (Current work) board.

This was already done.

The data volume is actually much smaller than expected. At 1:2000 sampling we collect ~15k full-text sessions per day. Sampling was increased to 1:500, with 75% of sampled sessions directed into the test, but the result was still 15k sessions per day for dashboards and only ~600 sessions per day recording events into the test (when it should have been ~45k). It's not yet clear what happened.
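The expected volume can be sanity-checked with a back-of-envelope calculation using only the figures above (all values approximate):

```python
# Back-of-envelope check: expected test sessions per day after the
# sampling increase, using the figures quoted above.

baseline_sessions_per_day = 15_000   # observed at 1:2000 sampling
old_rate = 1 / 2000
new_rate = 1 / 500
test_fraction = 0.75                 # share of sampled sessions sent into the test

# Quadrupling the sampling rate should scale session counts linearly.
expected_total = baseline_sessions_per_day * (new_rate / old_rate)
expected_in_test = expected_total * test_fraction

print(expected_total)    # 60000.0 sessions/day overall
print(expected_in_test)  # 45000.0 expected in the test, vs ~600 observed
```

The ~75x gap between 45k expected and ~600 observed is what the rest of this investigation tries to explain.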

It's not yet clear what's gone wrong here. I've poked at the raw event logging events in the eventlogging-client-side Kafka topic, and the same ratio of events by subTest is there. The webrequest table in Hive shows the same ratio of events by subTest as well. This suggests the events are either not being sent, or are being thrown away very early in the pipeline (unlikely).

The breakdown of events that did get logged, by either OS or browser, does not suggest we are failing on most browsers and only working in specific cases. Something else is going on, but it's really not clear what. Will continue investigating.

Some documentation from event logging is suspicious, but I also think this might not be the problem, because I see events making it through with a payload > 1kB. While our search result page events are > 1kB, other events like 'visitPage' are much smaller, so those should still have come through even if the search result page events were rejected. Also, based on the doc, the events should have been truncated, which would still be detectable, rather than disappearing completely:

There is a limitation of the size of individual EventLogging events due the underlying infrastructure (limited size of urls in Varnish's varnishncsa/ varnishlog, as well as Wikimedia UDP packets). For the purpose of size limitation, an "entry" is a /beacon request URL containing urlencoded JSON-stringified event data. Entries longer than 1014 bytes are truncated. When an entry is truncated, it will fail validation because of parsing (as the result is invalid JSON).
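The size constraint described in that doc is easy to check against our own events. A minimal sketch, assuming the "entry" is the urlencoded JSON-stringified event data (field names here are illustrative, not the real schema):

```python
import json
from urllib.parse import quote

# Entry size limit quoted from the EventLogging docs above.
ENTRY_LIMIT = 1014

def beacon_entry_size(event: dict) -> int:
    """Approximate size of a /beacon entry: urlencoded JSON event data."""
    return len(quote(json.dumps(event)))

# A hypothetical small event, stand-in for something like 'visitPage'.
small_event = {"action": "visitPage", "pageViewId": "abc123"}

# Small events fit comfortably under the limit, so even if large
# search-result-page events were truncated, these should survive.
assert beacon_entry_size(small_event) < ENTRY_LIMIT
```

This supports the argument above: a size limit alone can't explain the small events going missing along with the large ones.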

EBernhardson claimed this task.
EBernhardson moved this task from Done to In progress on the Discovery-Analysis (Current work) board.

Mentioned in SAL (#wikimedia-operations) [2017-08-18T00:18:48Z] <ebernhardson@tin> Synchronized php-1.30.0-wmf.14/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T171213: Increase sampling rate of cirrus satisfaction schema (duration: 00m 44s)

Change 372490 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/extensions/WikimediaEvents@master] Increase enwiki sampling of cirrus to 1k session per day per bucket

https://gerrit.wikimedia.org/r/372490

What went wrong here is that I completely misestimated the event counts by making the incorrect assumption that enwiki made up the majority of logged search sessions. Because we vary our sampling by wiki, enwiki makes up < 2% of the sessions we record.

Initial sampling rate: 1:2000
Sessions collected per day: ~250
Estimated sessions per day: 500,000

Desired sessions per bucket per day: 1000
Number of buckets: 6

Total sessions sampled: 250 + (6*1000) = 6250
New sampling rate = 500000/6250 = 1 in 80
% of sessions going into sub test: 6000/6250 = 0.96
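The arithmetic above can be sketched as a small calculation (figures taken directly from the numbers listed):

```python
# Recomputing the new enwiki sampling rate from the figures above.

current_sessions = 250                       # enwiki sessions/day at 1:2000
estimated_daily = current_sessions * 2000    # implied ~500,000 enwiki sessions/day
per_bucket = 1000                            # desired sessions per bucket per day
buckets = 6

total_needed = current_sessions + buckets * per_bucket   # 6250
new_rate = estimated_daily / total_needed                # 80 -> sample 1 in 80
subtest_share = (buckets * per_bucket) / total_needed    # 0.96

print(new_rate, subtest_share)  # 80.0 0.96
```

Note the estimated 500,000 sessions/day falls directly out of the observed ~250 sessions/day at 1:2000 sampling.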

I'll be deploying this update in a few minutes, after which we should collect 1k events per day per bucket. I have no idea how many we actually need, but it means analysis of the previously collected data could go forward if we decide it's enough.

Change 372490 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] Increase enwiki sampling of cirrus to 1k session per day per bucket

https://gerrit.wikimedia.org/r/372490

Change 372491 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/extensions/WikimediaEvents@wmf/1.30.0-wmf.14] Increase enwiki sampling of cirrus to 1k session per day per bucket

https://gerrit.wikimedia.org/r/372491

Change 372491 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@wmf/1.30.0-wmf.14] Increase enwiki sampling of cirrus to 1k session per day per bucket

https://gerrit.wikimedia.org/r/372491

Mentioned in SAL (#wikimedia-operations) [2017-08-18T00:48:42Z] <ebernhardson@tin> Synchronized php-1.30.0-wmf.14/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T171213: Increase sampling rate of cirrus satisfaction schema (again) to 1k per bucket per day (duration: 00m 44s)