The data collected by the completion suggester experiment doesn't make sense and is all over the place. Starting on Sept 10, the TestSearchSatisfaction2 and CompletionSuggestion experiments started seeing what should be unique 64-bit numbers coming from multiple IP addresses. Figure out why this is happening and fix it.
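One way to confirm the symptom is to group the logged events by their supposedly unique identifier and list every identifier that shows up with more than one IP address. The following is a minimal sketch, not the actual analysis code: the `token` field name, the `ids_with_multiple_ips` helper, and the sample rows are all made up for illustration; only `clientIp` is named in this task.

```python
from collections import defaultdict


def ids_with_multiple_ips(events, id_field="token", ip_field="clientIp"):
    """Return {identifier: set of IPs} for identifiers seen from more than one IP.

    `id_field` is a hypothetical field name; adjust it to the real schema.
    """
    ips_by_id = defaultdict(set)
    for event in events:
        ips_by_id[event[id_field]].add(event[ip_field])
    # Keep only identifiers that should be unique but appear with several IPs.
    return {ident: ips for ident, ips in ips_by_id.items() if len(ips) > 1}


# Made-up sample rows using documentation-range IP addresses.
sample_events = [
    {"token": 1001, "clientIp": "192.0.2.10"},
    {"token": 1001, "clientIp": "198.51.100.7"},
    {"token": 1002, "clientIp": "192.0.2.10"},
    {"token": 1002, "clientIp": "192.0.2.10"},
]

suspicious = ids_with_multiple_ips(sample_events)
for ident, ips in sorted(suspicious.items()):
    print(ident, sorted(ips))
```

In this sample only identifier 1001 is flagged, since 1002 always arrives from the same address.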
Mentioned In:
- T111858: Analyze results of A/B test on suggester (on or after 2015-09-22)
- T111857: Verify that data from A/B test on suggester is coming through correctly (on or after 2015-09-09)
- rEWMV6e1c2d109135: Update CompletionSuggestion bucket selection
- rMWfc2c957be302: Updated mediawiki/core Project: mediawiki/extensions/WikimediaEvents…
- rEWMVdda2eae04c5b: Update CompletionSuggestion bucket selection
- rMEXT3a59479106d9: Updated mediawiki/extensions Project: mediawiki/extensions/WikimediaEvents…
Looks like Dan found the issue today and reported it at https://lists.wikimedia.org/pipermail/analytics/2015-September/004285.html
So this basically means we need to throw away the clientIp information until this can be fixed.
Can we get useful test data without this value? We can still correlate events by the same user on the same page; we just can't correlate them across pages (but chances are they won't be opted into the test more than once).
Are there other oddities in the data we can't explain?
Update: we discussed this on IRC and arrived at the conclusion that we can assume relative independence of sets of events. Which is to say, given our low sampling rates, we are not likely to see logs of multiple sessions from the same user.
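To put a rough number on that independence assumption, here is a small sketch of the arithmetic. It assumes (this is not stated in the task) that each of a user's `k` sessions is sampled independently with the same probability `p`, so the number of sampled sessions is binomial. The function name and the rates passed to it are hypothetical, not the real sampling configuration.

```python
def prob_sampled_more_than_once(p, k):
    """Probability that a user with k sessions is sampled in two or more of them.

    Assumes each session is sampled independently with probability p
    (a simplifying assumption, not the documented bucketing logic).
    """
    if k < 2:
        return 0.0  # With fewer than two sessions there can be no repeat.
    p_zero = (1 - p) ** k                # sampled in no session
    p_one = k * p * (1 - p) ** (k - 1)   # sampled in exactly one session
    return 1 - p_zero - p_one


# Hypothetical inputs: a 1-in-1000 sampling rate and 20 sessions per user.
print(prob_sampled_more_than_once(0.001, 20))
```

Under this model the repeat probability shrinks roughly with the square of the sampling rate, which is why low rates make repeated users rare.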
To be valid, I think we have to start the test over as of when the adjusted schema was deployed today. A few changes were made to bucketing (which will also help on other tests going forward), so the data from here on isn't directly comparable with the earlier data. Maybe? Not entirely confident, but putting it out there.