Part Deux: TextCat A/B test for Language Identification - turn off test
Closed, Resolved (Public)

Description

This is a continuation - with modifications - of a test that started in https://phabricator.wikimedia.org/T121543.

This ticket is to turn off the test after verification that we collected the data we need.

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper.

What does the "Part Deux" prefix in those tasks mean? Could such things be explained please?

@Aklapper - I added it to the title to indicate that this is our second pass at this particular test.

debt triaged this task as Medium priority. Jun 16 2016, 5:26 PM

@EBernhardson The counts aren't great, so we should keep the test on. However, I might have to start analyzing the data we do have today, because I have a conference next week. I can make a preliminary report and then update it next Friday, after we've had an extra week of data. Thoughts?

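| date | enwiki clicks | interwiki clicks |
| --- | --- | --- |
| 2016-06-16 | 7 | 0 |
| 2016-06-17 | 1615 | 15 |
| 2016-06-18 | 1389 | 3 |
| 2016-06-19 | 1462 | 12 |
| 2016-06-20 | 2096 | 3 |
| 2016-06-21 | 1870 | 6 |
| 2016-06-22 | 1754 | 12 |

| date | sessions that clicked on an enwiki result | sessions that clicked on an interwiki result |
| --- | --- | --- |
| 2016-06-16 | 6 | 0 |
| 2016-06-17 | 1196 | 7 |
| 2016-06-18 | 966 | 3 |
| 2016-06-19 | 1036 | 10 |
| 2016-06-20 | 1362 | 3 |
| 2016-06-21 | 1342 | 4 |
| 2016-06-22 | 1326 | 7 |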
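
For a sense of scale, the daily counts above can be totalled with a short sketch like the one below. This is only an illustration: it assumes each row splits into a date, an enwiki count and an interwiki count, and the names `clicks`, `sessions` and `totals` are made up for this example rather than taken from the test's analysis code.

```python
# Daily (enwiki, interwiki) counts, assuming each row of the tables
# splits into date, enwiki count, interwiki count.
clicks = {
    "2016-06-16": (7, 0),
    "2016-06-17": (1615, 15),
    "2016-06-18": (1389, 3),
    "2016-06-19": (1462, 12),
    "2016-06-20": (2096, 3),
    "2016-06-21": (1870, 6),
    "2016-06-22": (1754, 12),
}
sessions = {
    "2016-06-16": (6, 0),
    "2016-06-17": (1196, 7),
    "2016-06-18": (966, 3),
    "2016-06-19": (1036, 10),
    "2016-06-20": (1362, 3),
    "2016-06-21": (1342, 4),
    "2016-06-22": (1326, 7),
}


def totals(rows):
    """Sum the enwiki and interwiki columns across all days."""
    enwiki = sum(e for e, _ in rows.values())
    interwiki = sum(i for _, i in rows.values())
    return enwiki, interwiki


enwiki_clicks, interwiki_clicks = totals(clicks)
enwiki_sessions, interwiki_sessions = totals(sessions)
print("clicks:  ", enwiki_clicks, "enwiki,", interwiki_clicks, "interwiki")
print("sessions:", enwiki_sessions, "enwiki,", interwiki_sessions, "interwiki")
```

Read this way, the seven days add up to 51 interwiki clicks from 34 sessions, against 10193 enwiki clicks from 7234 sessions.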

cc @debt @Deskana @TJones

Seems like we need to keep the test running. Would it make sense to up the sample rate (maybe double or triple if it isn't already too high)?

I don't have any thoughts on doing the analysis early. On the one hand, it's all we'll get before the end of the quarter (unless we can squeak by doing it next Friday). On the other hand, the samples look too small to be useful right now.

I agree, the counts are much too small for a well-informed analysis. I'm for upping the sample rate, or just keeping the test going for another week or so and hoping that we get more useful data to analyze.

@mpopov, if you have time to do a quick analysis of what we have right now and then update it next Friday, that's cool too.

Change 298896 had a related patch set uploaded (by EBernhardson):
Revert "Textcat search satisfaction subtest for multiple wikis"

https://gerrit.wikimedia.org/r/298896

Patch is up. Once it is reviewed and merged, we can have it swatted out.

Change 298896 merged by jenkins-bot:
Revert "Textcat search satisfaction subtest for multiple wikis"

https://gerrit.wikimedia.org/r/298896

Change 299088 had a related patch set uploaded (by EBernhardson):
Revert "Textcat search satisfaction subtest for multiple wikis"

https://gerrit.wikimedia.org/r/299088

Change 299088 merged by jenkins-bot:
Revert "Textcat search satisfaction subtest for multiple wikis"

https://gerrit.wikimedia.org/r/299088

Mentioned in SAL [2016-07-14T23:10:59Z] <ebernhardson@tin> Synchronized php-1.28.0-wmf.10/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T137169: Turn of TextCat A/B test (duration: 00m 34s)