This is a continuation - with modifications - of a test that started in https://phabricator.wikimedia.org/T121543.
This ticket is to turn off the test after verification that we collected the data we need.
Status | Subtype | Assigned | Task |
---|---|---|---|
Resolved | | mpopov | T137170 Part Deux: TextCat A/B test for Language Identification - analysis of results |
Resolved | | EBernhardson | T137169 Part Deux: TextCat A/B test for Language Identification - turn off test |
Resolved | | mpopov | T137168 Part Deux: TextCat A/B test for Language Identification - ensure test is going well |
Resolved | | EBernhardson | T137167 Part Deux: TextCat A/B test for Language Identification - create and deploy |
Resolved | | EBernhardson | T137163 Part Deux: TextCat A/B test for Language Identification - specification |
What does the "Part Deux" prefix in those tasks mean? Could such things be explained please?
@Aklapper - I added it to the title to indicate that this is our second pass at this particular test.
@EBernhardson The counts aren't great, so we should keep the test on. However, I might have to start analyzing the data we do have today, because I have a conference next week. I can make a preliminary report and then update it next Friday, after we've had an extra week of data. Thoughts?
date | enwiki clicks | interwiki clicks |
---|---|---|
2016-06-16 | 7 | 0 |
2016-06-17 | 1615 | 15 |
2016-06-18 | 1389 | 3 |
2016-06-19 | 1462 | 12 |
2016-06-20 | 2096 | 3 |
2016-06-21 | 1870 | 6 |
2016-06-22 | 1754 | 12 |
date | sessions that clicked on an enwiki result | sessions that clicked on an interwiki result |
---|---|---|
2016-06-16 | 6 | 0 |
2016-06-17 | 1196 | 7 |
2016-06-18 | 966 | 3 |
2016-06-19 | 1036 | 10 |
2016-06-20 | 1362 | 3 |
2016-06-21 | 1342 | 4 |
2016-06-22 | 1326 | 7 |
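For a quick sense of how thin the interwiki numbers are, the daily counts above can be summed with a short script. This is only a rough sketch: the variable and function names are made up for illustration, the figures are copied by hand from the two tables, and it is not the analysis pipeline used for the actual report.

```python
# Daily counts copied from the two tables above: (date, enwiki, interwiki).
clicks = [
    ("2016-06-16", 7, 0),
    ("2016-06-17", 1615, 15),
    ("2016-06-18", 1389, 3),
    ("2016-06-19", 1462, 12),
    ("2016-06-20", 2096, 3),
    ("2016-06-21", 1870, 6),
    ("2016-06-22", 1754, 12),
]
sessions = [
    ("2016-06-16", 6, 0),
    ("2016-06-17", 1196, 7),
    ("2016-06-18", 966, 3),
    ("2016-06-19", 1036, 10),
    ("2016-06-20", 1362, 3),
    ("2016-06-21", 1342, 4),
    ("2016-06-22", 1326, 7),
]


def totals(rows):
    """Return (enwiki_total, interwiki_total) summed over all days."""
    enwiki = sum(row[1] for row in rows)
    interwiki = sum(row[2] for row in rows)
    return enwiki, interwiki


def interwiki_click_share(rows):
    """Fraction of all clicks that landed on an interwiki result."""
    enwiki, interwiki = totals(rows)
    return interwiki / (enwiki + interwiki)


click_totals = totals(clicks)
# Note: one session may appear in both session columns, so the session
# totals are reported as-is rather than combined into a share.
session_totals = totals(sessions)

print("clicks (enwiki, interwiki):", click_totals)
print("sessions (enwiki, interwiki):", session_totals)
print(f"interwiki share of clicks: {interwiki_click_share(clicks):.2%}")
```

The interwiki totals over the week stay in the tens, which is the concern behind keeping the test running or raising the sample rate.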
Seems like we need to keep the test running. Would it make sense to up the sample rate (maybe double or triple if it isn't already too high)?
I don't have any thoughts on doing the analysis early. On the one hand, it's all we'll get before the end of the quarter (unless we can squeak by doing it next Friday). On the other hand, the samples look too small to be useful right now.
I agree, the counts are much too small for a properly informed analysis. I'm for upping the sample rate, or just keeping the test going for another week or so and hoping we get more useful data to analyze.
@mpopov, if you have time to do a quick analysis of what we have right now and then update it next Friday, that's cool too.
Do we have a date when we expect this to be turned off? It's been nearly three weeks now since the test was enabled.
Change 298896 had a related patch set uploaded (by EBernhardson):
Revert "Textcat search satisfaction subtest for multiple wikis"
Change 298896 merged by jenkins-bot:
Revert "Textcat search satisfaction subtest for multiple wikis"
Change 299088 had a related patch set uploaded (by EBernhardson):
Revert "Textcat search satisfaction subtest for multiple wikis"
Change 299088 merged by jenkins-bot:
Revert "Textcat search satisfaction subtest for multiple wikis"
Mentioned in SAL [2016-07-14T23:10:59Z] <ebernhardson@tin> Synchronized php-1.28.0-wmf.10/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: T137169: Turn of TextCat A/B test (duration: 00m 34s)