After the TextCat A/B test is turned off (see T134319), the data should be analysed to see whether the test had a significant impact.
Event Timeline
A couple of things (although there are certainly more) I think we could look at:
- Click through to the alternate wiki
- Query reformulation after being shown alternate wiki results
This topic came up in a discussion today with @EBernhardson and @dcausse.
- Click through to the alternate wiki
- Query reformulation after being shown alternate wiki results
- A clarification: tracking query reformulation by the user is interesting and useful in its own right, as a way of getting possible alternative versions of a query (e.g., for automatic correction). In this case, the idea is that a reformulated query from the user without a clickthrough to another wiki after presenting other-language cross-wiki results indicates that the results were not useful.
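A rough sketch of that heuristic, assuming each session is available as a time-ordered list of events. The event type names (`query`, `crosswiki_results_shown`, `crosswiki_click`) are hypothetical and would need to be mapped to whatever the logging schema actually records:

```python
def reformulated_without_click(events):
    """Return True if cross-wiki results were shown and the user's next
    relevant action was a new query instead of a click on those results.

    `events` is a time-ordered list of dicts with a "type" key.
    The event type names are assumptions made for illustration.
    """
    for i, event in enumerate(events):
        if event["type"] != "crosswiki_results_shown":
            continue
        for later in events[i + 1:]:
            if later["type"] == "crosswiki_click":
                break  # the user clicked through; this showing was useful
            if later["type"] == "query":
                return True  # reformulated before clicking through
    return False


# Toy sessions, made up for illustration only
reformulated = [
    {"type": "query"},
    {"type": "crosswiki_results_shown"},
    {"type": "query"},
]
clicked = [
    {"type": "query"},
    {"type": "crosswiki_results_shown"},
    {"type": "crosswiki_click"},
]
```

A session that ends right after the cross-wiki results are shown counts as neither here; whether abandonment should be treated as a failure is a separate decision.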
Other ideas that came up:
- looking at satisfaction metrics for all queries identified as being in another language in one big bucket, vs. looking at them in by-language buckets (e.g., on enwiki, results in Spanish/from eswiki might be good while results in French/from frwiki are not).
- looking at satisfaction metrics for queries based on number of cross-wiki results (1 result may be a fluke, 5000 results means the language is probably right).
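Both bucketing ideas could be sketched along these lines, using clickthrough as a stand-in for a satisfaction metric. The field names (`language`, `n_results`, `clicked`) and the bucket boundaries are assumptions, and the sessions are made-up toy data:

```python
from collections import defaultdict


def clickthrough_by_bucket(sessions, key):
    """Group sessions by key(session) and return {bucket: clickthrough rate}."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for session in sessions:
        bucket = key(session)
        shown[bucket] += 1
        if session["clicked"]:
            clicked[bucket] += 1
    return {bucket: clicked[bucket] / shown[bucket] for bucket in shown}


def result_count_bucket(session):
    """Coarse buckets by number of cross-wiki results (boundaries are arbitrary)."""
    n = session["n_results"]
    if n <= 1:
        return "0-1"
    if n < 100:
        return "2-99"
    return "100+"


# Toy data, for illustration only
sessions = [
    {"language": "es", "n_results": 5000, "clicked": True},
    {"language": "es", "n_results": 40, "clicked": True},
    {"language": "fr", "n_results": 1, "clicked": False},
    {"language": "fr", "n_results": 3, "clicked": False},
]

by_language = clickthrough_by_bucket(sessions, lambda s: s["language"])
by_result_count = clickthrough_by_bucket(sessions, result_count_bucket)
```

Passing `lambda s: "all"` as the key gives the single big bucket for comparison.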
I'll also try to get others to take a peek over here and add more.
Perhaps interesting, but maybe not a factor in deciding to keep the feature:
- % of zero result requests that now get results
- % of requests shown inter-wiki results where the user clicks on one
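For reference, the two percentages could be computed roughly as follows. The record fields (`original_results`, `crosswiki_results`, `clicked_crosswiki`) are hypothetical names, not the actual schema, and the requests are toy data:

```python
def percent(part, whole):
    """Percentage of part in whole, or 0.0 when whole is empty."""
    return 100.0 * part / whole if whole else 0.0


def summarize(requests):
    """Compute both percentages from a list of request records (field names assumed)."""
    zero = [r for r in requests if r["original_results"] == 0]
    rescued = [r for r in zero if r["crosswiki_results"] > 0]
    shown = [r for r in requests if r["crosswiki_results"] > 0]
    clicked = [r for r in shown if r["clicked_crosswiki"]]
    return {
        "pct_zero_result_now_with_results": percent(len(rescued), len(zero)),
        "pct_shown_that_clicked": percent(len(clicked), len(shown)),
    }


# Toy data, for illustration only
requests = [
    {"original_results": 0, "crosswiki_results": 3, "clicked_crosswiki": True},
    {"original_results": 0, "crosswiki_results": 0, "clicked_crosswiki": False},
    {"original_results": 0, "crosswiki_results": 10, "clicked_crosswiki": False},
    {"original_results": 5, "crosswiki_results": 0, "clicked_crosswiki": False},
]
summary = summarize(requests)
```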
@TJones Hm… Do you have suggestions for the threshold we can use to determine this on the whole dataset? We won't be able to look at each of 100K+ sessions individually.
Note to future @mpopov: the extra data field in the TSS2 table will have 3 values (actually detected language, wiki queried, and number of results) that will need to be separated into 3 columns.
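A minimal sketch of that split, assuming the three values arrive as a single comma-separated string. The delimiter, the value order, and the column names are all assumptions; the real encoding of the field needs to be checked first:

```python
def split_extra(extra, sep=","):
    """Split the combined extra field into three named values.

    Assumes the order: detected language, wiki queried, number of results.
    The separator and the column names are hypothetical.
    """
    parts = [p.strip() for p in extra.split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 values, got {len(parts)}: {extra!r}")
    language, wiki, n_results = parts
    return {
        "detected_language": language,
        "wiki_queried": wiki,
        "n_results": int(n_results),
    }


row = split_extra("es, eswiki, 12")  # made-up example value
```

Rows that do not split into exactly three parts raise an error, so malformed entries are not silently shifted into the wrong column.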
Cannot proceed with the analysis because the data is too faulty to be reliable. We will fix the EventLogging (EL) and relaunch the test. See follow-up: T137158