
Implement A/B test to measure CirrusSearch "opening_text" performance.
Closed, Resolved · Public

Description

This A/B test will modify our current "read more" request to use these additional parameters:
cirrusMltUseFields=yes&cirrusMltFields=opening_text

We'll need to measure:

  • Clickthrough rate with the new parameters vs. the old way.
  • Perceived latency with the new parameters vs. the old way.
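
As a rough illustration of how the two measurements above could be compared between groups, here is a minimal sketch. It assumes per-group click and impression counts plus a list of latency samples are already available; the function names and the example numbers are hypothetical and are not taken from the actual test. It uses a pooled two-proportion z-test for clickthrough rate and a plain median for latency, which is one common choice rather than the method used in this task.

```python
import math
import statistics


def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Pooled two-proportion z-test for a difference in clickthrough rate.

    Group A is the old request, group B the request with the new parameters.
    Returns (z, two_sided_p_value).
    """
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def median_latency_ms(samples):
    """Median of per-request latency samples, in milliseconds."""
    return statistics.median(samples)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    z, p = two_proportion_z(clicks_a=480, n_a=10_000, clicks_b=530, n_b=10_000)
    print(f"z = {z:.2f}, p = {p:.3f}")
    print("median latency:", median_latency_ms([210, 250, 190, 400, 230]), "ms")
```

A median is used for latency because a few very slow requests would otherwise dominate a mean.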

Event Timeline

Dbrant created this task. Feb 1 2016, 3:12 PM
Dbrant raised the priority of this task to Needs Triage.
Dbrant updated the task description.
Dbrant moved this task to Current Sprint on the Wikipedia-Android-App-Backlog board.
Dbrant added a subscriber: Dbrant.
Restricted Application added subscribers: StudiesWorld, Aklapper. Feb 1 2016, 3:12 PM
dcausse added a subscriber: dcausse. Feb 1 2016, 3:21 PM

I strongly suggest adding cirrusBoostLinks=no to the list of params. The full param list should be:

cirrusMltUseFields=yes&cirrusMltFields=opening_text&cirrusBoostLinks=no
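
For illustration, a minimal sketch of how the test group's request could carry the extra parameters while the control group's request stays unchanged. The three parameter names and values are the ones quoted in this task; the function name, the `in_test_group` flag, and the base parameters in the usage example are hypothetical stand-ins, not the app's actual code.

```python
from urllib.parse import urlencode

# Parameters quoted in this task, including the suggested cirrusBoostLinks=no.
OPENING_TEXT_PARAMS = {
    "cirrusMltUseFields": "yes",
    "cirrusMltFields": "opening_text",
    "cirrusBoostLinks": "no",
}


def build_read_more_query(base_params, in_test_group):
    """Return the query string for a "read more" request.

    base_params is whatever the existing request already sends (assumed here);
    the opening_text parameters are appended only for the test group.
    """
    params = dict(base_params)
    if in_test_group:
        params.update(OPENING_TEXT_PARAMS)
    return urlencode(params)


if __name__ == "__main__":
    # Hypothetical base parameters, for illustration only.
    base = {"action": "query", "format": "json"}
    print(build_read_more_query(base, in_test_group=False))
    print(build_read_more_query(base, in_test_group=True))
```

Keeping the control request byte-for-byte identical to the existing one means any difference between groups can be attributed to the added parameters.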

Dbrant claimed this task. Feb 1 2016, 4:25 PM
Dbrant moved this task from To Do to Doing on the Mobile-App-Android-Sprint-75-Rhenium board.

Change 267695 had a related patch set uploaded (by Dbrant):
Implement A/B test to measure CirrusSearch "opening_text" performance.

https://gerrit.wikimedia.org/r/267695

Change 267695 merged by jenkins-bot:
Implement A/B test to measure CirrusSearch "opening_text" performance.

https://gerrit.wikimedia.org/r/267695

Deskana added a subscriber: Deskana.

Thanks for working on this. :-)

Did general search regression/comparison testing with the 2.1.141-alpha-2016-02-12 build.

Dbrant closed this task as Resolved. Feb 15 2016, 5:40 PM
Dbrant moved this task from QA Signoff to Done on the Mobile-App-Android-Sprint-75-Rhenium board.

Is the outcome (raw data/evaluation) of the A/B test still available? We would like to use it as reference for our Citolytics A/B test.

It looks like this wasn't a super rigorous study, but the rolled-up data is at https://docs.google.com/spreadsheets/d/1BFsrAcPgexQyNVemmJ3k3IX5rtPvJ_5vdYOyGgS5R6Y/edit#gid=312723487

I re-ran one of the queries used there and it looks like we still have the raw data in our analytics database. I'm not completely sure, but it's probably possible to make this data available in some form. I'll have to check with @mpopov about what the procedure would be for that.

@EBernhardson Thanks for pointing to the spreadsheet. It would be really great if you could make the (anonymized) raw data available so that we can prepare our study.

@EBernhardson @mpopov Any news on releasing the data?

@mschwarzer can you be more specific about what data you need? Maybe it would be sufficient to get the schema information. That way you can write your queries to test whether the data you need is recorded.

Nuria added a subscriber: Nuria. Nov 21 2016, 6:20 PM

Actually, this A/B test has many issues that I think warrant holding off on making that data available as any kind of example. Namely, we do not even know whether the intrinsic variation of the variable we are measuring is higher than the difference measured in the test. cc @EBernhardson @mpopov

Also, as a side note, any data made available in granular form has to be vetted by Security and Legal. Making aggregated results available is OK.
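
One common way to estimate the intrinsic variation mentioned above is an A/A-style check: repeatedly split a single group's outcomes into two random halves and look at how much the clickthrough rate differs between halves by chance alone. The sketch below assumes the outcomes are available as a list of 0/1 click indicators; the function name and the data are hypothetical, and this is not a description of any analysis actually run for this task.

```python
import random
import statistics


def aa_noise(outcomes, n_splits=1000, seed=0):
    """Standard deviation of the CTR difference between random halves.

    outcomes is a list of 0/1 click indicators from ONE group. A measured
    A/B difference that is not clearly larger than this value is hard to
    tell apart from noise.
    """
    rng = random.Random(seed)
    data = list(outcomes)
    half = len(data) // 2
    diffs = []
    for _ in range(n_splits):
        rng.shuffle(data)
        ctr_first = sum(data[:half]) / half
        ctr_second = sum(data[half:]) / (len(data) - half)
        diffs.append(ctr_first - ctr_second)
    return statistics.pstdev(diffs)


if __name__ == "__main__":
    # Hypothetical control-group outcomes: 5% clickthrough over 2000 impressions.
    control = [1] * 100 + [0] * 1900
    print("A/A noise (std of CTR difference):", aa_noise(control))
```

Only aggregated figures come out of this check, which fits the note that aggregated results can be shared while granular data needs vetting.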

Can you provide the database schema where the data is stored? Then I can write a query for the aggregation.