Provide A/B test for item suggestor
Closed, Declined · Public

Description

Story: As a PM/UX Designer, I want to know whether a new feature performs better than, worse than, or just differently from the old one.

Context: To properly evaluate the impact, we need to assign users to the two variants at random.

TODO:

  • Randomly assign each user to the old-algorithm or the new-algorithm group, with a configurable split (e.g. 50/50)
  • Save the assignment group in a cookie
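
The two TODO items above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing implementation: the cookie name `itemSuggesterGroup`, the function names, and the 30-day lifetime are all hypothetical. The random source is passed in as a parameter so the behaviour can be checked deterministically.

```typescript
type Group = "old-algorithm" | "new-algorithm";

// Hypothetical cookie name and lifetime, chosen for illustration only.
const COOKIE_NAME = "itemSuggesterGroup";
const MAX_AGE_SECONDS = 2592000; // 30 days

// Pick a group at random; newShare is the fraction sent to the new algorithm.
function pickGroup(newShare: number, random: () => number = Math.random): Group {
  return random() < newShare ? "new-algorithm" : "old-algorithm";
}

// Look for an earlier assignment in a cookie string such as document.cookie.
function readGroup(cookieHeader: string): Group | null {
  for (const part of cookieHeader.split(";")) {
    const [name, value] = part.trim().split("=");
    if (name === COOKIE_NAME && (value === "old-algorithm" || value === "new-algorithm")) {
      return value;
    }
  }
  return null;
}

// Reuse a stored assignment if present; otherwise draw one and return the
// cookie string that the caller should store.
function assignGroup(
  cookieHeader: string,
  newShare: number = 0.5,
  random: () => number = Math.random
): { group: Group; setCookie: string | null } {
  const existing = readGroup(cookieHeader);
  if (existing !== null) {
    return { group: existing, setCookie: null };
  }
  const group = pickGroup(newShare, random);
  return {
    group,
    setCookie: `${COOKIE_NAME}=${group}; path=/; max-age=${MAX_AGE_SECONDS}`,
  };
}
```

In a browser, one would presumably call `assignGroup(document.cookie)` and, if `setCookie` is not `null`, assign it to `document.cookie`. Keeping the assignment in a cookie means a returning user stays in the same group for as long as the cookie lives.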

See also T170392: Create gadget that enables the use of the elastic search backend for the entity selector for an opt-in / opt-out variant.

Event Timeline

I collaborated with @Addshore the last time we created such a feature. I think we did the random assignment by userID % 2.
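
For reference, a `userID % 2` split could look roughly like the sketch below. The function name and the mapping of even/odd IDs to groups are assumptions made for illustration; the comment above only recalls that the modulo was used. Note that this scheme is deterministic per account, always gives a 50/50 split, and needs a separate rule for users without a positive numeric ID (for example, users who are not logged in).

```typescript
type Variant = "old-algorithm" | "new-algorithm";

// Deterministic split by user ID parity. Which parity maps to which
// variant is an arbitrary choice made for this sketch.
function variantForUser(userId: number): Variant {
  if (!Number.isInteger(userId) || userId <= 0) {
    // No usable ID: the caller has to decide how to handle this case.
    throw new RangeError("variantForUser needs a positive integer user ID");
  }
  return userId % 2 === 0 ? "old-algorithm" : "new-algorithm";
}
```

Unlike a cookie-based assignment, this needs no stored state, but it cannot express an arbitrary split such as 90/10.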

Whether we want to start with randomized A/B testing, or rather an opt-in or opt-out testing, needs to be discussed with @Lydia_Pintscher and @Lea_Lacroix_WMDE.

In any case, we should give users a hint to avoid confusion. Perhaps we can add an indicator to the popup showing the search result, something like "this result was generated using experimental feature X", with a link to more information.

Jan_Dittrich added a comment. · Edited · Jul 17 2017, 12:11 PM

@daniel: My preferred way would be having two phases:

  1. Opt-in testing of the new algorithm, so that people who want to try a (possibly rough) alpha can do so. This is for making sure it works at least OK.
  2. When we know that there are no severe problems, we do an A/B test. This is for adjusting the new algorithm and seeing in which areas it performs better or worse.

Since both should work at least OK, I would not say "experimental" but rather "new" or similar, since "experimental" sounds kind of dangerous. I like the idea of indicating the test, though.

Restricted Application added a subscriber: PokestarFan. · Jul 25 2017, 5:11 PM

Are we still doing this? I understood that we decided to enable the Elastic option for production.

Lydia_Pintscher closed this task as Declined. · Oct 25 2017, 9:06 AM

I understood that we decided to enable Elastic option for production.

Jep :)

Since there were concerns that the switch might cause trouble or be worse than the "old" algorithm: How were these concerns resolved? Was there research done? Or did people feel it is actually not a problem anymore?

People who complained the most about the old search said they think the new one is considerably better based on Stas' test page.

Great. Thanks for documenting this.
