A/B test setup for search changes
Closed, Resolved | Public | 5 Estimated Story Points

Description

Background

We would like to run two A/B tests on the changes planned for the search widget as part of the desktop improvements project:

  • Moving the search bar to the header of the page (target 2.5% overall increase in search sessions initiated, monitor and report on search sessions completed)
  • Switching the current search widget to the new search widget (target 2.5% overall increase in search sessions initiated, monitor and report on search sessions completed)
  • Search changes will be rolled back if we see a 5% overall decrease in search sessions initiated
  • Individual tests will be performed per wiki for logged-in users only (nice to have: all users)
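
The targets above can be read as a simple decision rule. The following is a minimal sketch, assuming the change is measured as the relative difference in search sessions initiated between equally sized control and test buckets. The function name, outcome labels, and that assumption are illustrative and not part of the task.

```javascript
// Thresholds come from the task description; names and structure are illustrative.
const TARGET_INCREASE = 0.025; // +2.5% search sessions initiated
const ROLLBACK_DECREASE = -0.05; // -5% triggers a rollback

/**
 * Compare search sessions initiated in the test bucket against control.
 * Assumes both buckets are the same size (a 50/50 split).
 */
function classifyChange(controlSessions, testSessions) {
  if (!(controlSessions > 0)) {
    throw new RangeError('controlSessions must be positive');
  }
  const relativeChange = (testSessions - controlSessions) / controlSessions;
  let outcome = 'inconclusive';
  if (relativeChange <= ROLLBACK_DECREASE) {
    outcome = 'rollBack';
  } else if (relativeChange >= TARGET_INCREASE) {
    outcome = 'targetMet';
  }
  return { relativeChange, outcome };
}
```

For example, `classifyChange(1000, 1030)` reports the target as met, while `classifyChange(1000, 940)` signals a rollback.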

Additional background in T256100, T249366, T251740, and T257698.

Acceptance Criteria

  • Ensure we have the ability to perform the above A/B tests
  • For officewiki and testwiki only, the default experience should be the test experience (widget in new location), rather than the control

Proposed Implementation

See T259250#6389243.

Developer Notes

  • Supporting logged-out users is not worth implementing: too much work, and maybe impossible.
  • Currently the search in header feature is controlled by a feature flag, $wgVectorIsSearchInHeader. This would need to be changed to use SearchInHeaderLookup (like SkinVersionLookup). For logged-in users, the A/B test would need to be applied based on user ID using a mod operator. See T259250#6418936.
  • SearchSatisfaction schema needs to be modified to check the class on the body. Possibly send skin version 3. Done in T256100

QA Results - Prod

| AC | Status | Details |
| --- | --- | --- |
| 1 | | T259250#6529304 |
| 2 | | T259250#6529304 |

Event Timeline

We discussed searchSuggest.js in super-happy-dev-time today. As I recall, @phuedx, @Jdlrobson, @Jdrewniak, and @bearND were in attendance. Here are the notes that seem relevant to this task:

  • searchSuggest.js is currently shipped regardless of the search implementation configuration flag (T257706). It's also loaded with the page, whether or not the user searches or focuses the search input.
  • searchSuggest.js will either need to be made compatible with the new search or prevented from loading. In the case of the former, that would probably cover T257698 and T249366. In the case of the latter, searchSuggest will need a disable mechanism.
  • searchSuggest.js is complex and controls sampling.

Thank you for writing up these notes, @Niedzielski. To expand on the above a little, the SearchSatisfaction instrumentation can:

  • Persist state across a search session
    • And enforce arbitrary TTLs on certain pieces of state
  • Sample and bucket users based on hard-coded configuration
  • Log a checkin event when the page has been visible for 10, 20, 30, 40, 50, 60, 90, 120, 150, 180, 210, 240, 300, 360, and 420 seconds, if you came from a SERP
  • Log clicks to sister project search results on a SERP
  • Log data about autocomplete results and clicks on those results
  • Log clicks to autocomplete results
  • Log visits to pages after clicking on an autocomplete result or a result on a SERP
  • Log JavaScript errors using the SearchSatisfactionErrors schema while the instrumentation is active

At the time of writing, there is no way to isolate/conditionally enable or disable any of this functionality.
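
The check-in schedule listed above can be sketched as a lookup over fixed thresholds. This is an illustration only, not the code in searchSatisfaction.js; the function names are hypothetical, and only the threshold values come from the list.

```javascript
// Check-in thresholds in seconds, as listed in the comment above.
const CHECKIN_SECONDS = [
  10, 20, 30, 40, 50, 60, 90, 120, 150, 180, 210, 240, 300, 360, 420
];

// Hypothetical helper: the next threshold after `elapsed` seconds of
// visibility, or null once the last check-in (420s) has passed.
function nextCheckin(elapsed) {
  const next = CHECKIN_SECONDS.find((t) => t > elapsed);
  return next === undefined ? null : next;
}

// Hypothetical helper: thresholds crossed between two visibility readings,
// i.e. the check-in events that would be logged in that interval.
function dueCheckins(previousElapsed, currentElapsed) {
  return CHECKIN_SECONDS.filter(
    (t) => t > previousElapsed && t <= currentElapsed
  );
}
```

Note that the intervals widen over time: every 10 seconds up to one minute, every 30 seconds up to four minutes, then every 60 seconds up to seven minutes.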

Per T249366: [Spike] What should we instrument in the new Vue.js search experience?, we're resolved to use the SearchSatisfaction schema to gather data for this A/B test (and afterwards?). As I mentioned above, searchSatisfaction.js contains modules that deal with session management and sampling and bucketing the user. This leaves us with one of three adventures to choose from:

  1. Extract the parts of searchSatisfaction.js that deal with sampling and bucketing the user, and add a mechanism that lets either Discovery or a third party enable the instrument
  2. Only deliver searchSatisfaction.js to logged-in users on the Wikipedias that we're testing on and set the sample size to 100%
  3. Write and deliver our own instrument

I immediately discarded option three because we shouldn't throw away the accumulated knowledge and experience captured in searchSatisfaction.js. I believe that option two is optimal here because it only requires changes to ResourceLoader module definitions and a single configuration variable, i.e. update the WikimediaEvents extension to:

  • Only deliver searchSatisfaction.js to logged-in users on certain wikis, i.e.
WikimediaEvents/extension.json
"ResourceModules": {
  "ext.wikimediaEvents.searchSatisfaction": {
    "skinScripts": {
      "vector": [ "ext.wikimediaEvents.searchSatisfaction/searchSatisfaction.js" ]
    }

    ....

  } 
}
WikimediaEvents/includes/WikimediaEventsHooks.php
public static function onBeforePageDisplay( OutputPage $out, Skin $skin ) {
  /* ... */

  if ( $out->getUser()->isLoggedIn() ) {
    $out->addModules( 'ext.wikimediaEvents.loggedin' );


    if ( self::getConfig( 'EnableSearchSatisfaction' ) ) {
      $out->addModules( 'ext.wikimediaEvents.searchSatisfaction' );
    }
  }
}

/**
 * @param string $name Configuration variable name
 * @return mixed
 */
private static function getConfig( string $name ) {
  return MediaWikiServices::getInstance()
    ->getConfigFactory()
    ->makeConfig( 'WikimediaEvents' )
    ->get( $name );
}
  • Enable searchSatisfaction.js everywhere it's delivered, i.e. set sampleSize.test to 100 here

Per the background, we're testing this change against logged-in users only. Therefore we can use the user's ID to bucket them, i.e. $bucket = $user->getID() % count( $BUCKETS ) and guarantee that the user will receive the same treatment for the duration of the test. This approach relies on interactions with the search widget being evenly distributed across logged-in users, which I think is a reasonable assumption (@MNeisler?).
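
As a language-neutral illustration of the modulo bucketing described above, here is a minimal JavaScript sketch. The bucket names and their order are assumptions, as is the function name; the only facts relied on are that the bucket is the user ID modulo the number of buckets and that anonymous users have ID 0 in MediaWiki.

```javascript
// Assumed bucket labels; which remainder maps to which treatment is arbitrary.
const BUCKETS = ['test', 'control'];

// Hypothetical helper mirroring `$user->getID() % count( $BUCKETS )`.
// Returns null for anonymous users (ID 0), who are outside the test.
function bucketForUser(userId) {
  if (!Number.isInteger(userId) || userId <= 0) {
    return null;
  }
  return BUCKETS[userId % BUCKETS.length];
}
```

Because the bucket depends only on the user ID, the same user lands in the same bucket on every page view and every device, with no per-session state to store.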

Yep this approach makes sense to me.

We skipped estimating this task during today's HOLD FOR READERS WEB meeting as @Jdlrobson is OoO.

Jdlrobson set the point value for this task to 5. Aug 27 2020, 5:34 PM

Should be low risk, but lots of moving parts. At least 2 patches in WikimediaEvents plus one in Vector.

Currently the search in header feature is controlled by a feature flag, $wgVectorIsSearchInHeader. This would need to be changed to use SearchInHeaderLookup (like SkinVersionLookup). For logged-in users, the A/B test would need to be applied based on user ID using a mod operator.

This can be expressed using Vector's dead-simple feature manager:

includes/ServiceWiring.php
/* ... */

// Feature: Search in header
// =========================
//
// See https://phabricator.wikimedia.org/T259250 for additional detail.

$user = $context->getUser();

$featureManager->registerFeature(
  'SearchInHeader',
  [
    Constants::REQUIREMENT_FULLY_INITIALIZED,
    // Logged-in users only.
    $user && $user->isRegistered(),
    // Bucket by user ID: even IDs get the search in the header.
    $user->getId() % 2 === 0
  ]
);

phuedx updated the task description.

Change 623644 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/skins/Vector@master] Use feature management for search in header

https://gerrit.wikimedia.org/r/623644

phuedx removed phuedx as the assignee of this task. Sep 7 2020, 3:38 PM

@phuedx left you some async feedback. Let's chat tomorrow (wed) morning.

Change 623644 merged by jenkins-bot:
[mediawiki/skins/Vector@master] Use feature management for search in header

https://gerrit.wikimedia.org/r/623644

@Jdlrobson can you please provide some test steps or direction for me on this? It sounded urgent in today's standup.

@Edtadros yeehhhhh, I'm not sure how to do this. Perhaps best thing to do is to enable it on office wiki and verify around 50% of us get the search in header and 50% don't. What do you think about that @ovasileva ? Is that enough testing for this one?

Option 2: We enable on beta cluster for logged-in users and create a little confusion (since behaviour will be different when going from anon to logged in) and test it there.

@Jdlrobson, @Edtadros - maybe testwiki would be better so we can look at it along with the instrumentation?

Jdlrobson added a subscriber: Edtadros.

Okay, leave this with me. I will enable it on testwiki first.

Sam and Edward will QA this locally as part of T256100 cc @phuedx

@Edtadros and I went through the following scenarios:

  1. Set $wgVectorIsSearchInHeader to truthy
  2. Observe that the search widget is moved
  3. Unset $wgVectorIsSearchInHeader
  4. Set $wgVectorIsSearchInHeaderABTest to truthy
  5. Create a new user (User A) and log in
  6. Observe that the search widget has or hasn't moved
  7. Create a new user (User B) and log in
  8. Observe that the search widget is in a different position than in step 6

Test Result - Prod

Status: ✅ PASS
Environment: hewiki
OS: macOS Catalina
Browser: Chrome
Device: MBP
Emulated Device: NA

Test Artifact(s):

QA Steps

✅ AC1 - Used an existing user in hewiki, search widget was in new position

Screen Shot 2020-10-08 at 6.57.23 AM.png (128×1 px, 34 KB)

✅ AC2 - Created a new user in hewiki, search widget was in old position

Screen Shot 2020-10-08 at 6.56.21 AM.png (128×1 px, 23 KB)

@ovasileva @phuedx

I ran a check of the data recorded in SearchSatisfaction. Here is a breakdown of the distinct sessions and events by search location since deployment.

Overall

| Search Location | Unique Sessions | Total Events |
| --- | --- | --- |
| header-moved | 1068972 | 8582975 |
| header-navigation | 533681 | 3776086 |

By Wiki

| Search Location | Wiki | Unique Sessions | Total Events |
| --- | --- | --- | --- |
| header-moved | euwiki | 10797 | 106081 |
| header-navigation | euwiki | 4546 | 38727 |
| header-moved | fawiki | 40605 | 308727 |
| header-navigation | fawiki | 18341 | 123445 |
| header-moved | frwiki | 938722 | 7725106 |
| header-navigation | frwiki | 465889 | 3402279 |
| header-moved | frwiktionary | 58021 | 408092 |
| header-navigation | frwiktionary | 33750 | 192849 |
| header-moved | hewiki | 20720 | 34263 |
| header-navigation | hewiki | 11077 | 18435 |
| header-moved | ptwikiversity | 107 | 706 |
| header-navigation | ptwikiversity | 78 | 351 |

The data appears as expected; however, a little over twice as many sessions were recorded with the search widget in the new position ("header-moved"). This is because the data includes events from both logged-in and logged-out users, and all logged-out users see 'header-moved' by default. To confirm that the buckets are balanced for the AB test, we need a way to isolate events to just those in the AB test (that is, logged-in users). Unfortunately, I'm not seeing any way to do that with the current instrumentation. It looks like events are currently recorded the same way for logged-in users in the AB test and for logged-out users.

I'm investigating whether it is possible to use the mwSessionID to join with a separate schema/instrument that can be used to determine logged-in status, but it doesn't look like there are any schemas that fit that criteria right now. We may need to look into adding an isAnon field to the schema or using the subtest field so we can identify AB test users.
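
To illustrate how an isAnon field would isolate AB test users, here is a minimal sketch that counts distinct search sessions per search location for logged-in users only. It assumes each event is a plain object carrying `isAnon`, `inputLocation`, and `searchSessionId`; the first is the proposed field and the other two are the fields used in the query below. The function name is hypothetical.

```javascript
// Hypothetical helper: count distinct search sessions per input location,
// keeping only events explicitly marked as coming from logged-in users.
function countLoggedInSessionsByLocation(events) {
  const sessionsByLocation = new Map();
  for (const event of events) {
    // Events without the field (recorded before it existed) are skipped too.
    if (event.isAnon !== false) {
      continue;
    }
    if (!sessionsByLocation.has(event.inputLocation)) {
      sessionsByLocation.set(event.inputLocation, new Set());
    }
    sessionsByLocation.get(event.inputLocation).add(event.searchSessionId);
  }
  const counts = {};
  for (const [location, sessionIds] of sessionsByLocation) {
    counts[location] = sessionIds.size;
  }
  return counts;
}
```

Treating a missing `isAnon` as "unknown" rather than "logged in" is a deliberate choice here: events recorded before the field existed cannot be attributed to the test.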

Data Via

SELECT
    event.inputLocation AS search_location,
    wiki,
    COUNT(DISTINCT event.searchSessionId) AS search_session,
    COUNT(*) AS events
FROM event.searchSatisfaction
WHERE year = 2020 AND ((month = 9 AND day >= 28) OR month >= 10)
    -- only deployed on modern skin vector
    AND event.skinVersion = 'latest'
    AND event.skin = 'vector'
    AND event.inputLocation IN ('header-moved', 'header-navigation')
    -- review test wikis where deployed
    AND wiki IN ('euwiki', 'frwiki', 'hewiki', 'ptwikiversity', 'frwiktionary', 'fawiki')
    AND event.action = 'searchResultPage'
    AND event.source = 'autocomplete'
GROUP BY
    event.inputLocation,
    wiki

Change 634230 had a related patch set uploaded (by Phuedx; owner: Phuedx):
[schemas/event/secondary@master] SearchSatisfaction: Add isAnon field

https://gerrit.wikimedia.org/r/634230

Change 634231 had a related patch set uploaded (by Phuedx; owner: Phuedx):
[mediawiki/extensions/WikimediaEvents@master] SearchSatisfaction: Set isAnon field

https://gerrit.wikimedia.org/r/634231

Change 634230 merged by Bearloga:
[schemas/event/secondary@master] SearchSatisfaction: Add isAnon field

https://gerrit.wikimedia.org/r/634230

Change 634231 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] SearchSatisfaction: Set isAnon field

https://gerrit.wikimedia.org/r/634231

Change 635030 had a related patch set uploaded (by Phuedx; owner: Phuedx):
[mediawiki/extensions/WikimediaEvents@wmf/1.36.0-wmf.13] SearchSatisfaction: Set isAnon field

https://gerrit.wikimedia.org/r/635030

I've scheduled the above for deployment during today's European mid-day backport window.

Change 635030 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@wmf/1.36.0-wmf.13] SearchSatisfaction: Set isAnon field

https://gerrit.wikimedia.org/r/635030

Mentioned in SAL (#wikimedia-operations) [2020-10-20T11:35:02Z] <lucaswerkmeister-wmde@deploy1001> Synchronized php-1.36.0-wmf.13/extensions/WikimediaEvents/: Backport: [[gerrit:635030|SearchSatisfaction: Set isAnon field (T259250)]] (duration: 00m 57s)

I reran a check of the data following the addition of the isAnon field to the SearchSatisfaction Schema and confirmed we are now recording logged in status for all the search events.

@ovasileva - The data for the AB test now looks good and we should be set to run for 2 weeks. Based on the deployment date of the isAnon field, that would mean we should run the test through at least 3 November 2020. See details/notes below and let me know if you have any questions.

Here is the number of distinct search sessions recorded for those in the AB test (logged-in users on partner wikis) from 20 October (when the isAnon field was added) through 23 October.

Distinct Search Sessions and Events for Users in Search Move AB Test

| search_location | wiki | num_events | num_sessions |
| --- | --- | --- | --- |
| header-moved | euwiki | 964 | 194 |
| header-navigation | euwiki | 610 | 140 |
| header-moved | fawiki | 3192 | 535 |
| header-navigation | fawiki | 2877 | 508 |
| header-moved | frwiki | 26386 | 4944 |
| header-navigation | frwiki | 31502 | 5793 |
| header-moved | frwiktionary | 2658 | 459 |
| header-navigation | frwiktionary | 1946 | 351 |
| header-moved | hewiki | 702 | 272 |
| header-navigation | hewiki | 503 | 281 |
| header-moved | ptwikiversity | 4 | 1 |
| header-navigation | ptwikiversity | 6 | 2 |

The buckets appear to be balanced for each wiki (The current differences are within the probable range of a random 50/50 split). Note: Portuguese Wikiversity currently has very few search events from logged in users. I'll recheck incoming events over the next few weeks but it will likely be difficult to determine the effect of the search header move on this wiki unless we get a bigger sample size.

@ovasileva - Per our discussions, marking this as resolved but let me know if you have any questions and I can reopen.