Investigate SearchSatisfaction mismatched test buckets
Closed, Resolved · Public · 13 Estimated Story Points

Assigned To

Authored By

	EBernhardson
	Sep 14 2020, 6:19 PM

Description

As a consumer of A/B test results, I want accurate results so I can make sound decisions.

Looking at data from the commonswiki mediasearch A/B test, from 2019-09-10T16:00 through 17:00, there are thirteen events where the frontend logged one bucket but the backend logging recorded a different bucket. I'm not sure we have looked at this specifically before, that is, joining frontend and backend logs and comparing the recorded buckets. If frontend and backend don't agree on test buckets, the data will be less reliable, and the stats in separate buckets will generally tend towards the same values.

Example bad request:

search_id: 163dliqu8lj2hpsgn9cvrbdwo
mediawiki_cirrussearch_request logged http params: cirrusUserTesting=control
event.SearchSatisfaction logged subTest: mediasearch_commons_int
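The comparison described above can be sketched as a join on search_id. This is only an illustration, not the query that was actually run: the record layout (dicts with `search_id` and `bucket` keys), the function name, and the `hypothetical-id-*` rows are assumptions, and bucket names are assumed to be already normalized to a comparable form on both sides.

```python
from collections import Counter


def classify_backend_events(backend_events, frontend_events):
    """Label each backend event as 'match', 'mismatch', or 'missing'.

    Both inputs are lists of dicts with hypothetical keys
    'search_id' and 'bucket'.
    """
    # Index the frontend bucket by search_id for constant-time lookups.
    frontend_bucket = {e["search_id"]: e["bucket"] for e in frontend_events}

    labels = {}
    for event in backend_events:
        sid = event["search_id"]
        if sid not in frontend_bucket:
            labels[sid] = "missing"      # no frontend event to compare with
        elif frontend_bucket[sid] == event["bucket"]:
            labels[sid] = "match"        # both sides agree on the bucket
        else:
            labels[sid] = "mismatch"     # the case this ticket is about
    return labels


# The first row mirrors the example bad request; the others are made up.
backend = [
    {"search_id": "163dliqu8lj2hpsgn9cvrbdwo", "bucket": "control"},
    {"search_id": "hypothetical-id-2", "bucket": "control"},
    {"search_id": "hypothetical-id-3", "bucket": "mediasearch_commons_int"},
]
frontend = [
    {"search_id": "163dliqu8lj2hpsgn9cvrbdwo", "bucket": "mediasearch_commons_int"},
    {"search_id": "hypothetical-id-2", "bucket": "control"},
]

labels = classify_backend_events(backend, frontend)
summary = Counter(labels.values())
print(summary)
```

Keeping "missing" separate from "mismatch" matters here, since the two cases may have different causes.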

This ticket is for the investigation and to create new tickets for the solution.

Details

	Subject	Repo	Branch	Lines +/-
	[searchSatisfaction] check validity of test buckets	mediawiki/extensions/WikimediaEvents	master	+58 -14
	Send enabled tests to the frontend	mediawiki/extensions/CirrusSearch	master	+15 -2


Related Objects

Status	Assigned	Task
Resolved	dcausse	T262845 Investigate SearchSatisfaction mismatched test buckets
Resolved	dcausse	T265374 AdvancedSearch should end the current request when redirecting the namespaced search URL
Resolved	dcausse	T265455 SearchSatisfaction instrumentation should cleanup the search URL
Resolved	dcausse	T266027 Test perfield_builder on spaceless languages

Event Timeline

EBernhardson created this task. Sep 14 2020, 6:19 PM

Restricted Application added a subscriber: Aklapper. Sep 14 2020, 6:19 PM

CBogen mentioned this in T262612: Run an A/B test using suggestions generated using glent Method 1. Sep 21 2020, 11:02 PM

CBogen moved this task from needs triage to Current work on the Discovery-Search board. Sep 28 2020, 3:38 PM

CBogen edited projects, added Discovery-Search (Current work); removed Discovery-Search.

CBogen updated the task description. Sep 28 2020, 5:19 PM

CBogen set the point value for this task to 13.

CBogen moved this task from Incoming to Ready for Dev -- SWE on the Discovery-Search (Current work) board. Sep 28 2020, 5:21 PM

dcausse claimed this task. Oct 6 2020, 9:34 AM

dcausse moved this task from Ready for Dev -- SWE to In Progress on the Discovery-Search (Current work) board.

For 691588 backend events matching a test bucket:

437764 match a SearchSatisfaction searchResultPage event
7204 are inconsistent with their corresponding SearchSatisfaction searchResultPage event (joining on the search token)
246979 have no matching SearchSatisfaction searchResultPage event; only 10 of them match go, the rest are unclear
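To put these counts in proportion, here is a small arithmetic sketch that expresses each one as a share of the 691588 backend events. The dictionary keys are labels chosen for illustration. Note that the three counts as quoted add up to slightly more than the stated total; the sketch only reports that difference and does not try to explain it.

```python
# Counts quoted in the comment above.
total_backend = 691588
counts = {
    "matched": 437764,
    "inconsistent": 7204,
    "no_frontend_event": 246979,
}

# Share of each category relative to all backend events in a test bucket.
shares = {name: n / total_backend for name, n in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share:.2%}")

# Difference between the sum of the three categories and the stated total.
surplus = sum(counts.values()) - total_backend
print(f"sum of categories minus total: {surplus}")
```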

I think there might be a few reasons for the mismatched/non-matching frontend logs:

User clicks a search link that has a cirrusUserTesting=bucket attached to it; 755 of these links are found on wiki: https://global-search.toolforge.org/?q=cirrusUserTesting&namespaces=&title . We might want to clean up the search URL so that users do not paste such links somewhere else
User refreshes/reopens a search tab after the search satisfaction session they were previously in has expired, but this tab is still on a search link with a cirrusUserTesting=bucket URL

But I doubt these reasons could explain the missing 246979 frontend events.
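The URL cleanup idea can be illustrated with a short sketch that drops the cirrusUserTesting parameter from a search URL while keeping the other parameters. This is not the instrumentation code (which runs in the browser); the function name and the example URL are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_test_param(url, param="cirrusUserTesting"):
    """Return url with every occurrence of the given query parameter removed."""
    parts = urlsplit(url)
    # Keep all query pairs except the test-bucket parameter.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical search URL carrying a bucket parameter.
example = (
    "https://commons.wikimedia.org/w/index.php"
    "?search=cat&cirrusUserTesting=control&ns6=1"
)
cleaned = strip_test_param(example)
print(cleaned)
```

With the parameter gone, a link copied from the location bar no longer forces a bucket on whoever opens it later.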

dcausse added a subtask: T265455: SearchSatisfaction instrumentation should cleanup the search URL. Oct 14 2020, 8:56 AM

Change 634308 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/CirrusSearch@master] Send enabled tests to the frontend

https://gerrit.wikimedia.org/r/634308

Change 634309 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/WikimediaEvents@master] [searchSatisfaction] check validity of test buckets

https://gerrit.wikimedia.org/r/634309

Change 634308 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Send enabled tests to the frontend

https://gerrit.wikimedia.org/r/634308

ReleaseTaggerBot added a project: MW-1.36-notes (1.36.0-wmf.14; 2020-10-20). Oct 15 2020, 10:00 PM

dcausse mentioned this in T266027: Test perfield_builder on spaceless languages. Oct 20 2020, 2:50 PM

The 246979 non-matching events are likely due to T265374.
For the 7204 inconsistent events, I could only find these two explanations:

User clicks a search link that has a cirrusUserTesting=bucket attached to it
User reopens their browser with several tabs open, one of which has a link with a cirrusUserTesting=bucket param attached to it

The mitigation is to avoid keeping the cirrusUserTesting=bucket param in the location bar of the user's browser: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/634184/.
We also try to detect the mismatch in the frontend code and avoid sending broken events: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/634309/.
Salvaging such sessions seems difficult, so if the share of mismatched/invalid sessions stays below a certain threshold (<1%), we believe it is acceptable and won't penalize future A/B tests (provided such sessions are properly identified as such).
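A rough sketch of the detection and threshold idea, under stated assumptions: the backend is assumed to expose the enabled tests as a mapping from test name to allowed buckets, and sessions are assumed to be dicts with `test` and `bucket` keys. All names and sample data here are hypothetical; the real check lives in the WikimediaEvents frontend code and may work differently.

```python
def is_valid_bucket(test_name, bucket, enabled_tests):
    """True if the bucket belongs to a currently enabled test."""
    return bucket in enabled_tests.get(test_name, ())


def split_sessions(sessions, enabled_tests):
    """Separate sessions into (valid, invalid) so invalid ones stay identifiable."""
    valid, invalid = [], []
    for session in sessions:
        if is_valid_bucket(session["test"], session["bucket"], enabled_tests):
            valid.append(session)
        else:
            invalid.append(session)
    return valid, invalid


def invalid_share(valid, invalid):
    """Fraction of sessions flagged as invalid (0.0 when there are none)."""
    total = len(valid) + len(invalid)
    return len(invalid) / total if total else 0.0


# Hypothetical configuration and sample: 199 good sessions, 1 stale bucket.
enabled_tests = {"example_test": {"control", "variant_a"}}
sessions = [
    {"id": i, "test": "example_test", "bucket": "control"} for i in range(199)
]
sessions.append({"id": 199, "test": "example_test", "bucket": "stale_bucket"})

valid, invalid = split_sessions(sessions, enabled_tests)
share = invalid_share(valid, invalid)
acceptable = share < 0.01  # the <1% threshold mentioned above
print(f"invalid share: {share:.2%}, acceptable: {acceptable}")
```

Flagged sessions are kept in a separate list rather than dropped silently, so they can be excluded from analysis while their share is still monitored.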