
Instrumentation Data-QA Checklist for SearchSatisfaction schema changes
Closed, Resolved (Public)

Description

Instrumentation Data QA Checklist
(Please see Instrumentation Data-QA checklist Guidelines for an explanation of each item)

Steps

  • Unit Testing : Engineer
  • Pre-Deployment Instrumentation testing on Local Dev Environment : QA Tester

Refer to T256100#6462200.

The following steps were removed because they cannot be tested here (see T256100#6462200):
[] Pre-Deployment Instrumentation testing on Beta Cluster : QA Tester
[] Pre-Deployment Instrumentation testing on Test wiki [OPTIONAL] : QA Tester
  • Post-Deployment Instrumentation testing on Production (target/test) wiki : QA Tester
  • Post-Deployment Data QA on Production (target/test) wiki data : Data Analyst

Event Timeline

ovasileva triaged this task as Medium priority. Sep 8 2020, 5:45 PM
ovasileva raised the priority of this task from Medium to High.
Mayakp.wiki updated the task description. Sep 9 2020, 5:58 PM

Created QC document, updated link in description : Search Feature QC : Instrumentation QA and data checks

Next Up: scheduled meeting with Web and PA on 9/15 to go over instrumentation changes and QA scenarios (that need to be added to the above document).

09/15 Meeting Notes: SearchSatisfaction Instrumentation & Data-QA

Instrumentation being QA'd: (1) the new skin version field and (2) the A/B test of the search widget location.
The A/B test will run on all test wikis where the new skin (i.e. version 2) is deployed.
inputLocation : the schema field that will hold the old and new locations of the search widget
Values -
“Old” : Current location of search widget
“Header” : New location of widget (near the header)

A/B test QA:
Per Sam, instrumentation engineering will be required to QA both buckets and test this feature, which in turn will require more time. Hence, we have decided to test this in a local dev environment (see T256100#6462200).
@phuedx will help @Edtadros get it installed and configured so that Edward can complete testing by EOW 09/18.

Documentation:
All scenarios to be tested, and the environments they will be tested in, are recorded in the QC document.
Post deployment: @Edtadros will test instrumentation on any two test wikis where skin version 2 is deployed.

Please note: the schema has been ported to the Event Platform system as /analytics/legacy/searchsatisfaction

Mayakp.wiki updated the task description. Sep 15 2020, 6:20 PM

I updated the scenarios in the QC document and the instrumentation spec outlined in T256100's task description, based on @phuedx's updates regarding the instrumentation changes in T256100#6466896.

Here are a handful of events that I've seen during the development of T256100: Add skin version and search version fields to search satisfaction schema:

[0]
{
  "action": "searchResultPage",
  "source": "autocomplete",
  "searchSessionId": "07bce60f53cb4e158d28kf5n8mbe",
  "pageViewId": "c6a588025740c4701d49kf5nflk4",
  "scroll": false,
  "mwSessionId": "9e5dcb89f55ae504f447",
  "uniqueId": "693b44218cf1d4bce3ddkf5nfmiy",
  "sampleMultiplier": 0.01,
  "articleId": 1,
  "skin": "vector",
  "skinVersion": "legacy",
  "hitsReturned": 0,
  "query": "Test",
  "inputLocation": "header-navigation",
  "autocompleteType": "unknown"
}
[1]
{
  "action": "searchResultPage",
  "source": "autocomplete",
  "searchSessionId": "07bce60f53cb4e158d28kf5n8mbe",
  "pageViewId": "4113fb97ad1ce521f730kf5np7ba",
  "scroll": false,
  "mwSessionId": "9e5dcb89f55ae504f447",
  "uniqueId": "7279eb7b1b3173876693kf5np7sx",
  "articleId": 1,
  "skin": "vector",
  "skinVersion": "latest",
  "hitsReturned": 0,
  "query": "Test",
  "inputLocation": "header-navigation",
  "autocompleteType": "unknown",
  "msToDisplayResults": 22
}

Thanks @phuedx for the sample events. It would be great to see some events with the new "inputLocation": "header-moved", but since @Edtadros is doing a local test we can look at it from his test results.
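
A minimal sketch of how events shaped like the samples above could be checked automatically during local testing. The function name `check_event` and the allowed-value sets are assumptions built only from values that appear in this thread; they are not taken from the actual schema definition.

```python
# Hypothetical checker. The allowed values below are the ones seen in this
# thread, not an authoritative copy of the SearchSatisfaction schema.
ALLOWED_INPUT_LOCATIONS = {"header-navigation", "header-moved"}
ALLOWED_SKIN_VERSIONS = {"legacy", "latest"}


def check_event(event):
    """Return a list of human-readable problems found in one event dict."""
    problems = []
    # The fields under QA must be present and non-null.
    for field in ("skin", "skinVersion", "inputLocation"):
        if event.get(field) is None:
            problems.append(f"missing {field}")
    location = event.get("inputLocation")
    if location is not None and location not in ALLOWED_INPUT_LOCATIONS:
        problems.append(f"unexpected inputLocation: {location}")
    version = event.get("skinVersion")
    if version is not None and version not in ALLOWED_SKIN_VERSIONS:
        problems.append(f"unexpected skinVersion: {version}")
    return problems


# Trimmed copy of sample event [1] from the comment above.
sample = {
    "action": "searchResultPage",
    "source": "autocomplete",
    "skin": "vector",
    "skinVersion": "latest",
    "inputLocation": "header-navigation",
}
print(check_event(sample))  # prints []
```

An empty list means the event passed every check; each string in a non-empty list names one field to look at.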

Mayakp.wiki updated the task description. Sep 17 2020, 9:59 PM
Edtadros reassigned this task from Edtadros to MNeisler. Mon, Oct 12, 1:25 AM

@MNeisler Please confirm the data in the checklist is sufficient.

MNeisler added a comment. Edited Mon, Oct 19, 9:57 PM

@Edtadros - Thanks for completing the checklist!

@ovasileva @phuedx I completed an initial post-deployment QA of the changes to the SearchSatisfaction Schema. See details below and let me know if you have any questions or if there are any additional breakdowns that would help clarify. See notebook for additional details re calculation.

IDENTIFIED BUGS

  • There are a significant number of inputLocation = 'header-moved' events recorded with a skinVersion = 'legacy' on all wikis but we should only see new search location events for the latest skin version.
  • There are also some 'header-moved' and 'header-navigation' events being recorded without an associated skinVersion or skin. See details below:

Total Search Location Event and Sessions by Vector Version and Skin Type on All Wikis

search_location    vector_version  skin_type  num_events  num_sessions
header-moved       latest          vector     22098061    2739324
header-moved       legacy          vector     223752430   31387422
header-moved       NULL            NULL       2956        516
header-navigation  latest          vector     3914585     561534
header-navigation  legacy          vector     148296694   21342907
header-navigation  NULL            NULL       2503        470
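
A rough sketch of the kind of breakdown shown in the table above: count events and distinct search sessions per (search location, vector version, skin type), then list the groups this comment flags as unexpected. The helper names `summarize` and `suspicious` and the toy events are made up for illustration; the real numbers come from the notebook, not from this code.

```python
from collections import defaultdict


def summarize(events):
    """Map (inputLocation, skinVersion, skin) to (num_events, num_sessions)."""
    counts = defaultdict(int)
    sessions = defaultdict(set)
    for e in events:
        key = (e.get("inputLocation"), e.get("skinVersion"), e.get("skin"))
        counts[key] += 1
        # Sessions are counted as distinct searchSessionId values per group.
        sessions[key].add(e.get("searchSessionId"))
    return {k: (counts[k], len(sessions[k])) for k in counts}


def suspicious(summary):
    """Groups flagged in this thread: new location on legacy skin, or NULL skin data."""
    flagged = [
        k for k in summary
        if (k[0] == "header-moved" and k[1] == "legacy") or k[1] is None or k[2] is None
    ]
    return sorted(flagged, key=str)


# Toy events, invented purely to exercise the functions.
toy_events = [
    {"inputLocation": "header-moved", "skinVersion": "legacy", "skin": "vector", "searchSessionId": "a"},
    {"inputLocation": "header-moved", "skinVersion": "legacy", "skin": "vector", "searchSessionId": "a"},
    {"inputLocation": "header-navigation", "skinVersion": "latest", "skin": "vector", "searchSessionId": "b"},
    {"inputLocation": "header-moved", "skinVersion": None, "skin": None, "searchSessionId": "c"},
]
toy_summary = summarize(toy_events)
for group in suspicious(toy_summary):
    print(group, toy_summary[group])
```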

Daily Legacy Skin Search Sessions by Search Location Across all Wikis

Daily Latest Skin Search Sessions by Search Location Across all Wikis

We start recording 'header-moved' (new search location) events both on legacy and latest skins on the date of deployment (28 September). 'Header-navigation' (old search) location events decrease around the same time. This trend was expected on the latest skin but not on legacy where the new search location was not deployed. Interestingly, on October 14th, the number of header-moved events recorded on the legacy skin decreased (along with a corresponding increase in header-navigation). This did not occur on the latest skin.

Daily NULL Skin Search Sessions by Search Location Across all Wikis
Search events and sessions recorded as having a NULL vector skin version have been decreasing since deployment. Based on the trend lines shown in the chart below, it looks like these NULL values might be related to caching issues as the new instrumentation was deployed. We're currently recording fewer than 11 NULL sessions per day and they seem to be decreasing still, so I don't think this is an issue.

PASSED CHECKS

  • Confirmed that the two new search-header inputLocation types ('header-navigation' and 'header-moved') are only recorded for the vector skin type.
  • The 'header-moved' and 'header-navigation' events are only recorded with action='searchResultPage' and source='autocomplete' events as expected.
  • There are 4 to 5 times the number of 'header-moved' events compared to 'header-navigation' events on the test wikis. This difference seems expected as the new header was also shown by default to 50% of all logged-in users on test wikis in addition to all logged-out users. Once the isAnon field is added, I will review the number of logged-out and logged-in user events to confirm the numbers of sessions and events per group seem accurate.

> IDENTIFIED BUGS
>
>   • There are a significant number of inputLocation = 'header-moved' events recorded with a skinVersion = 'legacy' on all wikis but we should only see new search location events for the latest skin version.
>
> ...
>
> We start recording 'header-moved' (new search location) events both on legacy and latest skins on the date of deployment (28 September). 'Header-navigation' (old search) location events decrease around the same time. This trend was expected on the latest skin but not on legacy where the new search location was not deployed. Interestingly, on October 14th, the number of header-moved events recorded on the legacy skin decreased (along with a corresponding increase in header-navigation). This did not occur on the latest skin.

Unfortunately, this is understandable.

@Edtadros spotted this bug during his QA of T256100: Add skin version and search version fields to search satisfaction schema (see https://phabricator.wikimedia.org/T256100#6499464 onwards). The bug only affects the instrumentation, i.e. while inputLocation=header-moved when skinVersion=legacy, the search widget hasn't been moved. The bug was fixed soon after it was spotted but not before the initial instrumentation changes had been deployed. At the time, we thought it acceptable not to deploy the fix manually and simply let it ride the train. Last week, however, the train didn't leave the station until late on Wednesday, 13th October (see https://lists.wikimedia.org/pipermail/wikitech-l/2020-October/093958.html), which seems to match what you're seeing.

If you constrain your analysis to events logged on or after Thursday, 14th October, does the bug disappear?

MNeisler added a comment. Edited Thu, Oct 22, 6:06 PM

> @Edtadros spotted this bug during his QA of T256100: Add skin version and search version fields to search satisfaction schema (see https://phabricator.wikimedia.org/T256100#6499464 onwards). The bug only affects the instrumentation, i.e. while inputLocation=header-moved when skinVersion=legacy, the search widget hasn't been moved. The bug was fixed soon after it was spotted but not before the initial instrumentation changes had been deployed. At the time, we thought it acceptable not to deploy the fix manually and simply let it ride the train. Last week, however, the train didn't leave the station until late on Wednesday, 13th October (see https://lists.wikimedia.org/pipermail/wikitech-l/2020-October/093958.html), which seems to match what you're seeing.

Thanks for clarifying @phuedx. There was a sharp decrease in header-moved events on legacy on October 14th which fits with this timeline.

> If you constrain your analysis to events logged on or after Thursday, 14th October, does the bug disappear?

I reran the analysis looking at only events logged on or after Thursday, 14th October. There are still header-moved events on legacy being recorded but it has declined significantly (by about 97% since the bug fix on October 13th). Events are still declining daily so it looks like the bug fix has addressed this issue. See current events and sessions below:

date        vector_version  search_location  num_events  num_sessions
2020-10-13  legacy          header-moved     16117340    2281637
2020-10-14  legacy          header-moved     5396029     901143
2020-10-15  legacy          header-moved     3424298     596049
2020-10-16  legacy          header-moved     1799972     327441
2020-10-17  legacy          header-moved     984701      182820
2020-10-18  legacy          header-moved     811087      154778
2020-10-19  legacy          header-moved     829026      161735
2020-10-20  legacy          header-moved     643613      126212
2020-10-21  legacy          header-moved     418973      82357
2020-10-22  legacy          header-moved     153588      31965
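
As a worked check of the "about 97%" figure, here is a small sketch of the percent-decline arithmetic. It assumes the 2020-10-13 and 2020-10-21 rows read as 16,117,340 and 418,973 events (the digits run together in the table as captured here, so the split is an assumption), and it uses 2020-10-21 on the guess that 2020-10-22 was still a partial day when this comment was written. `pct_decline` is a hypothetical helper name.

```python
def pct_decline(before, after):
    """Percentage drop from `before` to `after`."""
    return 100.0 * (before - after) / before


# Assumed reading of the legacy 'header-moved' event counts in the table above.
events_oct13 = 16_117_340
events_oct21 = 418_973

print(round(pct_decline(events_oct13, events_oct21), 1))  # prints 97.4
```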
MNeisler closed this task as Resolved. Mon, Oct 26, 6:48 PM

I completed the post-deployment QA checks and confirmed that data appears as expected.

See details in the notebook and a few observations below. I've also updated the QA doc with the scenarios I checked. @ovasileva - Per our discussions, signing this off but feel free to reach out if you have any questions.

  • As mentioned in T262300#6572417, the number of sessions recorded for each search location type and vector appears as expected following the Oct 13 bug fix. The number of 'header-moved' search events on legacy continues to decline.
  • I reviewed the number of search sessions and events by logged in status following the addition of the isAnon field on Oct 20th. Here is a breakdown by vector version and search location for users across all wikis:
    vector_version  search_location    logged_in_status  num_events   num_sessions
    latest          header-moved       logged-in         90,586       16,836
    latest          header-moved       logged-out        7,376,053    925,453
    latest          header-navigation  logged-in         69,639       13,154
    latest          header-navigation  logged-out        136          22
    legacy          header-moved       logged-in         33           2
    legacy          header-moved       logged-out        1,048,688    208,045
    legacy          header-navigation  logged-in         2,697,490    505,192
    legacy          header-navigation  logged-out        104,985,732  14,152,130
  • The differences between search location sessions for logged out vs logged in users appear as expected.
    • Most logged-out sessions on the latest vector skin are 'header-moved' events, as the new location is available by default for anonymous users on our early adopter wikis.
    • The numbers of 'header-moved' and 'header-navigation' events for logged-in users on the latest vector skin are closer to each other, as the new location was deployed to 50% of users on the test wikis.
    • There are 13% more search events on the legacy skin compared to the latest skin as the latest skin is available as a user preference on all wikis except for partner wikis.
    • There are still a few header-moved events on legacy recorded for both logged in and logged out users. These are likely related to caching/old, long-running sessions. There are very few of these for logged in users compared to logged out and the number is declining daily.
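
To make the "closer in number" observation for logged-in users on the latest skin concrete, here is a small sketch that computes the 'header-moved' share of events and sessions from the logged-in rows of the breakdown above (90,586 vs 69,639 events; 16,836 vs 13,154 sessions). `share` is a hypothetical helper, and this is only an illustration of the arithmetic, not the notebook's method.

```python
def share(part, other):
    """Fraction of the combined total that `part` represents."""
    return part / (part + other)


# Logged-in users on the latest vector skin, from the breakdown above.
moved_events, nav_events = 90_586, 69_639
moved_sessions, nav_sessions = 16_836, 13_154

print(f"{share(moved_events, nav_events):.1%}")      # prints 56.5%
print(f"{share(moved_sessions, nav_sessions):.1%}")  # prints 56.1%
```

Both shares sit a few points above one half, which is in the neighborhood of the 50% split described for logged-in users.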