SearchSatisfaction schema has dead code for logging WVUI in extraParams event - should it be fixed or removed?
Closed, Resolved | Public

Description

The SearchSatisfaction schema previously logged "WVUI" in the extraParams field if the Vector search was built with WVUI. Presumably this was to distinguish it from the default jQuery search box so that results for the two could be compared.

This behaviour was broken with the migration from WVUI to Codex. Now, the extraParams field has no distinction for Codex.

This is presumably okay, as it should be impossible to get anything but the Codex search input in the Vector 2022 skin, and legacy Vector will always use the non-Codex search.

We should therefore remove this code.

Event Timeline

Jdlrobson renamed this task from Is WikimediaEvents extension still used? to SearchSatisfaction schema has dead code for logging WVUI in extraParams event - should it be fixed or removed?. Sep 16 2022, 6:36 PM
Jdlrobson updated the task description. (Show Details)
Jdlrobson added subscribers: jwang, ovasileva, MNeisler.

FYI @jwang @ovasileva @MNeisler: the extraArgs field in the SearchSatisfaction schema was previously used to distinguish the old search implementation from the new one. I am pretty sure this is now redundant and can be removed, but I am pinging you as a courtesy.

This seems fine.

Change 831148 had a related patch set uploaded (by Phuedx; author: VolkerE):

[mediawiki/extensions/WikimediaEvents@master] SearchSatisfaction: Remove extraArgs logging for WVUI

https://gerrit.wikimedia.org/r/831148

Change 831148 merged by jenkins-bot:

[mediawiki/extensions/WikimediaEvents@master] SearchSatisfaction: Remove extraArgs logging for WVUI

https://gerrit.wikimedia.org/r/831148

@Jdlrobson, thanks for pinging me.

I do not see the extraArgs field of the SearchSatisfaction schema being used anywhere, so removing it seems fine.

In addition, here is the summary data for September 2022 from the SearchSatisfaction schema. I don't know whether the schema is supposed to contain only old-search-location (event.inputLocation = 'header-navigation') events. Please confirm, and let me know if anything else looks odd to you.

num_sessions  inputlocation      search_location      logged_in_status  wiki
1             header-navigation  old-search-location  logged-out        itwiki
2             header-navigation  old-search-location  logged-in         frwikiversity
7             header-navigation  old-search-location  logged-out        enwiki
2             header-navigation  old-search-location  logged-out        frwikiversity
1             header-navigation  old-search-location  logged-out        cswiki
3             header-navigation  old-search-location  logged-out        dewiki
1             header-navigation  old-search-location  logged-out        svwiki
1             header-navigation  old-search-location  logged-out        plwiki
6             header-navigation  old-search-location  logged-in         eswiki
1             header-navigation  old-search-location  logged-out        dawiki
2             header-navigation  old-search-location  logged-out        eswiki

Query

SELECT COUNT(DISTINCT event.searchSessionId) AS num_sessions,
   event.inputLocation,
   IF(event.inputLocation = 'header-moved', 'new-search-location', 'old-search-location') AS search_location,
   IF(event.isAnon = true, 'logged-out', 'logged-in') AS logged_in_status,
   wiki AS wiki
FROM event.searchSatisfaction
WHERE year = 2022 AND month = 9
   AND event.action = 'searchResultPage'
   AND event.source = 'autocomplete'
   AND event.inputLocation IN ('header-moved', 'header-navigation')
   AND event.skinVersion = 'latest'
   AND event.skin = 'vector'
   AND useragent.is_bot = false
GROUP BY
   event.inputLocation,
   event.isAnon,
   wiki

jwang triaged this task as High priority. Oct 4 2022, 12:24 AM
jwang moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.

The query in T317715#8281408 might be outdated (it worked in 2021-08). Should we use vector-2022 as the event.skin value now?

Replace

event.skinVersion = 'latest'
   AND event.skin = 'vector'

with

event.skinVersion = 'latest'
 AND event.skin = 'vector-2022'

Below are the session counts after the update. Do the numbers make sense now?

num_sessions  inputlocation  search_location      logged_in_status
393881        header-moved   new-search-location  logged-in
7954428       header-moved   new-search-location  logged-out

Query to get the data

SELECT COUNT(DISTINCT event.searchSessionId) AS num_sessions,
   event.inputLocation,
   IF(event.inputLocation = 'header-moved', 'new-search-location', 'old-search-location') AS search_location,
   IF(event.isAnon = true, 'logged-out', 'logged-in') AS logged_in_status
FROM event.searchSatisfaction
WHERE year = 2022 AND month = 9
   AND event.action = 'searchResultPage'
   AND event.source = 'autocomplete'
   AND event.inputLocation IN ('header-moved', 'header-navigation')
   AND event.skinVersion = 'latest'
   AND event.skin = 'vector-2022'
   AND useragent.is_bot = false
GROUP BY
   event.inputLocation,
   event.isAnon

Yeah, that looks much better. Presumably the results you were seeing in T317715#8281408 relate to cached pages or proxy sites. Thanks for checking again.
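
For anyone revisiting this later: rather than hard-coding a skin name in the filter, it may help to first check which event.skin and event.skinVersion values actually occur in the period. The query below is an untested sketch along those lines. It assumes the same event.searchSatisfaction table, partition columns and fields as the queries above, and only drops the skin and inputLocation filters so that every combination is counted.

```sql
-- Diagnostic sketch (untested): count autocomplete search sessions per
-- skin / skinVersion combination for September 2022, to see which values
-- are present before filtering on a specific skin name.
-- Assumes the same table and fields as the queries earlier in this task.
SELECT
   event.skin,
   event.skinVersion,
   COUNT(DISTINCT event.searchSessionId) AS num_sessions
FROM event.searchSatisfaction
WHERE year = 2022 AND month = 9
   AND event.action = 'searchResultPage'
   AND event.source = 'autocomplete'
   AND useragent.is_bot = false
GROUP BY
   event.skin,
   event.skinVersion
ORDER BY num_sessions DESC
```

If a skin value turns up with only a handful of sessions, as 'vector' with skinVersion 'latest' did above, that suggests the filter value is stale rather than that search traffic has disappeared.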