
Analytics about usage of search - Updated data for dashboard
Closed, Resolved · Public

Description

Analytics questions in regards to search usage

01) Do we have any insights on where users are searching from in the app? There are three possible origins:
  1. Explore search bar below Wikipedia
  2. Explore search icon at top right
  3. Article search icon at top right

This is the only insight I have in regards to search usage so far: F31951319

It says that ~28% of all page views come from internal search, but not which entry point exactly. Dmitry pulled this data back in April 2019.

👉 This data will be used to revise the information architecture of the app → where do we show search?

02) What percentage of search queries end up with zero results?
03) Do empty results stand in direct correlation with having set multiple languages in the app?

Hypothesis: People are looking for something with the wrong language filter enabled. The more languages set, the more likely it is to get no results.

A simple split for this data would be great. E.g. 1 app language set → 2% no results, 2 app languages set → 3% no results, 3 app languages set → 3% no results.
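If the split described above were computed from event data, it might look like the following sketch. The record shape and the numbers are hypothetical, purely for illustration; the real data would come from the search event logs.

```python
from collections import defaultdict

# Hypothetical search-event records: (number of app languages set, whether
# the query returned zero results). Shape and numbers are made up for
# illustration, not taken from the actual schema.
events = [
    (1, False), (1, False), (1, True),
    (2, False), (2, True),
    (3, False), (3, True),
]

totals = defaultdict(int)
zeros = defaultdict(int)
for n_langs, zero_results in events:
    totals[n_langs] += 1
    if zero_results:
        zeros[n_langs] += 1

# Zero-result rate split by number of app languages set.
rates = {n: zeros[n] / totals[n] for n in sorted(totals)}
for n_langs, rate in rates.items():
    print(f"{n_langs} app language(s) set -> {rate:.0%} zero-result rate")
```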

👉 Data from 02) and 03) will be used to revise the design for empty state.

04) How are links in the toolbar used on the article page → Which one is used the most? Which one the least? A percentage split of their usage would be appreciated.

article-search-01-before.png (1×720 px, 1 MB)


Adding @SNowick_WMF’s email reply here for the sake of completion

We have a Schema that tracks users search within the app: https://meta.wikimedia.org/wiki/Schema:MobileWikiAppSearch that I can get data from. That has a source field: "The source from which the Search interface was invoked: 0 - Main article toolbar, 1 - Widget, 2 - Share intent, 3 - Process-text intent, 4 - Floating search bar in the feed, 5 - Voice search query."

Event Timeline

Hi @Dbrant, there are values in Event Logging for invoke_source that are not listed in the MobileWikiAppSearch schema. Definitions for 0-6 are listed; the next value starts at 18, and the values that follow all show up in queries. Do these have any meaning, and if so, can we get definitions so I can add them to the schema? Thanks. (It looks like 19 and 23 are most frequent.)

18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31

SNowick_WMF renamed this task from Insights about usage of search to Analytics about usage of search. (Aug 8 2020, 12:29 AM)
Charlotte lowered the priority of this task from High to Medium. (Aug 12 2020, 5:52 PM)
Charlotte lowered the priority of this task from Medium to Low. (Aug 12 2020, 5:58 PM)
Charlotte lowered the priority of this task from Low to Lowest. (Aug 27 2020, 6:44 PM)

Hi @Dbrant, there are values in Event Logging for invoke_source that are not listed in the MobileWikiAppSearch schema. Definitions for 0-6 are listed; the next value starts at 18, and the values that follow all show up in queries. Do these have any meaning, and if so, can we get definitions so I can add them to the schema? Thanks. (It looks like 19 and 23 are most frequent.)

@SNowick_WMF sorry for the delay on this! The invoke_source parameter used in the Search schema corresponds to this enumeration in our code (in numerical order).

This means that 19 refers to the TOOLBAR, i.e. the "search" button in the app's toolbar at the top, and 23 refers to the FEED_BAR, which is the "search" box at the top of the Explore feed.
We can also see that after our last release on Oct 1, searches from the toolbar (19) have been almost entirely replaced by searches from the bottom bar (31), also known as the page action bar. This is of course expected because that's precisely the design change that we made to the search interface.
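Pulling together the codes mentioned in this thread, a partial mapping might look like the sketch below. Values 0-5 come from the schema description quoted earlier; 19, 23, and 31 come from the explanation above. The other codes (18, 20-22, 24-30) are left undocumented here, and the descriptive strings are paraphrases, not the actual enum constants in the app's code.

```python
# Partial invoke_source mapping assembled from this thread. Values 0-5 are
# from the MobileWikiAppSearch schema description; 19, 23, and 31 are from
# the comment above. The label strings are paraphrases for illustration.
INVOKE_SOURCE = {
    0: "Main article toolbar",
    1: "Widget",
    2: "Share intent",
    3: "Process-text intent",
    4: "Floating search bar in the feed",
    5: "Voice search query",
    19: "TOOLBAR (search button in the top toolbar)",
    23: "FEED_BAR (search box at the top of the Explore feed)",
    31: "Bottom bar (page action bar)",
}

def describe(code: int) -> str:
    """Human-readable label for an invoke_source code, if known."""
    return INVOKE_SOURCE.get(code, f"undocumented ({code})")

print(describe(23))
print(describe(20))
```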

image.png (397×848 px, 43 KB)

@JKatzWMF I believe you were curious about this, too. ^

Re: @Dbrant’s comment in T265766

Since our last release on Oct 1, usage of the Search feature has been declining, instead of increasing:

image.png (383×1 px, 44 KB)

image.png (342×958 px, 106 KB)

This implies that Search might no longer be as discoverable as it was previously. We should take our learnings from this, and decide what to do next asap.

I recommend taking a step back, giving it more time, and not rushing a decision, to make sure we're not confusing users. Let's be honest with ourselves: an impact on the stats was to be expected. It's a significant change, and people need to get used to it. Let's closely monitor developments and prepare for potential scenarios. I see this as a great opportunity to learn and take action.

Quick recap: What did we improve?
  • We improved consistency of the search by moving it to the bottom navigation and article toolbar. We changed iconography for Lists and History to further improve clarity.
  • We made searching more ergonomic by moving it from the top to the bottom of the app. The most used feature in the app is now also one of the physically easiest to reach.
  • We made the search more intelligent by including personalized results from My lists, History, Open tabs and other languages that are set in the app.
Suggested next steps
  • According to @SNowick_WMF, the trends are not alarming. We should look more deeply and broadly at stats to get a holistic view and draw the right conclusions (T266056)
    • @SNowick_WMF will perform a trend analysis of the decline in search start events and of search by source, to isolate the issue further and come up with recommendations
    • @SNowick_WMF will look at iOS stats to compare the data - how did their numbers develop during this period?
  • In general, we should give it more time, monitor it closely, review data for November and draw conclusions/take decisions. (end of November)
  • In the meantime (November), we should prepare different scenarios
    • Line up user onboarding → show users where the new search is positioned, e.g. with an educational tooltip for the bottom navigation and article page.
    • Usability test the new search, identify issues and solve them.
First conclusions from looking at the data
  • Fewer people have been using search since the week of August 31. The decline started before the release of the new search in the week of September 28 (see Android Search Actions Weekly).
  • Comparing before/after release, the proportion of search queries via floating bar on Explore and search button(s) stayed the same. (see Start Search by Source) → this might indicate that the change to the bottom search did not have a negative impact
  • Since the release of the new version of the app, the rating in the app store has been better than in previous releases (+2.7%, see Google Play Console)
  • Article page views via Android app have increased after the release of the new version in the week of September 28 (see Turnilo)

Resolving this ticket - see https://phabricator.wikimedia.org/T266056 for trend/new change data analysis.

Reopening, per a face-to-face with @SNowick_WMF. Thanks, Shay, for taking the time today to discuss!

Here are the figures we're hoping to visualize from any refreshed dashboard, with an aim of having something similar to the web-based view at https://superset.wikimedia.org/superset/dashboard/search , but with an understanding that the mechanics are different.

1a. Search-based pageview unique users as portion of all pageview unique users, daily, 30-day timeseries
1b. Search-based pageviews as portion of all pageviews, daily, 30-day timeseries

And if the data are available:

2a. Search abandonment rate, daily, 30-day timeseries. Denominator is search sessions, not unique users. A search session here would be defined as a user tapping on the search button and entering one or more characters, but then never getting to the point of tapping on a result from the search results within that search session.

3a. Read More-based pageview unique users as portion of all pageview unique users, daily, 30-day timeseries (this is from the context of a user reading an article and making it to the bottom of the article to use Read More; if they visit one of the Read More results that is a Read More-based pageview)
3b. Read More clickthrough rate, daily, 30-day timeseries. In other words, Read More-based pageviews divided by Read More panel impressions.
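The session-level abandonment definition in 2a can be sketched as follows. The event names here are illustrative, not the real schema fields.

```python
# Sketch of the rule in 2a: a session counts as abandoned if the user
# tapped search and entered at least one character, but never tapped a
# result within that session. Event names are illustrative.
def is_abandoned(session_events):
    typed = "text_entered" in session_events
    clicked = "result_click" in session_events
    return typed and not clicked

sessions = [
    ["search_start", "text_entered", "result_click"],  # completed
    ["search_start", "text_entered"],                  # abandoned
    ["search_start"],                                  # never typed: excluded
]

# Denominator is qualifying search sessions, not unique users.
qualifying = [s for s in sessions if "text_entered" in s]
rate = sum(is_abandoned(s) for s in qualifying) / len(qualifying)
print(f"abandonment rate: {rate:.0%}")
```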

The Search Platform team is planning to regularly review the dashboard for the web-oriented behavior, and would like to be able to do something similar for app-oriented behavior. We understand that, right now, the more extensive instrumentation is probably Android-focused.

CC @JTannerWMF for visibility.

SNowick_WMF renamed this task from Analytics about usage of search to Analytics about usage of search - Updated data for dashboard. (May 24 2024, 7:55 PM)

@dr0ptp4kt I have added the requested Daily chart to Android Search Dashboard for the following:

1a. Search-based pageview unique users as portion of all pageview unique users, daily,
1b. Search-based pageviews as portion of all pageviews, daily, 30-day timeseries

I added the [30-day timeseries] tracking similar to the table on your dashboard as well.

For the rest of the requested data, I will look into where/if we are tracking those events to surface data, and will add it here when that's ready. If this requires more instrumentation, we can discuss with engineers.

Hi @dr0ptp4kt - I am still working on the abandonment rate queries. What I can measure is not exactly as your metric describes, because we don't have a 'user enters some text' event (which would kick off autocomplete/suggest) - I can measure search start events and search result clicks by unique session_ids.

Would it be possible to get editor access to your dashboards so I can see the underlying queries for the SERP abandonment as it's being charted? From what I can see, it's querying a derived dataset that may already have the values you are showing, but I would be interested to see whether I am matching how you are measuring.

For the metric below - we currently don't track Read More clicks as anything more than an internal link click. We would have to engineer tracking to attribute those clicks to Read More as a source. I spoke with Dmitry about it briefly, and it sounds like something that isn't a quick fix to plug in, so if we want to measure it, it will need to get prioritized in the engineering queue.

3a. Read More-based pageview unique users as portion of all pageview unique users, daily, 30-day timeseries (this is from the context of a user reading an article and making it to the bottom of the article to use Read More; if they visit one of the Read More results that is a Read More-based pageview)
3b. Read More clickthrough rate, daily, 30-day timeseries. In other words, Read More-based pageviews divided by Read More panel impressions.

Thanks @SNowick_WMF, just acknowledging receipt! Catching up on things; will circle back.

The derived dataset the fulltext abandonment comes from is discovery.search_satisfaction_metrics. This should be readable by anyone in the privatedata group. The related code definition for fulltext abandonment is found in search_satisfaction_metrics.py. Essentially the recorded events indicate on a per-session basis how many result pages they saw, and how many pages they visited. If they saw a search result page and visited no pages we considered that abandonment.
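The per-session rule described above (saw at least one result page, visited none) can be sketched as follows. The tuple shape is illustrative, not the actual columns of discovery.search_satisfaction_metrics.

```python
# Sketch of the per-session rule above: a session that saw at least one
# search result page but visited no pages counts as fulltext abandonment.
# The (serp_views, pages_visited) tuples are illustrative, not the real
# columns of discovery.search_satisfaction_metrics.
def fulltext_abandoned(serp_views: int, pages_visited: int) -> bool:
    return serp_views > 0 and pages_visited == 0

sessions = [(3, 1), (1, 0), (2, 0), (1, 2)]

# Denominator: sessions that saw at least one result page.
with_serp = [s for s in sessions if s[0] > 0]
rate = sum(fulltext_abandoned(*s) for s in with_serp) / len(with_serp)
print(f"fulltext abandonment: {rate:.0%}")
```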

There is also an autocomplete abandonment, calculated in the same script, but we don't have it on our reporting dashboard, so I'm suspecting only the fulltext results are relevant here.

Hi @dr0ptp4kt - I am still working on the abandonment rate queries. What I can measure is not exactly as your metric describes, because we don't have a 'user enters some text' event (which would kick off autocomplete/suggest) - I can measure search start events and search result clicks by unique session_ids.

That would be good! A time series showing the daily average of 1 minus (clicks divided by start events) would be useful. Is that possible without too much extra effort?
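The proxy metric suggested here is straightforward to compute; a minimal sketch with made-up counts:

```python
# Sketch of the proxy metric suggested above: daily abandonment as
# 1 - (result clicks / search start events). Counts are made up.
daily_counts = {
    "2024-05-01": {"starts": 1000, "clicks": 900},
    "2024-05-02": {"starts": 800, "clicks": 700},
}

abandon = {
    day: 1 - c["clicks"] / c["starts"] for day, c in daily_counts.items()
}
for day in sorted(abandon):
    print(f"{day}: {abandon[day]:.1%}")
```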

Would it be possible to get editor access to your dashboards so I can see the underlying queries for the SERP abandonment as it's being charted? From what I can see, it's querying a derived dataset that may already have the values you are showing, but I would be interested to see whether I am matching how you are measuring.

Did the pointer from Erik provide the necessary detail? If you're looking for the original merge request introducing the pipeline, it's at https://gitlab.wikimedia.org/repos/search-platform/discolytics/-/merge_requests/33 . If you'd like to step through the relevant parts of the code, please feel free to schedule a meeting for a mutually available time (next week Tuesday / Thursday afternoons look okay).

I think sometimes Superset permissions may prevent display of certain things, so copy-pasting the actual query for the chart just in case here with the date range pre-filled:

SELECT date_trunc('day', CAST(CONCAT(CAST(year AS VARCHAR), '-', LPAD(CAST(month AS VARCHAR), 2, '0'), '-', LPAD(CAST(day AS VARCHAR), 2, '0')) AS TIMESTAMP)) AS "date",
       SUM(num_sessions_w_fulltext_abandon) / CAST(SUM(num_sessions_w_fulltext_serp) AS REAL) AS "Fulltext Abandonment"
FROM "discovery"."search_satisfaction_metrics"
WHERE CONCAT(CAST(year AS VARCHAR), '-', LPAD(CAST(month AS VARCHAR), 2, '0'), '-', LPAD(CAST(day AS VARCHAR), 2, '0')) >= '2024-05-13 00:00:00.000000'
  AND CONCAT(CAST(year AS VARCHAR), '-', LPAD(CAST(month AS VARCHAR), 2, '0'), '-', LPAD(CAST(day AS VARCHAR), 2, '0')) < '2024-06-13 00:00:00.000000'
  AND "access_method" IN ('desktop')
GROUP BY date_trunc('day', CAST(CONCAT(CAST(year AS VARCHAR), '-', LPAD(CAST(month AS VARCHAR), 2, '0'), '-', LPAD(CAST(day AS VARCHAR), 2, '0')) AS TIMESTAMP))
ORDER BY "Fulltext Abandonment" DESC
LIMIT 10000;

For the metric below - we currently don't track Read More clicks as anything more than an internal link click. We would have to engineer tracking to attribute those clicks to Read More as a source. I spoke with Dmitry about it briefly, and it sounds like something that isn't a quick fix to plug in, so if we want to measure it, it will need to get prioritized in the engineering queue.

3a. Read More-based pageview unique users as portion of all pageview unique users, daily, 30-day timeseries (this is from the context of a user reading an article and making it to the bottom of the article to use Read More; if they visit one of the Read More results that is a Read More-based pageview)
3b. Read More clickthrough rate, daily, 30-day timeseries. In other words, Read More-based pageviews divided by Read More panel impressions.

Let's skip extra instrumenting for now. Thank you for looking into this!

Hi @dr0ptp4kt, thanks for following up with all this extra info!

The dashboard now has the 1 minus (search clickthrough / search start) rates as abandon rates by session, and I got some surprisingly low abandon rates (11.4% daily average for 2024-05). I added some additional info on the dashboard, particularly that Android surfaces results from a user's open Tabs and Reading Lists, so the search functionality is different from desktop's.

I will make derived tables for this data to use for historical charts, since I am running into timeout errors when a query includes more than one month of daily events.

Let's set up a meeting to review these results next week.

Marking this as resolved, please re-open ticket with request if more data is needed.