
Create Superset dashboard for search metrics
Open, Needs Triage, Public

Description

To simplify access to Search metrics and ease their exploration, we need minimal dashboards in Superset. Those dashboards will expose similar information to what is available via the current Jupyter notebooks.

Details

Other Assignee
dr0ptp4kt
Title: search: Chgrp search metrics to analytics-privatedata-users
Reference: repos/data-engineering/airflow-dags!703
Author: ebernhardson
Source Branch: work/ebernhardson/metrics-permissions
Dest Branch: main

Event Timeline

dr0ptp4kt updated Other Assignee, added: dr0ptp4kt.
dr0ptp4kt subscribed.

I'm to have "3 important metrics" set for Erik's dashboarding.

@EBernhardson,

Here is what I propose for dashboarding. I've put this into "3 important metrics areas". In addition to expressing the ratios, provide the raw counts used for those ratios.

If this is too much and it's more realistic to focus on a smaller set of very specific items instead, please see the final section where I indicate the suggested shortlist.

Note that while all of the items listed in T358345#9710964 should be ascertainable from forthcoming Discolytics pipelines ( https://gitlab.wikimedia.org/repos/search-platform/discolytics/-/merge_requests/33 ) and could thus be available from dashboards, not all of them are requested for dashboarding. Rather, the items here zoom in on the most relevant information, and on the cases where search-targeted interventions could improve the user experience. We can always add more dashboarded material later if needed. Note in particular:

  • Part 1 measures for both mobile web and desktop
  • Part 2 measures for just desktop, where search is much more prominent and instrumentation is more fully understood
  • Part 3 measures for just mobile

Part 1: By access method, AUTOCOMPLETE. 30 day time series for the following.

num_actors_w_autocomplete_pv / num_actors_w_pageviews

Objective: we want to know the degree to which autocomplete is affecting the experience of users. We know on mobile web that typing is harder, and so any improvements in autocomplete can make a big difference to quality of life, and that should show up in the mobile web ratio. We know on desktop web that users are much more likely to type a title or redirect title sufficiently to have meaningful Go results, however, better autocomplete can still make a decent difference in the quality of life, and that should show up in the desktop web ratio.

Note 1: Autocomplete does get incorporated to some extent in Part 2 already (which is focused on desktop), but here we're trying to focus on the first point of contact for most searchers. Heuristic / algorithm / orchestration / backend search strategy improvements can influence things both in fulltext SERPs (Part 2) and, when feasible and used, in autocomplete.

Note 2: We know that an intervention to draw attention to the search bar could cause a fluctuation in these ratios. Originally I had thought to use the equivalent of num_actors_w_autocomplete_pv / num_actors_w_pageviews, the thinking being that it would be more straightforward to tell if the utility of autocomplete had gone up under normal circumstances, and that is what we've gone with. As the dashboard solidified, I removed the request for num_autocomplete_pv / num_actors_w_autocomplete_pv; this can be added to a dashboard later if we'd like to see how autocomplete utility fluctuates (though watch out for other macro factors).
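The Part 1 ratio above can be sketched as follows. This is a toy computation over per-actor flags, not the real pipeline; the field names (had_autocomplete_pv, had_pageview) are assumptions, since the actual Discolytics schema isn't shown in this task.

```python
# Hypothetical per-actor daily flags; real data comes from the
# Discolytics pipelines and will have a different schema.
def autocomplete_ratio(actors):
    """num_actors_w_autocomplete_pv / num_actors_w_pageviews over one day."""
    with_pv = sum(1 for a in actors if a["had_pageview"])
    with_ac = sum(1 for a in actors if a["had_pageview"] and a["had_autocomplete_pv"])
    return with_ac / with_pv if with_pv else 0.0

sample = [
    {"had_autocomplete_pv": True, "had_pageview": True},
    {"had_autocomplete_pv": False, "had_pageview": True},
    {"had_autocomplete_pv": False, "had_pageview": False},
]
print(autocomplete_ratio(sample))  # 0.5
```

Computing this per day and per access method (mobile web vs. desktop) yields the two 30 day time series requested above.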

Part 2: Desktop web, SEARCH-based traffic. A 30 day time series for the following:

Desktop web: search_pv / num_daily_pageviews

Desktop web: search_pv_actors / num_daily_actors_w_internal_pv

Desktop web: fulltext_abandon (num_sessions_w_fulltext_abandon / num_sessions_w_fulltext_serp)

Objective: we want to know if there are fluctuations in search on desktop. Search is an easier and more natural experience for a range of information seeking on desktop, and there are some longstanding differences in treatment and measurement, as well as instrumentation drift, that can confound interpretability of mobile web data.

If you think it's trivial to add search_pv / num_daily_pageviews and search_pv_actors / num_daily_actors_w_internal_pv for mobile web, feel free to do so, but it's understood if not; fulltext abandonment is too tricky on mobile web right now.

Note 1: we know there's less search on mobile web, and that external search engines are very much the natural way to seek information there, but that is beside the point here. Keep in mind that Part 1 and Part 3 are likely the main optimization targets for improving productive mobile web searches.

Note 2: the treatment on Special:Search can result in title hovers that meet the user's need and could reasonably be construed as non-abandonment, but we choose here to ignore those impressions and treat abandonment as a composite measure for relevant (and sufficiently complete) SERP material.
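The fulltext abandonment ratio defined in Part 2 can be sketched like this. The session structure is hypothetical (the task doesn't specify how the satisfaction data represents sessions); per Note 2, a session counts as abandoned if it saw a fulltext SERP and produced no click, ignoring title hovers.

```python
# Toy sessions; the real satisfaction metrics have their own schema.
def fulltext_abandon_rate(sessions):
    """num_sessions_w_fulltext_abandon / num_sessions_w_fulltext_serp."""
    serp = [s for s in sessions if s["saw_fulltext_serp"]]
    abandoned = [s for s in serp if not s["clicked_result"]]
    return len(abandoned) / len(serp) if serp else 0.0

sessions = [
    {"saw_fulltext_serp": True, "clicked_result": False},
    {"saw_fulltext_serp": True, "clicked_result": True},
    {"saw_fulltext_serp": False, "clicked_result": False},
]
print(fulltext_abandon_rate(sessions))  # 0.5
```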

Part 3: Mobile web, RELATED ARTICLES. A 30 day time series for the following:

related_articles_pv / related_articles

related_articles_pv_actors / num_daily_actors_w_internal_pv

Objective: we want to know if there are fluctuations in Related Articles usage. This will be helpful for identifying major swings due to factors outside of the API's control (corpora composition, UX changes, at-large content consumption patterns), as well as any changes in the way that Related Articles results are formed or UX changes pertaining to Related Articles results.

Note 1: the denominator for the first measure is the number of morelike fetches, a rough proxy for possible users of the feature. This somewhat shields the measurement from the impact of factors such as intro section length and number of sections, which play into the amount of scrolling required to see a Related Articles result set.

Note 2: the actor-based calculation for the second measure gives us a rough idea of the feature's usage for navigating the site.
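All three parts ask for 30 day time series of ratios, which a chart tool like Superset can compute from daily rollup rows. As a minimal sketch (the column names below follow the Part 3 metric names but the row layout is an assumption, not the real rollup schema):

```python
def ratio_series(daily_rows, num_key, den_key, days=30):
    """daily_rows: dicts with 'date' plus count columns, ordered by date.
    Returns the last `days` points as (date, ratio) pairs; None when the
    denominator is zero for that day."""
    out = []
    for row in daily_rows[-days:]:
        den = row[den_key]
        out.append((row["date"], row[num_key] / den if den else None))
    return out

# Illustrative daily rollups for the Part 3 ratio.
rows = [{"date": f"2024-04-{d:02d}",
         "related_articles_pv": 10 * d,
         "related_articles": 40 * d} for d in range(1, 31)]
series = ratio_series(rows, "related_articles_pv", "related_articles")
print(series[0])  # ('2024-04-01', 0.25)
```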

If the requested metrics areas above are too tall an order to dashboard in the next few work days, then in my view the following should be the minimum target, each with a 30 day time series:

  1. Mobile web: is_minerva_autocomplete_pv / num_actors_w_pageviews (on mobile web)
  2. Desktop web: is_desktop_autocomplete_pv / num_actors_w_pageviews (on desktop web)
  3. Desktop web: fulltext_abandon (num_sessions_w_fulltext_abandon / num_sessions_w_fulltext_serp)
  4. Mobile web: related_articles_pv / related_articles

CC @Gehel.

Went through and made some test charts in Superset against the test tables I generated with the live data. It looks like we have everything we need, but I'm going to make one change to the collection scripts to simplify things.

Specifically, joining the two datasets (needed to put Part 2 into a single chart) is a little annoying because they don't have exactly the same dimensions. Some are resolvable: browser_family exists only on the satisfaction metrics, so I'll add it to the webrequest metrics to align them. Satisfaction also has user_edit_bucket, which is not available in webrequests; since it's not used on any of the dashboards, I'm going to ignore that column. browser_family isn't used either, but it's easy enough to fix and may make future work easier.
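The alignment described above amounts to dropping the satisfaction-only user_edit_bucket column so the two sides share the same join keys. A minimal sketch, with illustrative rows rather than the real table schemas:

```python
# Hypothetical rows standing in for the satisfaction and webrequest rollups.
def align_for_join(satisfaction_row, webrequest_row):
    """Drop satisfaction-only user_edit_bucket and report shared dimensions."""
    sat = {k: v for k, v in satisfaction_row.items() if k != "user_edit_bucket"}
    shared = sorted(set(sat) & set(webrequest_row))
    return shared

sat = {"date": "2024-04-01", "browser_family": "Firefox",
       "user_edit_bucket": "0 edits", "fulltext_abandon": 0.3}
web = {"date": "2024-04-01", "browser_family": "Firefox", "search_pv": 100}
print(align_for_join(sat, web))  # ['browser_family', 'date']
```

Here browser_family is present on both sides, reflecting the planned change to add it to the webrequest metrics.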

I'm aiming to deploy the discolytics and airflow-dags patches after that today so we can start putting together the final dashboards.

This is now marked as a Published Superset dashboard at https://superset.wikimedia.org/superset/dashboard/search . These dashboards are internally accessible.

@EBernhardson I was showing this today and we realized one more thing - would you please adjust the permissions for the underlying HDFS path to the data so that the analytics-privatedata-users group has read permissions? This should then make the dashboard viewable to all users on the cluster instead of just the analytics-search-users group members.

Should be done now. I've manually changed the permissions of the existing data, and updated airflow so it will set the expected ownership after creating the daily rollups.
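For the manual part of the fix, the commands would look roughly like the following. The HDFS path shown is hypothetical, since the task doesn't name the actual data location; substitute the real path of the daily rollups.

```shell
# Hypothetical path to the search metrics rollups (not given in the task).
DATA_PATH=/wmf/data/discovery/search_metrics

# Give the analytics-privatedata-users group ownership and recursive read
# access (g+rX also grants directory traversal), matching the request above.
hdfs dfs -chgrp -R analytics-privatedata-users "$DATA_PATH"
hdfs dfs -chmod -R g+rX "$DATA_PATH"
```

The Airflow change then makes the daily rollup job set this ownership on newly created data so the manual fix doesn't need repeating.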