
Android navigation refresh - understand impact on user engagement metrics
Closed, ResolvedPublic

Description

Based on user feedback from Google Play, we have identified several areas where users seem dissatisfied with the new look and feel - for example saying that they find it difficult to access content they are interested in, or find things like the Table of Contents unpleasant to interact with. We need to understand whether these reviews are simply negative reactions to novelty, or whether we can perceive a quantitative difference in user engagement.

We are looking for improvement/worsening in engagement metrics such as:

  • Daily average users (being able to access more content they want to read)
  • Average time spent in the app
  • Interactions with articles (e.g. saving articles to lists, sharing)
  • Deeper user sessions (articles read per session, regardless of whether the user bounces back through the feed or rabbit-holes from article to article), with more switching between article reading and exploratory actions
  • If possible, usage/interaction with Table of Contents
  • Usage of other parts of the app (feed, nearby, reading lists, etc)

Ideally, we would like to compare per session data from users pre- and post-change (October 24th).

Event Timeline

Charlotte updated the task description.

Plan of action

I recommend the following key metrics:

  • DAU and stickiness % (DAU/MAU) for November 2018 with MoM & YoY for both
  • Average session length by category of user
    • bottom 10% of users by # of sessions that day (rare usage)
    • top 10% of users by # of sessions that day (very frequent usage)
  • Distribution of average # of articles read by user across their sessions

All of these would be calculated across all languages.
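
For concreteness, here's a minimal sketch of how I'd compute the DAU and stickiness numbers, assuming a hypothetical per-user daily activity table (the file and column names are placeholders, not our actual schema):

```python
import pandas as pd

# Hypothetical activity log: one row per (date, user_id) where the user had
# at least one app session that day. File and column names are assumptions.
activity = pd.read_csv("daily_activity.csv", parse_dates=["date"])

# DAU: distinct users per day.
dau = activity.groupby("date")["user_id"].nunique()

# MAU on a given day: distinct users in the trailing 30-day window, so
# stickiness = DAU/MAU is a daily series.
def mau_on(day: pd.Timestamp, window_days: int = 30) -> int:
    mask = (activity["date"] > day - pd.Timedelta(days=window_days)) & \
           (activity["date"] <= day)
    return activity.loc[mask, "user_id"].nunique()

stickiness = pd.Series({day: dau[day] / mau_on(day) for day in dau.index})

# MoM & YoY change in average DAU for November 2018:
monthly = dau.resample("MS").mean()
mom = monthly["2018-11-01"] / monthly["2018-10-01"] - 1
yoy = monthly["2018-11-01"] / monthly["2017-11-01"] - 1
print(f"Nov 2018 avg DAU: {monthly['2018-11-01']:.0f} (MoM {mom:+.1%}, YoY {yoy:+.1%})")
```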

Here is an expanded list of possible metrics:

  1. DAU: we have this (already in App core metrics spreadsheet) and can easily calculate MoM & YoY for November 2018
  2. For average time spent in app, we have the MobileWikiAppSessions EL, which includes session length. I can calculate the following for the production version specifically; I need a decision on which one to use, or whether to calculate all of them (see the sketch after this list):
    • average of (average session length across a user's sessions) across all users
    • median session length across all sessions
    • average session length across all sessions from users in the bottom 10% by # of sessions per day (users who use the app infrequently throughout the day, e.g. using it on their commute to & from work)
    • average session length across all sessions from users in the top 10% by # of sessions per day (users who use the app very frequently throughout the day, e.g. every break at work, several times during the evening)
  3. Interactions with articles:
    • For reading lists we have MobileWikiAppReadingLists, which…doesn't actually help us learn anything useful :\ Recommendation: redesign the reading list analytics. The following metrics will not be calculated:
      • absolute # of times users click to add an article to a new or existing list, because we don't know what % of opened articles that represents
      • absolute # of lists the users have, which is not a meaningful statistic in this context
    • For sharing we have MobileWikiAppShareAFact, which tracks highlights & sharing, and I'll *likely* be able to join that with MobileWikiAppLinkPreview, which tracks when an article has been opened, so I *might* be able to calculate the following:
      • % of opened articles in which the user highlighted some text
      • % of highlights for which the user initiated sharing (we can't track whether something was actually shared or not, just that the user tapped to share)
  4. Deeper user sessions: the same EL used for session length also includes the count of articles read per session, so I can calculate:
    • distribution of average # of articles read per session across all sessions
    • distribution of average # of articles read by user across their sessions
  5. Interaction with Table of Contents
    • @Dbrant did you fix the funnel issues I reported when the update was initially released? If not, I'll make a ticket (what I should have done initially instead of the informal chat we had)
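
Here's a minimal sketch of the session-length variants from point 2 and the per-user articles distribution from point 4, assuming a hypothetical session-level extract of MobileWikiAppSessions (the file and column names are placeholders):

```python
import pandas as pd

# Hypothetical extract of MobileWikiAppSessions: one row per session with
# the user, the date, session length, and articles read. Schema assumed.
sessions = pd.read_csv("sessions.csv", parse_dates=["date"])

# (a) Average of each user's average session length:
avg_of_avgs = sessions.groupby("user_id")["length_sec"].mean().mean()

# (b) Median session length across all sessions:
median_all = sessions["length_sec"].median()

# (c)/(d) Average session length for sessions from users in the bottom/top
# 10% by # of sessions that day. (Simplification: the cutoffs are pooled
# across all days rather than recomputed per day.)
counts = (sessions.groupby(["date", "user_id"]).size()
          .rename("n_sessions").reset_index())
sessions = sessions.merge(counts, on=["date", "user_id"])
lo = counts["n_sessions"].quantile(0.10)
hi = counts["n_sessions"].quantile(0.90)
rare_avg = sessions.loc[sessions["n_sessions"] <= lo, "length_sec"].mean()
frequent_avg = sessions.loc[sessions["n_sessions"] >= hi, "length_sec"].mean()

# Point 4: distribution of average # of articles read by user across
# their sessions (plot or summarize this series):
per_user_articles = sessions.groupby("user_id")["articles_read"].mean()
print(per_user_articles.describe())
```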

@Charlotte can you please review this and check that I've interpreted the questions correctly? If I did not, can you please explain what specifically you're looking for? Also, please let me know how you want me to approach average time spent; I proposed a few potential metric definitions which formalize the question in different ways.

Thanks for this @mpopov. Answers to your particular questions:

Key metrics look spot on to me.

For the average time spent question, let's please look at the median session length across all + average session length for the top and bottom 10%. (I think average session across all will probably be too insensitive a gauge here.)

If you can suggest ways to redesign the reading list analytics (separate ticket for us) let's do that based on the research questions we memorialised in the design deck for that feature.

For sharing, the % of highlights where a user initiated sharing is fine - but do we not have tracking on the overflow menu item for sharing? (The % of articles where a user just highlighted some text is not super meaningful here.)

If you can suggest ways to redesign the reading list analytics (separate ticket for us) let's do that based on the research questions we memorialised in the design deck for that feature.

Totally, and as I mentioned in the standup it'll have to be part of an overall top-down overhaul.

For sharing, the % of highlights where a user initiated sharing is fine - but do we not have tracking on the overflow menu item for sharing? (The % of articles where a user just highlighted some text is not super meaningful here.)

We do not! I just checked and the only event that hitting the share button generates is a page scroll event (even though the page does not actually scroll) ¯\_(ツ)_/¯

Oh ffs. Well, another thing to include in the overhaul. :)

A huge chunk of my analysis got invalidated when Dmitry & I found out that the underlying data was faulty (T213190): specifically, all of the analysis related to session length, number of sessions, and number of pages read per session. Unfortunately, the nature of the bug means that we won't be able to compare those metrics before & after the update.

However, some of the analysis is still (somewhat) valid.

Daily Users

According to our daily active users data (computed with this query), there was no change:

weekly_active_users.png (1×3 px, 239 KB)

daily_active_users.png (1×3 px, 797 KB)

(We've been losing active users for the past 2 years.)

HOWEVER, when I was looking into the number of users in event logging (MobileWikiAppSessions), there was an increase following the update:

daily_users.png (2×3 px, 103 KB)

(The difference between beta and production is that the sessions funnel in beta is NOT sampled, while production uses 1-in-100 sampling.)

Furthermore, I looked at two metrics on the Google Play Store Console:

  • Installs on active devices - "Number of Android devices that have been online at least once in the past 30 days that have your app installed."
  • Installs on devices - "Number of devices that users install your app on for the first time."

active_installs.png (553×952 px, 78 KB)

What this is showing us is that the increase in number of users we saw isn't due to new installs but due to people who already had the app installed and were opening it when they weren't before. In fact, given how the "installs on active devices" metric is defined, it's as if ~300K devices that weren't online in the 30+ days leading up to the nav update all came back online after the nav update was released. This is SO WILD that I reached out to Google Play Store Console support to check if there was any kind of unreported issue on their end or if the anomaly we were seeing was a natural phenomenon. They responded:

I'm happy to investigate your increase in installs further, though it can be tough to pinpoint the exact reason. There are a few things that could lead you to see an increase in installs, including:

  • A change in rank due to an algorithm update or normal change in ranking signals from your app and other apps.
  • An update to the app's description, icon, or other store listing details that may have changed your app's appeal.

It appears we have contradictory data. On the one hand, our daily/weekly active users did NOT change after the update. On the other hand, number of users in client-side analytics as well as Google Play Store analytics DID go up significantly. I have no idea what to make of this.

@Charlotte & @kzimmerman: seems we need to prioritize:

  • a thorough review of how we count unique users, and/or
  • a reworking of how we count unique users (T202664)

Table of Contents

Something I looked into was usage of the new table of contents, where usage is defined as the % of times the ToC was displayed to a user in which they actually used it (a sketch of the computation is at the end of this section). In both the new and old ToC the user could tap on a section name to navigate to it, and in the new version they could also scroll with the ToC widget to reach the section they want. When I compare ToC usage before and after the update, something very suspicious shows up:

toc_usage.png (1×3 px, 117 KB)

Basically, we've gone from a pretty even distribution of usage to suddenly near-100% usage of ToC. When I showed Kate the feature, it was not immediately apparent to her how to close the ToC once it was open, so she tapped on the section she was already at to close the ToC. When I looked at user reviews for the app on the Google Play Store, a lot of the negative feedback was around how annoying the floating ToC widget was and how they kept opening up the ToC by accident. So our hypothesis is that a lot of users are opening the ToC by accident and not intuiting how to close it, so they just do what Kate did, therefore inflating the usage metric.

I strongly recommend user research to see if that is, indeed, the case.
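
For reference, a minimal sketch of how the ToC usage rate above can be computed, assuming a hypothetical per-display event extract (the file, columns, and exact cutoff date are placeholders):

```python
import pandas as pd

# Hypothetical ToC funnel extract: one row per time the ToC was displayed,
# with a flag for whether the user navigated with it. Schema assumed.
toc = pd.read_csv("toc_events.csv", parse_dates=["date"])

# Usage rate = share of ToC displays in which the user actually used it,
# as a daily series so pre/post-update periods can be compared.
daily_rate = toc.groupby("date")["used"].mean()

update_day = pd.Timestamp("2018-10-24")  # nav update release (assumed cutoff)
print("pre-update: ", daily_rate[daily_rate.index < update_day].mean())
print("post-update:", daily_rate[daily_rate.index >= update_day].mean())
```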

Conclusion

  • We need user research of ToC.
  • We need to investigate our pipeline of counting users.
  • We need to rework our pipeline of counting users.
  • We need to fix sessions funnel to stop sending hundreds of duplicated events.

I'll leave it to @Charlotte to mark this ticket as resolved or not. Although I don't think there's much more I can do with what we have.

Hey @mpopov, I've dropped an invite on your calendar for a talk-through, to make sure I fully understand what's going on. Thanks for all your work on this.

Hey @mpopov, you can close it - thanks for talking it through.

Hey @JKatzWMF - FYI (see @mpopov's note above) we're not going to be able to get much data on our nav refresh. Sad panda.

@Charlotte @mpopov

  1. I saw in the ticket about the bug that hundreds of duplicate events were being sent.
    1. If we can detect them, is there a way to correct for them? and
    2. if there are hundreds, what % of total events is that? The data doesn't have to be perfect, so if it's less than 5% of events of any given event type, then I wouldn't let it get in the way of analysis.
  2. Regarding the active device count, it seems like the Google number isn't so hot and we need to measure it on our own. It feels like a glitchy thing with Google's report. When you look at active devices by app version, you see that:
    1. Minor point: the bump is relatively trivial when the y axis starts at 0.
    2. This happened the last time we did a release, and the impact seems to hit before the installs hit... this actually suggests that there is something about how Google measures active devices that is reset upon an upgrade. We see a slow decline of devices and then a bump on a new release.

      Screen Shot 2019-01-22 at 11.30.38 AM.png (898×1 px, 159 KB)
    3. Further evidence comes when you look at the install and uninstall actions, which show a steady surplus over the last year. I am sure devices are going dark all the time, but I would be surprised if that surpassed the apparent surplus: the uninstall rate is ~66% of installs, and it is hard to imagine that devices equivalent to the remaining ~1/3 of daily installs go dark every day. Maybe my logic is off.

      Screen Shot 2019-01-22 at 11.33.08 AM.png (926×1 px, 187 KB)

@Charlotte @mpopov

  1. I saw in the ticket about the bug that hundreds of duplicate events were being sent.
    1. If we can detect them, is there a way to correct for them? and
    2. if there are hundreds, what % of total events is that? The data doesn't have to be perfect, so if it's less than 5% of events of any given event type, then I wouldn't let it get in the way of analysis.

Looked into it and figured out how to de-duplicate, so I finally have some numbers. (BTW each event is a summary of a user's session, so there is only one type of event, and it's what I use for the session analysis.) Approx 80% of users are affected to varying degrees: of those affected, 15% have double the actual events, 23% have 2-5 times the events, and 62% have more than 5 times the number of events they should. In beta there are 79K unique events and 1.06M duplicates of those events. In production it's 24K unique events and 329K duplicates.

I'll apply the de-duplication technique to my existing analysis and post the results shortly.
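
For the record, here's a minimal sketch of the kind of de-duplication involved; the key columns are assumptions, and in practice it's whatever fields uniquely identify a session-summary event:

```python
import pandas as pd

# Hypothetical dump of MobileWikiAppSessions rows, where the same
# session-summary event may have been sent many times. Schema assumed.
events = pd.read_csv("session_events.csv")

# Treat rows as duplicates when the identifying fields match exactly.
key_cols = ["app_install_id", "session_token", "client_dt"]
deduped = events.drop_duplicates(subset=key_cols)

n_dupes = len(events) - len(deduped)
print(f"{len(deduped)} unique events, {n_dupes} duplicates "
      f"({n_dupes / len(events):.1%} of all rows)")

# Per-user inflation factor (events sent / events that should exist),
# to reproduce the 2x / 2-5x / >5x breakdown above:
inflation = (events.groupby("app_install_id").size()
             / deduped.groupby("app_install_id").size())
print(inflation.describe())
```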

@Charlotte: thanks for the ping! Right now my priorities are: the notifications analysis, SEO sameAs analysis, some Search query migration, and then this. I'm working on acquiring data for T213458 and it turned out to be much, much harder than I anticipated so that's creating some delays.

mpopov triaged this task as Low priority.

Finally closing this out:

average_session_length.png (720×1 px, 92 KB)

Looks like users were spending more time in the app on average after the NavUpdate.

read_per_session.png (720×1 px, 53 KB)

But they were reading fewer articles per session.

toc_usage_rate.png (360×720 px, 25 KB)

Same notes on ToC widget usage as before.