Identify typical time to first user interaction
Closed, Resolved, Public

Description

This task encompasses the work required to answer the question: what is the typical time until the user initiates their first interaction with a link? Specifically, we'd like to know whether deferring the loading of the page preview JavaScript would affect most users.

To be able to use the data from the Popups schema, we take its pageLoaded event as the starting point.

More specifically, the time from page load to the first link interaction, and from page load to the first page preview, may also be useful to know.

Event Timeline

@ovasileva and @Jhernandez hinted we may already have this information.
Since we are using an artificial delay in showing page previews, we're keen to understand if we can remove that as part of any change to defer the load of the JS.

@Tbayer: Can you confirm that we can derive this from the data that we're collecting currently (for both the control and enabled buckets)? If so, would you be able to take on this task?

Yes, the Popups schema has both a pageLoaded event and events for every link interaction, so this is doable (assuming pageLoaded is a good starting point to count this time from). Be aware, though, that it will need to be based on the server-side timestamp field, which only has a resolution of one second (combined with the client-side totalInteractionTime field, which has millisecond resolution).

Also note that the average is usually not the most useful statistic for performance questions like this. We should probably look at something like the tenth percentile or the median instead. Perhaps the Performance team has recommendations or best practices here. I will plot the entire histogram anyway, again assuming that data with mere integer-second resolution is still useful. If not, we may want to augment the Popups schema by adding a client-side, millisecond-resolution field to the link interaction events ("timeafterfirstpaint" or such).

In today's Sprint Kickoff - Reading Web meeting, we agreed that we should start with the data that we have and assess whether we need to increase its resolution once we have an initial report.

Perhaps the Performance team has recommendations or best practices here.

Let's look at the shape of the whole distribution; no single percentile tells the whole story.

Tbayer renamed this task from "Identify average time to user interaction" to "Identify typical time to first user interaction". Nov 7 2017, 4:46 PM
Tbayer updated the task description.

Here is a histogram:

Histogram - time to first link interaction (Nov 1, 2017) .png (552×702 px, 44 KB)

As indicated above, the restriction to integer timestamps introduces some rounding errors, basically smearing out the graph a bit horizontally.
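
To illustrate the smearing, here is a minimal sketch. It assumes, purely for illustration, that timestamps are truncated to whole seconds before being logged (the actual pipeline may round differently), and `measuredDelta` is a hypothetical helper, not part of the query. Two interactions with the same true delay of 0.2 seconds can land in different buckets:

```javascript
// Assumption: timestamps are truncated (floored) to whole seconds before logging.
function measuredDelta(loadTimeSec, eventTimeSec) {
  return Math.floor(eventTimeSec) - Math.floor(loadTimeSec);
}

// Both pairs are 0.2 s apart in reality.
const sameSecond = measuredDelta(10.1, 10.3);   // both fall within second 10
const acrossSecond = measuredDelta(10.9, 11.1); // straddles a second boundary

console.log(sameSecond, acrossSecond); // prints: 0 1
```

Under that assumption, the measured difference can be off by just under one second in either direction.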

Data via the following query (based on all pageviews recorded in the Popups schema on November 1; I didn't bother splitting it by enwiki vs dewiki):

SELECT timetostart_bucket, COUNT(*) AS pageviews FROM (
  -- per pageview, keep the earliest interaction start, rounded to whole seconds
  SELECT pagetoken, ROUND(MIN(timetointeractionstart)) AS timetostart_bucket FROM (
    SELECT
    pageloads.pagetoken AS pagetoken,
    -- event timestamp (s) minus page load timestamp (s),
    -- minus the client-side interaction time (ms, converted to s)
    (CAST(linkinteractions.timestamp AS DOUBLE)
    - CAST(pageloads.mints AS DOUBLE)
    - 0.001*CAST(linkinteractions.totalinteractiontime AS DOUBLE) )
      AS timetointeractionstart
    FROM (
      SELECT event.pagetoken AS pagetoken,
      MIN(timestamp) AS mints  -- in case of duplicate pageload events, pick the earliest
      FROM tbayer.popups
      WHERE year = 2017 AND month = 11 AND day = 1
      AND event.action = 'pageLoaded'
      GROUP BY event.pagetoken ) AS pageloads
    JOIN (
      -- all other events of that day, i.e. the link interactions
      SELECT event.pagetoken, timestamp, event.totalinteractiontime, event.action
      FROM tbayer.popups
      WHERE year = 2017 AND month = 11 AND day = 1
      AND event.action != 'pageLoaded') AS linkinteractions
    ON pageloads.pagetoken = linkinteractions.pagetoken ) AS linkinteractions_with_timetostart
  GROUP BY pagetoken) AS timetostartminima
GROUP BY timetostart_bucket
ORDER BY timetostart_bucket LIMIT 10000;

And here is the same data in the form of a cumulative histogram, to make it easier to read off percentiles (e.g. the median is around 5 seconds, the tenth percentile is <0.5 seconds - again, subject to rounding errors):

Cumulative histogram - time to first link interaction (Nov 1, 2017) .png (552×708 px, 51 KB)
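
For reference, reading a percentile off a cumulative histogram can be sketched as follows. This is only an illustration: `percentileFromHistogram` is a hypothetical helper, and the counts are made up rather than taken from the query results.

```javascript
// buckets: array of { seconds, pageviews }, sorted by seconds ascending.
// Returns the smallest bucket at which the cumulative share reaches pct percent.
function percentileFromHistogram(buckets, pct) {
  const total = buckets.reduce((sum, b) => sum + b.pageviews, 0);
  let cumulative = 0;
  for (const b of buckets) {
    cumulative += b.pageviews;
    // Compare integers (cumulative * 100 vs pct * total) to avoid fractions.
    if (cumulative * 100 >= pct * total) {
      return b.seconds;
    }
  }
  return null; // empty histogram
}

// Made-up counts for illustration only; not the data from this task.
const toyHistogram = [
  { seconds: 0, pageviews: 10 },
  { seconds: 1, pageviews: 20 },
  { seconds: 2, pageviews: 30 },
  { seconds: 3, pageviews: 40 },
];

console.log(percentileFromHistogram(toyHistogram, 10)); // prints: 0
console.log(percentileFromHistogram(toyHistogram, 50)); // prints: 2
```

With one-second buckets, any percentile read this way is itself only accurate to within the bucket width.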

The coarse one-second granularity is really unfortunate (but I get that it's a limitation of EventLogging). The results are very interesting as-is, but the answer to whether or not we can afford some extra early loading might be different with ms precision.

I think we need to measure this metric specifically, for example by using the client-side performance.now() monotonic clock and sending the data to statsv. It's very simple to set up, and we (Performance-Team) can show how.
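
A minimal sketch of what such a measurement could look like, under stated assumptions: `createFirstInteractionTimer` and the metric name are hypothetical, and the `track` callback stands in for whatever actually forwards timings to statsv (in MediaWiki that would presumably be `mw.track`, but that wiring is not confirmed in this task). In a browser, performance.now() returns the milliseconds elapsed since the page's time origin, so its value at the first interaction is already the time to first interaction.

```javascript
// now:   function returning elapsed milliseconds (e.g. () => performance.now())
// track: function (topic, value) that forwards the timing (assumed interface)
// Returns a handler that reports only the first link interaction of a pageview.
function createFirstInteractionTimer(now, track) {
  let reported = false;
  return function onLinkInteraction() {
    if (reported) {
      return null; // later interactions are ignored
    }
    reported = true;
    const elapsedMs = now();
    // Hypothetical metric name, for illustration only.
    track('timing.PagePreviews.timeToFirstLinkInteraction', elapsedMs);
    return elapsedMs;
  };
}

// Possible browser usage (not executed here; mw.track is an assumption):
// const onLinkInteraction = createFirstInteractionTimer(
//   () => performance.now(),
//   (topic, value) => mw.track(topic, value)
// );
```

Reporting only the first interaction on the client is one option; the alternative is to send every interaction and select the minimum per pageview server-side.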

@Gilles: Agreed. Page Previews already sends a fair bit of data to statsv, so logging and visualizing this metric shouldn't be too much effort. Indeed, Performance-Team helped us set up and reviewed that dashboard way back when 🙂

I'm happy to own creating the task of adding this metric to Page Previews.

Do we already have experience creating histograms such as the above in Grafana? (keeping in mind T179426#3737738)
Also, would we aim to send this metric only for the first (earliest) link interaction, or select the minimum (per pageview) server-side using a page token?

Per Standup, I'll close this after I've written the follow-on task.

phuedx closed this task as Resolved. Nov 8 2017, 2:28 PM

Done. The follow-on task is T180036: Instrument time to first user link interaction 🎉🎉🎉