
Analyze time to first link interaction
Closed, Resolved. Public


Repeat T179426: Identify typical time to first user interaction (in particular, generate a histogram analogous to T179426#3741859), now that we have more precise data available from the new instrumentation (T180036). The purpose is to facilitate decision-making on T176211: Page Previews could load less JS on pageload.

Event Timeline

Restricted Application added a subscriber: Aklapper. Jan 30 2018, 4:53 PM
ovasileva triaged this task as Medium priority. Feb 8 2018, 3:08 PM

Here is an initial histogram (covering the first 20 seconds, may still need some tweaks). I'll do a cumulative one too like last time.

Data via

SELECT timetostart_bucket, COUNT(*) AS pageviews FROM (
  -- earliest link interaction start per pageview, in 100 ms buckets
  SELECT pagetoken, ROUND(MIN(timetointeractionstart)/100) AS timetostart_bucket FROM (
    SELECT event.pagetoken AS pagetoken,
    -- interaction start = recorded end time minus interaction duration
    event.timestamp - event.totalInteractionTime AS timetointeractionstart
    FROM event.popups
    WHERE ((month = 12 AND day >= 21) OR (month = 1) OR (month = 2 AND day <= 14))
    AND event.action != 'pageLoaded') AS linkinteractions
  GROUP BY pagetoken) AS timetostartminima
GROUP BY timetostart_bucket
ORDER BY timetostart_bucket LIMIT 10000
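
For readers who want to check the bucketing logic on a small sample outside Hive, here is a minimal Python sketch of what the query computes. It is not the code used for the chart. The function names and the dict-based event format are assumptions made for illustration, and it assumes that ROUND rounds halves up for non-negative values and that every non-pageLoaded event carries both timestamp and totalInteractionTime.

```python
import math
from collections import Counter


def bucket(ms, width=100):
    # Nearest bucket index, halves rounded up (assumed to match SQL ROUND
    # for the non-negative values expected here).
    return math.floor(ms / width + 0.5)


def first_interaction_histogram(events, width=100):
    # events: iterable of dicts with keys pagetoken, action, timestamp,
    # totalInteractionTime (hypothetical in-memory stand-in for event.popups).
    first_start = {}
    for e in events:
        if e["action"] == "pageLoaded":
            continue  # only link interactions are counted
        # start of the interaction = recorded end time minus its duration
        start = e["timestamp"] - e["totalInteractionTime"]
        token = e["pagetoken"]
        if token not in first_start or start < first_start[token]:
            first_start[token] = start
    # one entry per pageview: the bucket of its earliest interaction start
    counts = Counter(bucket(s, width) for s in first_start.values())
    return dict(sorted(counts.items()))
```

Each pageview contributes exactly one count, in the bucket of its earliest interaction start, which is why pageviews without any link interaction do not appear in the result.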

Here is the cumulative version of the above chart:

NB: frequencies refer to the set of all pageviews where a link interaction occurred, not including pageviews without any link interaction. If that's of interest too, I can produce the same charts with the y-axis changed to that baseline.
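
As a rough illustration of how the cumulative version and the alternative baseline relate, the following Python sketch turns bucket counts into cumulative shares. The function name and the optional baseline parameter are hypothetical; passing the total number of pageviews as the baseline would correspond to the changed y-axis mentioned above.

```python
def cumulative_share(bucket_counts, baseline=None):
    # bucket_counts: dict mapping bucket index -> number of pageviews.
    # baseline: denominator for the shares; defaults to the number of
    # pageviews with at least one link interaction (the sum of all counts).
    total = baseline if baseline is not None else sum(bucket_counts.values())
    if total <= 0:
        raise ValueError("baseline must be positive")
    running = 0
    shares = []
    for b in sorted(bucket_counts):
        running += bucket_counts[b]
        shares.append((b, running / total))
    return shares
```

With the default baseline the last share is always 1.0. With a larger baseline (all pageviews) the curve levels off at the fraction of pageviews that had any link interaction.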

Gilles added a subscriber: Gilles. Edited Feb 26 2018, 9:23 PM

Which definition of "pageloaded" is used for the origin of this graph?

That's really the origin of the timestamp field, which (as documented in the schema and discussed earlier at T180036 etc.) provides the end time recorded for the link interaction, from which we subtract the interaction duration to get the start time. The x-axis legend was adapted from T179426#3741809 etc.; I should have updated the wording for precision. That said, now I'm curious about the size of the difference - will check the distribution of timestamps (i.e. for the schema's pageLoaded event) too.

Just popping in to drop this link to a detailed explanation of the time origin of HighResTimeStamps:

ovasileva moved this task from Backlog to For Review on the Page-Previews board. Mar 15 2018, 3:55 PM

Tilman said he wants to write a comment and resolve this task but looks like this is done.

@Tbayer Is this task resolvable?

Restricted Application edited projects, added Readers-Web-Backlog; removed Readers-Web-Backlog (Tracking). Apr 11 2018, 7:03 PM
Tbayer moved this task from Triage to Next Up on the Product-Analytics board. Apr 24 2018, 8:26 PM
Tbayer moved this task from Next Up to Doing on the Product-Analytics board.
ovasileva moved this task from For Review to Done on the Page-Previews board. Jul 25 2018, 8:22 AM
Tbayer closed this task as Resolved. Feb 22 2019, 1:51 AM

Following up here:

To recap, the purpose of this analysis was to aid decision-making about T176211: Page Previews could load less JS on pageload, which was achieved (see T176211#4024686 etc. - that change was deployed last year).

As discussed above, this data refers to the time from the origin to the first hover, and we should distinguish between that and other definitions of "pageload", in particular the schema's own pageLoaded event. Above I had mused (T186016#4003680) that it would be interesting to know how long after that origin time the pageLoaded event occurs, and a couple of weeks later @Gilles made what looks like another related comment at T176211#4061121. Overall the data was still seen as good enough for last year's decision in that discussion. But since it may still be of interest in the future (perhaps also in the context of T190037), here is what that distribution looks like:

Data via

SELECT timetoload_bucket, COUNT(*) AS pageviews FROM (
  SELECT ROUND(event.timestamp/100) AS timetoload_bucket 
  FROM event.popups
  WHERE ((month = 12 AND day >= 21) OR (month = 1) OR (month = 2 AND day <=14) )
  AND event.action = 'pageLoaded') AS pageloads
GROUP BY timetoload_bucket
ORDER BY timetoload_bucket LIMIT 10000

Also, here is the previous chart with the meaning of the x-axis clarified:

PS: There is a detailed discussion of how various time-related fields in the Popups schema are generated at T182314#3956099 .