Page MenuHomePhabricator

Instrument time to first user link interaction
Closed, ResolvedPublic3 Estimated Story Points

Description

Background

In T179426: Identify typical time to first user interaction, we used the current Page Previews EventLogging instrumentation to try to determine the typical time to first user link interaction (see T179426#3741809 and T179426#3741859). Unfortunately, the use of the server-side timestamp, which is only accurate to the second, was too high granularity in order to move forward with T179426: Identify typical time to first user interaction.

@Gilles recommended that we specifically measure this metric with the high-precision (milliseconds accurate to 5 microseconds), monotonic clock exposed via performance.now().

Plan

Per T180036#374696, we'll add a high-precision timestamp to the existing EventLogging instrumentation. So, we'll:

  • Add an event_timestamp field to the Popups schema.
  • In the client, set the field to performance.now for every event we send.

Developer Notes

  1. The field could/should be added in [the src/reducers/eventLogging.js#createEvent function](https://github.com/wikimedia/mediawiki-extensions-Popups/blob/e1be05d73f308fcdc7723bcd8792b61fd6a69f77/src/reducers/eventLogging.js#L35-L59).
  2. All events have event_timestamp values added to them (in case of link interaction events, referring to the end of the interaction rather than its start). The timestamps should only be retrieved using performance.now when it's available.
    • This is mostly a point of principle: the reducers should only be combining data with existing state.
  3. Per T180036#3749812, if performance.now is unavailable for some reason, send undefined or null.

AC

  • Whenever possible, high-precision timestamps are sent with all Popups events.

Testing Plan

  • EventLogging for popups is disabled on production so we will need to test this on the beta cluster
  • On the beta cluster bucket yourself by passing the debug=true parameter
  • Every event should include a timestamp field. To verify this open the developer console and inspect the network tab (you can filter by event). Check all possible event types to see if this is true and the number is always increasing.
  • We should check for a variety of browsers that we care about.

Sign off steps

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Actually, based on @Tbayer's remark in the other thread, adding a field to EventLogging would give us more correct data. With graphite as the storage mechanism by way of statsd, the aggregation of data would get in the way of generating correct histograms over a long period of time. I.e. we could be combining per-minute percentiles instead of actual percentiles over a longer period, which is an approximation.

This approximation is something we live with with our performance metrics, but for the purpose of this study I think we can afford to compute a mathematically correct histogram by way of EventLogging, since we're already storing all that data.

A solution for proper histograms and heatmaps in Grafana is to use Prometheus as the storage layer for those metrics. But we haven't looked at having a statsv-like endpoint for that yet. I've just filed a task for it: T180105

Thanks for the additional detail and reasoning @Tbayer and @Gilles. I'll update the task accordingly.

ovasileva moved this task from Incoming to Needs Prioritization on the Web-Team-Backlog board.

Is performance.now always available? If not, for completeness what should we send in its absence. I'm guessing undefined should be fine?

Is performance.now always available?

Considering that the Popups schema is restricted to sendBeacon capable user agents anyway,
and comparing https://caniuse.com/#feat=beacon with the "Browser compatibility" section at https://developer.mozilla.org/en-US/docs/Web/API/Performance/now , I guess that the theoretical answer is yes, in this context.

If not, for completeness what should we send in its absence. I'm guessing undefined should be fine?

Something that ends up as NULL in the EventLogging Hive table might be best.

phuedx updated the task description. (Show Details)

getBaseData() is called once during popups initialization, then every event will have an extra event_timestamp:{timestamp}. Then every event will have the same event_timestamp value (as getBaseData() is called only once).

Developer notes:point 1 is IMHO valid only for the first event, then we will have to update event_timestamp in eventLogging reducer.

@pmiazga: Thanks for the feedback. Don't hesitate to update the task if you spot an error like that 🙂

phuedx set the point value for this task to 3.

Change 391850 had a related patch set uploaded (by Bmansurov; owner: Bmansurov):
[mediawiki/extensions/Popups@master] Schema:Popups - use revision 17430287

https://gerrit.wikimedia.org/r/391850

I've +1ed the change but would appreciate @phuedx @pmiazga or @Jhernandez to take a look at well.

Change 391850 merged by jenkins-bot:
[mediawiki/extensions/Popups@master] Schema:Popups - use revision 17430287

https://gerrit.wikimedia.org/r/391850

bmansurov subscribed.

Let's QA this technical task.

@bmansurov let's come up with testing criteria for this card on Monday. Also who do we want to do this QA? Let's make that explicit. Anthony? Tilman? A developer?

I was thinking a developer (and Tilman, if he wants) should Q/A this.

@bmansurov let's come up with testing criteria for this card on Monday. Also who do we want to do this QA? Let's make that explicit. Anthony? Tilman? A developer?

QA for instrumentation is a wider topic (take a look at our quantitative testing process; we also discussed this for a while at last month's offsite, including regarding the possible involvement of Anthony). Here is what we did it in two recent examples:

I should be able to make time for such a call later today or in the next few days, feel free to ping me on IRC or put something on my calendar.

@Tbayer and I are going to meet to QA this task.

I've arranged a meeting over Hangouts for 5:30 PM GMT on Monday, 27th November.

@Tbayer and I met and QA'd the implementation and are satisfied. We did the QA with Chrome (62.0.3202.94) but would like to quickly redo it with Firefox too.

I've re-done the QA in Firefox (57.0) and it LGTM.

For posterity, @Tbayer and I tested that:

  • The pageLoaded, dismissed, dwelledButAbandoned, and opened events all had the timestamp property.
  • The value of the timestamp property always increased (and was reasonable).
  • The rest of the instrumentation remained unchanged, e.g. the sessionToken and pageToken properties were shared between events.
phuedx updated the task description. (Show Details)

All the AC are met and the pre-sign off steps have been taken.

🎉🎉🎉

During my meeting with @Tbayer, I did note that the timestamp field wasn't always an integer (like other similar fields in the schema) but we both agreed that this shouldn't be an issue. It turns out that this actually breaks the refinement process while importing event data into Hive (see T182000: Popups timestamp field contains multiple types for more detail). I'm reopening this issue as it can't be called done until the data can be imported, right?

The subtask (T182000) has been resolved.