Page MenuHomePhabricator

EventLogging needs to enque events to avoid draining users' battery on mobile
Closed, ResolvedPublic8 Estimated Story Points

Description

Similar to how it is done in the statsv client we should enque EL events so we are not sending frequent beacons. See statsv code: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikimediaEvents/+/500046/

This change is important in the light of baseline metrics like session length that might be sending pings every N seconds.
https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/SessionLength

Note that we are not talking about batching events but rather enqueuing them, they are not batched into a single network call, but rather the network calls happen around the same time. This should improves battery usage on mobile as the times at which we wake up radio and use network should be reduced

Event Timeline

fdans moved this task from Incoming to Smart Tools for Better Data on the Analytics board.
Nuria added a project: Analytics-Kanban.
Nuria added a subscriber: Gilles.

Ping @Gilles: work on this to start second week of July

@Krinkle so I'm looking at this and feeling a little uncomfortable about basically duplicating the logic from WikimediaEvents.

What do you think about an EventQueue class in the eventLogging core module? It would do common things (managing the timer, schedule, and handle pagehide/visibilitychange). You would pass logic to it for dispatch. So we'd refactor WikimediaEvents to use it and also use it for Event Logging to address this task here.

Yes, a common interface for this would make sense. I'd go further and actually also make it a singleton so as to not have multiple instances tracking and reflecting the same in-browser state. Probably something generic callback-based, like (mw.)requestIdleCallback and (window.)requestAnimationCallback do already.

Hm, if it's a singleton then clients would have to namespace events. Because the statsv client wants to batch events and send them together while the EventLogging client just wants to send them one at a time. So dispatch would be:

dispatch ( namespace ) {
    // get events in { namespace } as a list
    // call a specific callback like { callbacks[ namespace ] } with the list
    // get back a list of URLs
    // loop over the list and call sendBeacon
}

That would allow the statsv client to squish the list of events into a list of one URL, and the EventLogging client to $.map it into another list.

I think it would be cleaner to have separate instances than have namespaces.

If it is a singleton there would be 1 queue but in this case the statsd queue and EL queue are different (and so is the dispatch method for either) so the queue management and dispatch would need to happen outside the singleton so it ends up being a bit of fake-singleton no? (totally could be missing some well -stablished mw conventions here)

Change 524575 had a related patch set uploaded (by Milimetric; owner: Milimetric):
[mediawiki/extensions/EventLogging@master] [WIP] Refactor queue/batching of events from statsv to eventLog module.

https://gerrit.wikimedia.org/r/524575

Change 524576 had a related patch set uploaded (by Milimetric; owner: Milimetric):
[mediawiki/extensions/WikimediaEvents@master] [WIP] Refactoring queue/batching of events to eventLog module

https://gerrit.wikimedia.org/r/524576

If it is a singleton there would be 1 queue but in this case the statsd queue and EL queue are different (and so is the dispatch method for either) [..]

My thinking was that this mechanism is more low level than that. Purely callback based, like setTimeout or requestIdleCallback. E.g. a callback that (based on current logic) is debounced by 2 seconds, but resolves earlier based on certain events. What you then do with the callback is the caller's decision. So statsd would indeed still have its own array of metrics, and EventLogging core its array of beacon urls. The callback is then to prompt the dispatching call in whatever each each handles that.

Nuria set the point value for this task to 8.Jul 22 2019, 8:48 PM

Is there any production instrumentation that relies on the time at which the beacon request arrived at the edge (the dt field in the Hive event.* tables IIRC)? My apologies if you've already looked into this.

@phuedx
no, there shouldn't be anything relying on that field

Change 524575 had a related patch set uploaded (by Milimetric; owner: Milimetric):
[mediawiki/extensions/EventLogging@master] Add background queue with simple interface

https://gerrit.wikimedia.org/r/524575

Change 524575 had a related patch set uploaded (by Milimetric; owner: Milimetric):
[mediawiki/extensions/EventLogging@master] Add background queue with simple interface

https://gerrit.wikimedia.org/r/524575

Quarter has rolled over once or twice since we started, not currently in my goals any more and likely won't have time for this in what's left of December. Putting up first thing for FY2019–20 Q3 in January.

Happy New Year, can we work on getting this out this quarter? :)

Change 524575 merged by jenkins-bot:
[mediawiki/extensions/EventLogging@master] Add background queue with simple interface

https://gerrit.wikimedia.org/r/524575

Change 524576 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] statsd: Refactor queue handling to mw.eventLog

https://gerrit.wikimedia.org/r/524576

Moved to done, Nuria likes to look these over before she resolves them. But yes, no more work as far as I know.