Separate NavigationTiming data by deployment group
Closed, Resolved, Public

Description

Another way to catch problems like T112401

Event Timeline

ori raised the priority of this task to Medium.
ori updated the task description.
ori subscribed.
ori raised the priority of this task from Medium to High. Sep 14 2015, 6:23 PM
ori added a project: Performance-Team.
ori set Security to None.

There is no practical means for querying a wiki's deployment group (there is no stable interface for .dblist files in operations/mediawiki-config.git), and the groups are not permanent enough to hard-code. So this is not feasible currently. Let's re-open this if and when we have a queryable deployment system.

What do you mean by "stable interface"? There's a group0.dblist, group1.dblist, and we could create a group2.dblist if we really needed. And there's PHP code to read and parse those files.

But more importantly, what alternative tools/solutions can we use to make sure regressions like T112401 don't happen again?

Tentatively re-opening.

What do you mean by "stable interface"? There's a group0.dblist, group1.dblist, and we could create a group2.dblist if we really needed. And there's PHP code to read and parse those files.

Yes, but we need the information on hafnium, in a very lightweight Python process that handles Navigation Timing packets in real time and forwards them to statsd/graphite.

A few options I see in the current infrastructure:

  • Change the navtiming.py subscriber to maintain a copy of these dblist files, sync them from time to time (to disk and to memory), and have it do lookups for each packet based on the EventLogging 'wiki' field (see https://meta.wikimedia.org/wiki/Schema:EventCapsule).
  • Export this information client-side as part of EventLogging (e.g. via a mw.config variable in the startup module somehow), and log it as part of the NavigationTiming packet.
  • Change the core EventLogging server-side code to maintain a copy of the dblist files and augment all packets with this information as part of the EventCapsule.
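
For illustration, the first option could look roughly like the sketch below. It is not taken from navtiming.py: the function names are hypothetical, and it assumes each dblist file holds one wiki database name per line with optional '#' comments, and that any wiki not listed in group0 or group1 belongs to the remaining group.

```python
import os


def load_dblist(path):
    """Return the set of wiki database names listed in one dblist file.

    Assumes one name per line; text after '#' is treated as a comment.
    """
    with open(path, encoding="utf-8") as f:
        stripped = (line.split("#", 1)[0].strip() for line in f)
        return {name for name in stripped if name}


def load_groups(directory, groups=("group0", "group1")):
    """Build a wiki -> deployment group mapping from <group>.dblist files.

    If a wiki appears in several files, the earliest group in `groups` wins.
    """
    wiki_to_group = {}
    for group in groups:
        path = os.path.join(directory, group + ".dblist")
        for wiki in load_dblist(path):
            wiki_to_group.setdefault(wiki, group)
    return wiki_to_group


def group_for_event(event, wiki_to_group, default="group2"):
    """Look up the deployment group using the EventCapsule 'wiki' field."""
    return wiki_to_group.get(event.get("wiki"), default)
```

The mapping lives in memory, so the per-packet lookup is a single dict access; the periodic sync would amount to calling load_groups() again on a timer and swapping in the new mapping.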

Change 273988 had a related patch set uploaded (by Ori.livneh):
Add wgVersion to SaveTiming and NavigationTiming events

https://gerrit.wikimedia.org/r/273988

Change 273990 had a related patch set uploaded (by Ori.livneh):
Report save timing by MediaWiki version

https://gerrit.wikimedia.org/r/273990

Change 273988 merged by jenkins-bot:
Add wgVersion to SaveTiming and NavigationTiming events

https://gerrit.wikimedia.org/r/273988

Change 273990 merged by Ori.livneh:
Report save timing by MediaWiki version

https://gerrit.wikimedia.org/r/273990
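
As a rough sketch of what reporting by MediaWiki version can look like on the subscriber side (this is not the code from the changes above): the field names 'mediaWikiVersion' and 'saveTiming' and the 'example.save' metric prefix are assumptions made for illustration. The one protocol detail relied on is the statsd timing format, `name:value|ms`.

```python
import re


def statsd_timing(event, prefix="example.save"):
    """Format a statsd timing line keyed by MediaWiki version.

    Field names and the metric prefix are hypothetical. Dots in the
    version are replaced because Graphite treats '.' as a path separator.
    """
    fields = event["event"]
    version = fields.get("mediaWikiVersion", "unknown")
    safe = re.sub(r"[^A-Za-z0-9_-]", "_", version)
    return "{}.by_version.{}:{}|ms".format(prefix, safe, int(fields["saveTiming"]))
```

Keying on the version instead of the deployment group sidesteps the dblist lookup entirely, since each group runs a known version at any given time.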