
navtiming: firstPaint.mobile metric broken on wmf.24
Closed, Resolved · Public

Description

When the wmf.24 branch went out, two navtiming alerts started firing:

The alert for the metric value:

The apparent speed regression is just a side-effect of another issue, which is that there is no data:

Details

Related Gerrit Patches:
mediawiki/extensions/NavigationTiming : wmf/1.33.0-wmf.24 : Add isAnon and mobileMode to PaintTiming context
mediawiki/extensions/NavigationTiming : master : Add isAnon and mobileMode to PaintTiming context

Event Timeline

Krinkle created this task. Apr 4 2019, 9:14 PM
Restricted Application added a project: Performance-Team. Apr 4 2019, 9:14 PM
Restricted Application added a subscriber: Aklapper.

Commits to Navigation Timing that are new in wmf.24:

I don't see a regression in the EventLogging intake, which suggests the client is still sending data at the same rate, and still in a valid format:

And the new PaintTiming schema is also working:

This suggests the problem might be in the navtiming.py service instead, which might not be processing those events correctly. Note that the firstPaint.overall metric was still receiving data, but buckets such as .mobile.anonymous were not.

Gilles claimed this task. Apr 5 2019, 5:09 AM
Gilles triaged this task as High priority.
Gilles added a comment. Apr 5 2019, 5:33 AM

frontend.navtiming2.by_browser.*.* worked throughout the incident.

frontend.navtiming2.desktop.overall was twice what it was supposed to be during the incident (it was getting the mobile ones as well):

frontend.navtiming2.desktop.authenticated included all the anonymous ones as well:

It seems like instead of differentiating "site" and "auth" by their usual values, it assumed everything was desktop/authenticated. I think I know why: it must be getting that information from values inside the schema and not from the capsule. Just like the firstPaint schema needs oversampling information, I bet it needs the isAnon and mobileMode fields too.

Yep:

if 'mobileMode' in event:
    if event['mobileMode'] == 'stable':
        site = 'mobile'
    else:
        site = 'mobile-beta'
else:
    site = 'desktop'
auth = 'anonymous' if event.get('isAnon') else 'authenticated'
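
To illustrate why missing fields end up in the desktop/authenticated buckets, here is a minimal sketch that wraps the logic quoted above in a function. The classify name and the two event payloads are hypothetical, made up for illustration; they are not taken from navtiming.py or from real PaintTiming events.

```python
from collections import Counter


def classify(event):
    """Mirror the bucketing logic quoted above; returns (site, auth)."""
    if 'mobileMode' in event:
        site = 'mobile' if event['mobileMode'] == 'stable' else 'mobile-beta'
    else:
        # No mobileMode key at all: the event is treated as desktop.
        site = 'desktop'
    # A missing isAnon yields None, which is falsy, hence 'authenticated'.
    auth = 'anonymous' if event.get('isAnon') else 'authenticated'
    return site, auth


# Hypothetical payloads: the same mobile anonymous pageview, with and
# without the context fields present in the event.
with_context = {'firstPaint': 850, 'mobileMode': 'stable', 'isAnon': True}
without_context = {'firstPaint': 850}

buckets = Counter(classify(e) for e in (with_context, without_context))
print(buckets)
```

With the context fields absent, the second event is counted under desktop/authenticated even though it describes a mobile anonymous pageview, which is consistent with the inflated desktop and authenticated graphs described above.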

Change 501484 had a related patch set uploaded (by Gilles; owner: Gilles):
[mediawiki/extensions/NavigationTiming@master] Add isAnon and mobileMode to PaintTiming context

https://gerrit.wikimedia.org/r/501484

Change 501604 had a related patch set uploaded (by Krinkle; owner: Gilles):
[mediawiki/extensions/NavigationTiming@wmf/1.33.0-wmf.24] Add isAnon and mobileMode to PaintTiming context

https://gerrit.wikimedia.org/r/501604

Change 501604 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@wmf/1.33.0-wmf.24] Add isAnon and mobileMode to PaintTiming context

https://gerrit.wikimedia.org/r/501604

Mentioned in SAL (#wikimedia-operations) [2019-04-05T15:57:55Z] <krinkle@deploy1001> Synchronized php-1.33.0-wmf.24/extensions/NavigationTiming/: I6b23be850d35c7d19 / T220156 (duration: 01m 00s)

Change 501484 merged by jenkins-bot:
[mediawiki/extensions/NavigationTiming@master] Add isAnon and mobileMode to PaintTiming context

https://gerrit.wikimedia.org/r/501484

Krinkle closed this task as Resolved. Apr 8 2019, 12:35 PM