
Better integration of NavigationTiming performance survey
Closed, Invalid · Public

Description

NavigationTiming calls the QuickSurveys API directly, rather than configuring a survey through the normal configuration channels, and a fair amount of QuickSurveys logic is duplicated in this codebase. Why is this done? Perhaps just to couple the survey with the performance oversampling condition? Is there a better way to integrate the two, one that decouples the extensions and makes the survey configuration visible?

Event Timeline

Restricted Application added a subscriber: Aklapper.
Gilles triaged this task as Medium priority.
Gilles moved this task from Inbox, needs triage to Doing (old) on the Performance-Team board.

NavigationTiming only samples a small fraction of requests, by default 1 in every 1000. It only makes sense to ask users about the performance of a page if that performance was actually measured by NavigationTiming, since the survey exists to study the correlation with those metrics. That is why QuickSurveys is invoked from NavigationTiming, and why some light modifications were made to QuickSurveys to facilitate this.
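To make the coupling concrete, here is a minimal sketch (not the extension's actual code) of the kind of 1-in-N sampling gate NavigationTiming applies before collecting metrics. The survey is only worth showing on page views that passed this same gate:

```javascript
// Illustrative 1-in-N sampling gate. A factor of 1000 means roughly
// 1 in every 1000 page views is sampled; a factor of 1 samples all of them.
function inSample( factor ) {
	// Reject nonsensical factors rather than sampling everything.
	if ( !Number.isInteger( factor ) || factor < 1 ) {
		return false;
	}
	return Math.floor( Math.random() * factor ) === 0;
}

// Only page views that were sampled for metrics are candidates for the survey.
if ( inSample( 1000 ) ) {
	// ...collect NavigationTiming metrics, then consider showing the survey...
}
```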

The survey is configured via MediaWiki config, like all the others: https://github.com/wikimedia/operations-mediawiki-config/blob/133ca920681e374031c7c48c2ef1e0dd54ba3286/wmf-config/InitialiseSettings.php#L24601-L24620. It's only triggered slightly differently: it appears based on whether the current page was sampled by NavigationTiming.

I think that attempting to decouple further would require synchronising the sampling mechanism between the two extensions, which would mean duplicated code and a fragile relationship.

Can you point to specific code that is duplicated logic in the current setup?

Thanks for pointing out the configuration; that makes sense, and it's a better situation than I was imagining at first glance. The lines that jumped out as duplicated logic are https://github.com/wikimedia/mediawiki-extensions-NavigationTiming/blob/c39cceb27476ee209dda64ecb9bde13e1bc84a0b/modules/ext.navigationTiming.js#L491-L504, and they are pretty harmless: just tests to see whether we're on the main page, whether the page exists, and so on. It still creates some debt, because we have to stay in sync with QuickSurveys, but I don't feel the need to rock the boat if everyone else is happy with the integration.

That said, maybe in the long term we can accomplish this sort of effect by manipulating the experiment buckets. QuickSurveys calls mw.experiments.getBucket, and that module seems like a natural place to add an extensible system giving finer control over sampling. In the NavigationTiming case, our goal is to bucket users for surveys according to their logged-in status and whether they were already selected for performance sampling. This interface could shrink down to almost a boolean assignment (was the user selected for a survey or not?), stored abstractly as a named experiment bucket like "performance.survey". Once the bucket is assigned, it could be picked up by QuickSurveys without any direct coupling.
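A rough sketch of what that could look like. Note that assignBucket, getAssignedBucket, and the "performance.survey" name are all hypothetical here, illustrating the idea of a shared named-bucket store rather than any existing mw.experiments API:

```javascript
// Hypothetical shared bucket store: NavigationTiming records only the
// outcome of its sampling decision under a well-known name, and
// QuickSurveys reads that outcome back without depending on
// NavigationTiming directly.
const bucketStore = new Map();

function assignBucket( name, selected ) {
	// Store just "selected or not"; the oversampling logic that produced
	// the decision stays private to the assigning extension.
	bucketStore.set( name, Boolean( selected ) );
}

function getAssignedBucket( name ) {
	// Returns undefined if no extension has assigned this bucket yet.
	return bucketStore.get( name );
}

// NavigationTiming side: assign after making its own sampling decision.
assignBucket( 'performance.survey', true );

// QuickSurveys side: show the survey only if the bucket was assigned.
if ( getAssignedBucket( 'performance.survey' ) ) {
	// ...enqueue the performance survey...
}
```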

I'm mostly fishing for ideas here, please don't feel obliged to rewrite anything at this point!

NavigationTiming's sampling logic is more complex, though, as it supports different forms of oversampling according to different criteria, not just a general sampling factor:

https://github.com/wikimedia/mediawiki-extensions-NavigationTiming/blob/master/modules/ext.navigationTiming.js#L1148-L1175

So I'm not sure that buckets would suffice, and I don't know whether any system other than NavigationTiming would care about an oversampling model.
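For illustration, such an oversampling model could look roughly like this. The config shape and criterion names here are assumptions for the sketch, not the extension's actual code:

```javascript
// Illustrative oversampling model: a base 1-in-N rate, plus per-criterion
// overrides (e.g. for a particular wiki or platform) that sample more
// aggressively when they match the current page view.
function getSamplingFactor( baseFactor, oversampleFactors, pageContext ) {
	// Pick the smallest (most aggressive) factor among all matching criteria.
	let factor = baseFactor;
	for ( const criterion of Object.keys( oversampleFactors ) ) {
		if ( pageContext[ criterion ] ) {
			factor = Math.min( factor, oversampleFactors[ criterion ] );
		}
	}
	return factor;
}

// Example: base rate of 1 in 1000, but oversample mobile views at 1 in 10.
const factor = getSamplingFactor( 1000, { isMobile: 10 }, { isMobile: true } );
// factor === 10
```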

Okay, makes sense. There's no urgency to do anything more at the moment. Thank you for discussing!