
Experimentation Lab Performance review
Open, Medium, Public

Description

Request

The Experiment Platform Team would like a performance review of how our proposed system would impact user-perceived web performance. This diagram describes the system at a medium level of detail.

For the purpose of this review we probably want to focus on the “Varnish” and “MediaWiki” boxes. We believe these are the only two places where we’re making changes that could have an impact on user-perceived web performance.

The new code running on our Varnishes will be the Edge Unique cookie implementation.

On the MediaWiki side, we are mostly deploying code from https://www.mediawiki.org/wiki/Extension:MetricsPlatform and the client libraries.

  • What work has been done to ensure the best possible performance of the feature?

We are testing the Edge Unique implementation as it rolls out next week. On the MW side, we’ve kept xLab out of request processing by pushing experiment configuration to a known place. If the configuration cannot be read quickly, or cannot be read at all, the code falls back to treating the request as if no experiments are running. We will test our extension and client library code once we can interact with edge uniques.
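To make the fall-back behavior on the MW side concrete, here is a minimal sketch in Python. It is not the MetricsPlatform code (which is a MediaWiki extension); the names `load_experiments`, `fetch`, and `budget_s`, and the 50 ms default budget, are assumptions chosen only for illustration. The idea is that any failure or slowness while reading the pushed configuration results in an empty experiment list rather than a delayed or failed request.

```python
from __future__ import annotations

import concurrent.futures
from typing import Callable


def load_experiments(
    fetch: Callable[[], list[dict]],  # hypothetical reader for the pushed config
    budget_s: float = 0.05,           # assumed time budget, not a real setting
) -> list[dict]:
    """Return the experiment config, or [] if it cannot be read in time."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(fetch)
        # Raises a timeout error if fetch() exceeds the budget, and
        # re-raises any exception that fetch() itself raised.
        config = future.result(timeout=budget_s)
    except Exception:
        # Timeout, I/O error, parse error: behave as if nothing is running.
        return []
    finally:
        # Do not block the caller waiting for a slow fetch to finish.
        executor.shutdown(wait=False)

    # Treat a malformed payload the same way as a failed read.
    if not isinstance(config, list):
        return []
    return config
```

A caller would pass whatever function reads the configuration from its known location, for example `load_experiments(read_config_from_cache)`, where `read_config_from_cache` is again a hypothetical name. The design choice shown is that the failure path is the cheap path: a broken or slow configuration source costs at most the time budget and never surfaces as an error to the user.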

  • What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?

As far as we can tell, only the Varnish request handling, which now has additional work to do on every request.

  • Are there potential optimizations that haven't been performed yet?

TBD after performance and load testing, happening soon.

  • Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far.

(starting to collect the list here)

Event Timeline

MSantos subscribed.

@Milimetric should we tag Traffic for the Varnish piece?

I'm moving this to needs input in our workboard while the performance tests are ongoing.

Milimetric triaged this task as Medium priority.

Thanks! I'll pass it back to you all when we're done with our performance testing.

Moving back to backlog. I shifted this to be under the standard metrics Epic (see update just above). The performance profile will probably be different in that implementation, so it makes more sense to do this review during or after that refactor.