
Set up WebPageTest for synthetic testing
Closed, ResolvedPublic

Description

Documentation: https://wikitech.wikimedia.org/wiki/WebPageTest

Background

Today we collect performance metrics using RUM. That is super and helps us keep track of performance trends. Running our own synthetic testing (automatically testing pages in browsers) will help us find performance problems related to specific browsers, giving us better instruments when analyzing (and talking about) performance: HAR files, SpeedIndex (currently the best way to show that the above-the-fold content has loaded) & videos.

What makes WebPageTest especially good is that it's open-source and can handle Internet Explorer, Firefox, Chrome & Safari.

Why our own instance?

There's a public instance of WebPageTest where the limit is 200 page views per day. If we test each page nine times with both a first and a repeat view (18 page views per URL), we can only test about 10 pages once a day. That is too low. Running our own instance also removes the public instance's limit of 9 runs per URL.

Setup our own instance(s)

WebPageTest consists of a web server (the main entry point) that can run on Linux/Windows and test agents (which actually run the browser tests) that only run on Windows.

Setting up WebPageTest instances can be quite a lot of work, even though the list of what needs to be done is not too long (https://sites.google.com/a/webpagetest.org/docs/private-instances). WebPageTest is known for its lacking/outdated documentation.

However, there are ready-made AMIs on Amazon that we could use. That will save us a lot of setup time and will also add automatic up/down scaling of agents. By default, an agent that hasn't been used in one hour will be shut down. It will also add the ability to run agents from different locations, something we can use in the future.

What kind of data will WebPageTest collect?

All the metrics we collect can be public (there are no secrets there; anyone who wants to can collect them). It would be great if the instance could be publicly accessible. I think we should aim for that in the future, but let's first set it up so it works. We can run the instance headless (no GUI for running tests) and use the API with an API key. That way we can automatically run the tests we want, and the results will be public (if you know the URL).
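
As a sketch of what triggering a run through the API could look like with the WebPageTest Node.js API wrapper (the hostname, API key and location below are placeholders, not our real setup):

  const WebPageTest = require('webpagetest');

  // Placeholder hostname and key for our private instance.
  const wpt = new WebPageTest('wpt.example.org', 'API_KEY');

  wpt.runTest('https://en.wikipedia.org/wiki/Barack_Obama', {
    location: 'us-east-1:Chrome', // agent location and browser
    runs: 3,                      // test the URL three times
    firstViewOnly: false,         // collect the repeat view as well
    pollResults: 5                // poll for the result every 5 seconds
  }, function (err, result) {
    if (err) throw err;
    console.log(result.data.median.firstView.SpeedIndex);
  });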

Here's an example of a WebPageTest run using Chrome for https://en.wikipedia.org/wiki/Barack_Obama:
http://www.webpagetest.org/result/150819_W0_19D3/

SPDY problems

Until we change to HTTP2, we will not have everything exactly how we want it with SPDY across the different browsers.

It's like this: WebPageTest has SPDY support for Chrome, meaning it will use SPDY and all the metrics will be right, except for the sizes of individual objects in the HAR file. That doesn't matter so much for us right now, but if we also want to pick up asset sizes from the run, we need a workaround (and don't worry, there is one).

Firefox uses SPDY, but the HAR/waterfall graphs aren't generated because there's no SPDY decoder implemented for Firefox. That will be there when we support HTTP2.

Internet Explorer 11 doesn't support SPDY on older versions of Windows (and that's what WPT uses).

The Safari version running on WPT is roughly 7, meaning it doesn't support SPDY.

What we need to do on a high level

  • Set up our own instance of WebPageTest (security etc.) and mount the logs dir to EBS (how do we do that?)
  • Automate running/triggering tests on WebPageTest (we can use sitespeed.io or the WebPageTest nodejs API wrapper)
  • Define a couple of URLs to start with. I think it's good to keep the list small in the beginning. Talked with @ori: logged in/anonymous users and a couple of pages should do as a start. I think it's important to keep it as simple as possible at the start, just to get something up and running.
  • Define browsers and connectivity. We should use the latest Chrome/Firefox and discuss what to do with Internet Explorer/Safari until the WPT versions use SPDY or we switch to HTTP2. We should also run a browser that will not support SPDY/HTTP2 so we keep track of that too.
  • Decide how many times we want to test each URL.
  • Push the data to Graphite (see the sketch after this list).
  • Find an easy way to map metrics in Graphite to runs in WebPageTest (so that from a specific metric we can easily look up the run in our WebPageTest instance).
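
A minimal sketch of the Graphite step, assuming the plaintext protocol on Graphite's standard port 2003 (the hostname and the exact metric path are made up for illustration):

  const net = require('net');

  // Graphite's plaintext protocol: "<metric path> <value> <unix timestamp>\n"
  function sendToGraphite(path, value) {
    const socket = net.createConnection(2003, 'graphite.example.org', function () {
      socket.end(path + ' ' + value + ' ' + Math.floor(Date.now() / 1000) + '\n');
    });
  }

  sendToGraphite('webpagetest.enwiki.Barack_Obama.anonymous.chrome.firstView.SpeedIndex', 1808);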


Event Timeline

Peter raised the priority of this task to Needs Triage.
Peter updated the task description.
Peter added a project: Performance-Team.
Peter added subscribers: Peter, ori.
ori triaged this task as Medium priority. Aug 20 2015, 12:25 AM
ori updated the task description.
ori set Security to None.
ori moved this task from Inbox, needs triage to Doing (old) on the Performance-Team board.
ori renamed this task from "Setup WebPageTest for synthetic testing" to "Set up WebPageTest for synthetic testing". Aug 20 2015, 6:25 PM
Peter updated the task description.

It would be nice to define how long we will keep the test results, using the auto-deletion feature of S3.
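
A sketch of what such a rule could look like with the AWS SDK for JavaScript (the bucket name, prefix and 30-day retention are made-up values, just to show the shape):

  const AWS = require('aws-sdk');
  const s3 = new AWS.S3();

  // Automatically expire stored test results after 30 days (made-up value).
  s3.putBucketLifecycleConfiguration({
    Bucket: 'wpt-results-example',
    LifecycleConfiguration: {
      Rules: [{
        ID: 'expire-old-wpt-results',
        Status: 'Enabled',
        Filter: { Prefix: 'results/' },
        Expiration: { Days: 30 }
      }]
    }
  }, function (err) {
    if (err) throw err;
  });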

The current example Jenkins job @Peter and I set up today at https://integration.wikimedia.org/ci/job/performance-webpagetest/10/console produces the following statsv URL:

//www.wikimedia.org/beacon/statsv?webpagetest.enwiki.Facebook.anonymous.ie.firstView.SpeedIndex=1808ms&webpagetest.enwiki.Facebook.anonymous.ie.firstView.render=1711ms&webpagetest.enwiki.Facebook.anonymous.ie.firstView.TTFB=344ms&webpagetest.enwiki.Facebook.anonymous.ie.firstView.fullyLoaded=7639ms&webpagetest.enwiki.Facebook.anonymous.ie.firstView.mwLoadStart=1678ms&webpagetest.enwiki.Facebook.anonymous.ie.firstView.mwLoadEnd=2937ms&webpagetest.enwiki.Facebook.anonymous.ie.repeatView.SpeedIndex=1684ms&webpagetest.enwiki.Facebook.anonymous.ie.repeatView.render=1098ms&webpagetest.enwiki.Facebook.anonymous.ie.repeatView.TTFB=237ms&webpagetest.enwiki.Facebook.anonymous.ie.repeatView.fullyLoaded=5887ms&webpagetest.enwiki.Facebook.anonymous.ie.repeatView.mwLoadStart=1024ms&webpagetest.enwiki.Facebook.anonymous.ie.repeatView.mwLoadEnd=5312ms

That's 853 characters (when expanded to https). So we'll need to keep URL size in mind (the current limit is 1000 characters) and potentially split the beacon into multiple requests.
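
A sketch of what the splitting could look like (the 1000-character limit and the statsv endpoint are from above; the batching helper itself is hypothetical):

  const MAX_URL_LENGTH = 1000;
  const BASE = 'https://www.wikimedia.org/beacon/statsv?';

  // Pack "name=value" pairs into as few beacon URLs as possible,
  // each staying under MAX_URL_LENGTH.
  function batchBeacons(params) {
    const urls = [];
    let current = [];
    for (const param of params) {
      const joined = current.concat(param).join('&');
      if (current.length > 0 && BASE.length + joined.length > MAX_URL_LENGTH) {
        urls.push(BASE + current.join('&'));
        current = [];
      }
      current.push(param);
    }
    if (current.length > 0) {
      urls.push(BASE + current.join('&'));
    }
    return urls;
  }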

Peter raised the priority of this task from Medium to High. Oct 5 2015, 11:13 AM

Yes, I think so, but I don't follow why we don't get a straight line. This isn't perfect right now because things get cached on the first hit after we log in (the redirect). To be sure we know whether the size changes for logged-in users, we should measure the whole login step (and not the next access, as we do today). Let me add a task to check out what values we will get then. It would be nice to have a number that is easy to alert on.

Screen Shot 2015-11-07 at 8.12.04 PM.png (518×578 px, 50 KB)

It turns out the bulk of the JavaScript code was loaded outside the critical path, via mw.loader.load(). So the WPT run could fail to pick it up if it terminates too early.

We should change when it terminates then; it's configurable. I'll look into what the default is.
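
For reference, WebPageTest's runtest.php exposes knobs for this: a web10 flag (stop at document complete) and a time parameter (minimum test duration in seconds). A hedged sketch with placeholder hostname and values:

  // Keep recording past the load event so late mw.loader.load()
  // requests are captured. web10=0: don't stop at document complete;
  // time=15: record for at least 15 seconds.
  const testUrl = 'http://wpt.example.org/runtest.php' +
    '?url=' + encodeURIComponent('https://en.wikipedia.org/wiki/Barack_Obama') +
    '&web10=0' +
    '&time=15' +
    '&f=json&k=API_KEY';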

Re https://lists.wikimedia.org/pipermail/wikitech-l/2015-December/084257.html, a big thank you to everyone involved: the new dashboard is great (snapshot for the archives: F3058111).

There was a small scare about a sudden increase in the SpeedIndex value across the board: https://grafana.wikimedia.org/dashboard/db/webpagetest But it was entirely explained by the fundraising banner, which doesn't appear immediately on page load.

I'll note that this is not an artificial performance degradation but a very real one, which we should measure correctly in order to estimate the cost of fundraising. I'm very happy that the speed index is able to measure the degradation caused by centralnotice banners, I'd call it a huge bonus benefit.

The next step would be to either find a way to measure the real-world, in-the-field SpeedIndex, or make WebPageTest simulate average visit behaviour (e.g. X pages opened for each of Y visits in a month over Z wikis) to try to take into account the banner-hiding cookies etc.

@Nemo_bis there's RUMSpeedIndex, which can be used to calculate the SpeedIndex using JavaScript, but we would be shipping a lot of extra bytes to do it, and the values aren't as accurate as when doing it with a video.

However: when I tried it out before, the metrics looked really good for some sites and not so good for others. I haven't spent any time evaluating whether it would work for us.
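
For the record, a minimal sketch of what using it in the page could look like (it assumes rum-speedindex.js is already shipped to the page, and the statsv metric name is made up):

  // RUMSpeedIndex() estimates SpeedIndex in the browser from
  // Resource Timing and first-paint data; rum-speedindex.js has to be
  // shipped to the page first (the "extra bytes" mentioned above).
  window.addEventListener('load', function () {
    setTimeout(function () {
      var si = RUMSpeedIndex();
      // Report it back, e.g. via statsv (metric name is hypothetical).
      new Image().src = '//www.wikimedia.org/beacon/statsv?' +
        'webpagetest.rum.SpeedIndex=' + Math.round(si) + 'ms';
    }, 0);
  });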