Wed, Sep 20
I've added a graph in the drilldown dashboard:
I've added inactive alerts for SpeedIndex and firstVisualChange as a start.
I really like this new view. The gap between 95 and 99% is interesting; it's kind of a quality metric showing us where we have room for improvement :)
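A minimal sketch of how that gap could be computed from raw samples, assuming a nearest-rank percentile. The function names are hypothetical, and the dashboard may well calculate its percentiles differently:

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # p and len() are integers, so the division is exact for whole-number ranks.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def tail_gap(values, low=95, high=99):
    """Difference between the high and low percentile (hypothetical helper)."""
    return percentile(values, high) - percentile(values, low)
```

A small gap means the slowest page views look much like the rest; a large gap points at a tail that is worth investigating.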
I've added a visual complete graph:
This is done in WPT and up and running. I'll add new graphs today.
Tue, Sep 19
https://github.com/WPO-Foundation/wptagent/issues/56 is a known issue. We can help debug it by turning on the log, but at the moment the WebPageTest wrapper API doesn't support that, so I'll start by seeing if I can just add it. Then I need to rerun the tests, collect all the logs and analyze them.
Mon, Sep 18
So ... Benedikt reached out today and he will publish the code tomorrow. That means we could move on with either WebPageReplay or mahimahi.
I've added https://github.com/WPO-Foundation/wptagent/issues/56 for the problem that the tests take such a long time and then https://github.com/WPO-Foundation/wptagent/issues/57 about the best practice for updating the browser versions (right now it is locked to the one from when the AWS image was created).
This is kind of worrying. I'm testing https://gerrit.wikimedia.org/r/#/c/378658/ - 9 URLs per script, one run per script (3 scripts). So it should test 27 URLs. Testing the first 9 takes 6 min but then something happens: either I don't get them through or the full test takes over an hour. I'll file a couple of upstream bugs during the day. I think the problem is caused by a "smart" check that verifies that the CPU is not running too high before we start the next test.
The idea was that we didn't need to change coal at the same moment (to pick up the new metrics), but we missed something; I'm not familiar with exactly how it works.
I've tested the new version of WebPageReplay:
I switched to use Windows. Even though it seems we have the same problem there, I'll not investigate it since we will only run this for a couple of weeks. If we wanna test from different locations in the future then we need to look into it.
Fri, Sep 15
No, it is the same thing with Windows. Four runs take over one hour. I'll retest again on Monday. Maybe it's something going on with having the server running in one location and the agents in others (far) far away. hmm.
So I've been testing on Linux now and it looks kind of worrying. I'll switch to Windows and test there.
Thu, Sep 14
The TTI isn't working for us on our Windows agent, and it seems that we are not the only ones with the problem: https://github.com/WPO-Foundation/webpagetest/issues/907
Here's info about the new version of WebPageReplay: https://docs.google.com/document/d/1EZSgnZnkaHOK6IxAvhmt7fEBW0eZjvPiYw4sZyUspU0/edit#heading=h.5jheelvzm4w2
Wed, Sep 13
It looks like 95 vs 99 is the way to see the diff between banners. We should probably look at 95% (good find @Gilles)
Let us close this for now and then create a new task (or re-open this one) if we start to create our own mobile performance testing.
Could https://github.com/Mozilla-TWQA/Hasal help us?
Tue, Sep 12
We should just hope a proxy will help us.
Let us wait for FF 57; if it is still an issue we can file a bug for Firefox.
This is built into mahimahi. If we use WebPageReplay we can do it based on the OS. For Linux we can use netem.
We will not get any extra information out of this: The HTML is never cached. JS/CSS for 5 minutes, images never.
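For the Linux case, a rough sketch of what a netem setup could look like: a small helper that builds a `tc qdisc ... netem` command with added delay and optional packet loss. The helper name, the interface name and the values are assumptions for illustration only, and the command itself needs root to run:

```python
def netem_command(interface, delay_ms, loss_percent=0.0):
    """Build (but do not run) a tc/netem command that adds latency and optional loss.

    Hypothetical helper: interface and values are placeholders, not our real config.
    """
    cmd = ["tc", "qdisc", "add", "dev", interface, "root", "netem",
           "delay", f"{delay_ms}ms"]
    if loss_percent > 0:
        cmd += ["loss", f"{loss_percent}%"]
    return cmd


if __name__ == "__main__":
    # Print the command instead of executing it, since it changes network settings.
    print(" ".join(netem_command("eth0", 100, loss_percent=1.5)))
```

Returning the argument list keeps the sketch side-effect free; it could be passed to `subprocess.run` on a test agent once the values are agreed on.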
This is the feed: https://phabricator.wikimedia.org/phame/blog/feed/7/
Let us see how we can implement this for Marvin.
We know it but there's nothing we will do about it for now.
Moving to proxy in the future.
We can just skip logging the URL when it fails. Everything else works ok.
Let us push Performance Inspector first and then think about potential changes.
Maybe we can add this to the Performance Inspector?
Let us add this as a beta feature as a start.
The mwLoadStart doesn't give us any extra value since we made the loading async. Instead we should change the mwLoadEnd to be relative to when the page starts to load. Let me update the description later.
Let me try Windows vs Linux and make sure Linux has the same run time. When I fire away the tests for Linux it seems like it hangs ...
I think realistically we can wait with this until after FF 57 is released and then ask for help if the problem still exists.
I've changed those to the Linux instance; I think that will work fine and be a good test for us. We can make this run maybe every 4 hours or something like that, then we can have Mumbai and Tokyo run at the same time.
We don't act on the values from WebPageTest.org so we can just close this.
Let us implement https://phabricator.wikimedia.org/T164422#3407116 on a new Linux instance.
@Jhernandez it's because of the banner, check the screenshots http://wpt.wmftest.org/video/compare.php?tests=170912_KZ_6J,170829_Y5_R5 (comparing before September and current).
Seems like adding long tasks will break Firefox at the moment: https://bugzilla.mozilla.org/show_bug.cgi?id=1398477
Wed, Sep 6
The internal URLs are disabled now, so I can move on with the testing.
Tue, Sep 5
If we have time at the offsite we could sit down and check how we test this locally with WebPageTest or Browsertime to get more reliable metrics (running X number of tests, taking the median etc.) and write about it on Wikitech. It would be nice to have a workflow for testing things locally. Or maybe this should actually be a task for us?
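The "run X tests and take the median" step could be sketched like this, assuming each run has already been reduced to a dict of metric name to value. The input shape and the function name are assumptions, not the WebPageTest or Browsertime output format:

```python
from statistics import median


def summarize_runs(runs):
    """Return the median of each metric across runs.

    runs: list of dicts mapping metric name -> numeric value for one run
    (hypothetical shape). A metric missing from a run is skipped for that run.
    """
    metrics = sorted({name for run in runs for name in run})
    return {
        name: median(run[name] for run in runs if name in run)
        for name in metrics
    }
```

The median is used rather than the mean so that a single slow or broken run doesn't skew the result.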
Most of the bugs are fixed, but we still have "Internal FF URLs are picked up (https://tracking-protection.cdn.mozilla.net/...)" - https://github.com/WPO-Foundation/wptagent/issues/40 that blocks me from more testing.
Mon, Sep 4
And the final bugs for now:
Second view picks up first URL for Firefox http://wpt.wmftest.org/result/170904_MC_AB/1/details/#waterfall_view_step1:
Next problem: When you login a user to Firefox we get an extra request, look at that second request:
- The things I've seen so far testing Firefox on Linux
Fri, Sep 1
It worked now (spinning up a Linux instance with FF) so I'll continue to verify that the data seems ok next week.
It's been updated now, so I'll update the server (by making sure we get the latest /var/www/webpagetest/www/settings/ec2_locations.ini from Github and setting c4.large as default).
I've been testing the Firefox version locally this week and it seems ok. I got some hiccups with SpeedIndex/visual metrics but I think that could be Mac OS X related. I've filed an issue for adding Firefox to the ec2 instances https://github.com/WPO-Foundation/webpagetest/issues/930 (I had missed that they lacked Firefox).
Tue, Aug 29
Just adding how I would look into this in WebPageTest to make sure it is ok:
It's also broken for mobile. The API gets a 500, but when I access our server http://wpt.wmftest.org/result/170829_V3_8K/ it looks good. I'll update the task. But then maybe we don't need this. In the beginning we had the idea that the devices on WPT give us something extra, but they vary too much, so in the long run we need to take it home or have a way to use dedicated devices.
Hmm, this seems more broken than before. I need to check that I didn't miss anything when I updated the conf before the summer. Running from my machine, setting location to "Dulles" gives me
Let me try this again, it's been a long time. If it still doesn't work I'll ping the original issue.
Yep let us close this.
I haven't seen this since I got back from vacation (a week ago). Maybe it was fixed when we moved to the dedicated slave(s)?
This will not happen if we don't do it ourselves. We could either do it or close it and add it as a downside of using WebPageTest.
I pinged the Github issue again today.
This happened again some time ago: T173362
Mon, Aug 28
I added the steps here for running on Mac: https://wikitech.wikimedia.org/wiki/WebPageTestLocal
I finally got an agent and server working locally on my Mac so I could send my first PR fixing the HAR in Firefox. I'll document my setup on Wikitech.
Wed, Aug 23
Aug 23 2017
I forgot to add that I tested earlier this week and Firefox works now (at least working as in running), so that's pretty cool. A first step to test it out would be to set up a new agent and run the same tests under another key so we can watch that everything is ok.
Yes, you are always one step ahead :) I have most things ready but will wait until the tests land, so I can add my own and make sure I don't break anything. Thanks for the help @Krinkle
Aug 21 2017
Server will report absolute values relative to fetchStart instead of navigationStart.
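A minimal sketch of that rebasing, assuming the input is a dict of Navigation Timing style absolute timestamps in milliseconds where 0 means "not set". The function name and input shape are hypothetical, and this is not the actual server code:

```python
def rebase_timings(timing, origin="fetchStart"):
    """Express each absolute timestamp relative to the chosen origin mark.

    timing: dict of mark name -> absolute timestamp in ms (hypothetical shape).
    Marks with a value of 0 are treated as unset and left out.
    """
    base = timing[origin]
    return {
        name: value - base
        for name, value in timing.items()
        if value > 0
    }
```

With `origin="fetchStart"`, time spent before the fetch starts (redirects, unload of the previous page) no longer counts towards the reported values, and marks that fall before the origin, such as navigationStart, come out negative.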
Synced with Timo last Friday and I will start with this now. A couple of things when I started to look into it: