Fri, Oct 19
Thu, Oct 18
This is done now. I'll add some graphs later on.
Wed, Oct 17
Sometimes I miss a higher-level version of the Chrome changelog: https://chromium.googlesource.com/chromium/src/+log/69.0.3497.100..70.0.3538.67?pretty=fuller&n=10000
Fri, Oct 12
Moved the last one to c5.large (and updated the docs) just now.
Thu, Oct 11
5.3.0 has been released.
I've updated the Firefox server to c5.xlarge and changed enwiki to also run on the c5 series (c5.large). That one is also cheaper and a little bit faster.
Wed, Oct 10
Mon, Oct 8
Enabled them just now. 40 ms/40 points for Speed Index for all tests for now.
This is what it looks like going back 90 days:
It is really one page that has the problem: the Facebook page. I can see that the difference between runs is that sometimes we spend 100 ms+ more in UpdateLayoutTree.
Fri, Oct 5
Thu, Oct 4
Woho! Templates in Grafana were released in 5.3.0 beta 2 (there's a beta 3 out there now). This means that when we upgrade to 5.3.0 we could potentially start using the Grafana API.
Let us focus on:
Let us do T195233 instead
Wed, Oct 3
Tue, Oct 2
I've added a script that stores all error logs (none so far) in the home dir as error.log.
Mon, Oct 1
I deployed on a new server today and updated the docs. Let's keep it running for a while and see what happens to the metrics (I can already see that the new server is faster).
I've added them all but will keep them inactive until I can spot good thresholds: https://grafana.wikimedia.org/dashboard/db/webpagereplay-desktop-alerts
I've enabled them for First Visual Change and Speed Index for enwiki on desktop. I'm going to wait until tomorrow to make sure everything is OK and then enable them for beta, group0 and group1 too.
Fri, Sep 28
A.k.a when I'm happy I feel that every web site is faaast :)
Yep, you are right.
This is better now, let us close it.
I've looked at the RUM metrics and see the same thing there:
I've looked at the RUM data and I'm pretty sure this is a Chrome regression:
Thu, Sep 27
I've deployed Firefox tests permanently on a c4.xlarge (T205246) and it looks good so far.
Fixed this earlier this week. I've removed one c4.large and updated the docs: https://wikitech.wikimedia.org/wiki/Performance/WebPageReplay
The alerts fired but we didn't act on them and then 7 days later it got back to the new "normal":
Mon, Sep 24
I removed the ones that were unused. Please let me know if you need help with setting up new dashboards!
@Gilles I've added the User Timing to the WebPageTest dashboard: https://grafana.wikimedia.org/dashboard/db/webpagetest-drilldown?panelId=47&fullscreen&orgId=1
Or rather, just move the enwiki tests to one of the others; that's the easiest way to test.
It doesn't depend on browser version. Let me try to deploy on another server.
There's a bug in the script I created; I'll fix that upstream in Browsertime. For now I added an extra element, the CentralNotice div. I'll test it out on the server running the English Wikipedia.
Sep 22 2018
Sep 21 2018
Thank you @Aklapper !
Sep 20 2018
Yep, the metrics are more unstable with Chrome 69 on the Chinese wiki measured in time (I haven't calculated it as a percentage). For all wikis the mdev is higher with 69, but most wikis (I think all except the English Wikipedia) still have stable medians.
I've updated the settings: removed the ones that weren't used and added new ones that are interesting for us.
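To illustrate how a metric can keep a stable median while its spread grows, here is a minimal sketch. The numbers are made up for illustration and are not measurements from these tests. The sample standard deviation is used as a stand-in for the spread, because the exact definition of mdev is not given here.

```python
import statistics


def summarize(runs):
    """Return the median and the sample standard deviation of one metric's runs."""
    return {
        "median": statistics.median(runs),
        # Assumption: stdev is a stand-in for the spread; it may differ from mdev.
        "stdev": statistics.stdev(runs),
    }


# Hypothetical values (ms), invented for illustration only.
before = [1000, 1010, 990, 1005, 995]
after = [1000, 1100, 900, 1050, 950]

summary_before = summarize(before)
summary_after = summarize(after)

print("before:", summary_before)
print("after: ", summary_after)
```

In this made-up case both medians are the same while the spread is clearly larger in the second set, which is the pattern described above.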
Sep 19 2018
I've run more tests now and made sure the only change is 69. It seems that 69 spends more time creating the layout:
Layout : 1214.3 vs Layout : 1836.0
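For a sense of scale, the two Layout values can be turned into a relative increase. This is a small sketch that assumes the first number is from before Chrome 69, the second is from 69, and both use the same unit (the unit is not stated here).

```python
def relative_increase(before, after):
    """Return the increase from before to after as a percentage of before."""
    return (after - before) / before * 100


# Values taken from the comparison above; which one is "before" is an assumption.
layout_before = 1214.3
layout_after = 1836.0

increase = relative_increase(layout_before, layout_after)
print(f"Layout increased by {increase:.1f}%")  # Layout increased by 51.2%
```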
Couldn't see any diff in the metrics (that last green line is the change with orange -> white):
Only one run on that dashboard though.
Hmm, I wonder... Checking the Chinese wiki, the increase in metric instability happened exactly when we pushed Chrome 69:
Looking at other wikis, it seems pretty clear that 69 introduced something. The only change in the Docker container at that moment was the upgrade to 69 (and FF to 62). The blue annotation line in the graph is when we updated to 69: