Move to bare metal servers for performance tests
Closed, ResolvedPublic

Description

In T203060 we saw that bare metal servers would give us more stable metrics over time. I would like us to think about how many servers we want and where we want to have them, so we can check the budget. When I ran the test I used a Swedish company called Glesys, and the server specs were:

mobo: x9scd-f
cpu: e3-1230
ram: 1x8gb ddr3
storage: 1x 480gb ssd
8 CPUs

One thing that is important is that we need to run on Ubuntu (not Debian) so we can use Chrome instead of Chromium for our tests.

And I think we should run on something similar to the spec above.

Today on AWS we have:
1 server running Graphite (and I think we could keep that)
3 servers running WebPageReplay tests and user journey tests using sitespeed.io directly against Wikipedia
1 server running a WebPageTest agent
1 small server running a WebPageTest server

For the WebPageReplay servers, I was thinking we could host them in our data centre, but I'm not 100% sure, since I guess we would need to formalise the containers we run? Right now we just use the current sitespeed.io release, meaning there's no overhead from building new containers.

For WebPageTest, I'm thinking we should shut it down, but we should discuss it; see T302279.

Let's all try to sync on what tests we want to run, and how, over the next year.

Event Timeline

I think what I need to know is whether we can have servers running in our DC on an isolated network; maybe you know this, @dpifke? Those servers need to be able to connect to the internet, but we do not want them connected in any way to the rest of our servers.

We should talk through what we would hope to accomplish by not having them connected to the rest of the DC.

I think we want them puppetized and managed like the rest of the fleet. This would be a vast improvement over the current AWS setup, and require less effort on our part, by letting us leverage the existing infrastructure tooling.

We can talk with SRE about the network details, but I'm dubious that any "isolated network" setup would be any fewer hops between the test server and the front-ends, or (even if it is an extra hop or two) that those hops would reduce the fidelity of the test results in any way.

Also, from a practical standpoint, I think we're much more likely to get approval to add these if we DON'T require a bunch of custom network setup and ongoing maintenance/complexity burden.

The blog post has been done for a couple of months (but not published), so I think we can close this task anyway.