Goal: we want to be able to find smaller performance regressions and to trust the metrics. Today we use AWS, but there we have problems with noisy neighbours that make our metrics change over time and cause false alerts. Moving the tests to bare metal servers will help us avoid the noisy neighbour problem and thereby find smaller performance regressions that we can trust.
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Peter | T307446 Move to bare metal servers for performance tests
Resolved | | Peter | T311980 Move tests from AWS to bare metal
Resolved | | Peter | T311981 Set up bare metal server at Hetzner for performance tests
Resolved | | Peter | T311983 Setup tests on the new bare metal machine
Resolved | | Peter | T311984 Remove AWS tests
Resolved | | Peter | T311985 Switch alerts to use bare metal servers metrics
Resolved | | Peter | T333524 Pin FFMPEG to specific CPU:s
Resolved | | Peter | T312203 Test running synthetic performance test at hetzner
Resolved | | Peter | T314752 Report number of running processes before and after tests
Resolved | | larissagaulia | T317392 Bare metal at supplier.io or in-house
Event Timeline
During my vacation I moved my personal open source projects from a cloud provider to Hetzner dedicated servers. The move was smooth and took no more than a day or two. I also got rid of S3 and now serve the data directly from one of our servers (that's something we can also do in the future). It's been running flawlessly since. However, I haven't analyzed the stability of the metrics: I can see that it's better than the cloud provider I used to use, but I haven't compared it with the numbers I got when running the POC earlier this year. Comparing those numbers would be a good start before we test out a provider.
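One possible way to compare the stability of two sets of numbers is the coefficient of variation (standard deviation divided by mean) per series. The sketch below is only an illustration of that idea: the function names are hypothetical, and it is not the analysis that was actually run for the POC.

```python
import statistics


def coefficient_of_variation(samples):
    """Return stdev / mean for a series of timing samples.

    A lower value means the metric is more stable relative to its size.
    Needs at least two samples, because statistics.stdev requires that.
    """
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean of samples is zero")
    return statistics.stdev(samples) / mean


def more_stable(samples_a, samples_b):
    """Return 'a', 'b' or 'tie' for the series with the lower relative variation."""
    cv_a = coefficient_of_variation(samples_a)
    cv_b = coefficient_of_variation(samples_b)
    if cv_a < cv_b:
        return "a"
    if cv_b < cv_a:
        return "b"
    return "tie"
```

Feeding it the same metric (for example a median first paint per run) collected on both setups would give a rough first indication of which one is noisier; it says nothing about why.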
Today I turned on all the WebPageReplay tests that we run on AWS so they also run on bare metal. I used the exact same configuration, except that I hacked the start script on the bare metal server to change the Graphite reporting key so it reports under baremetal. If this looks OK, I think this is a good first step; then we can move the WebPageReplay tests to the bare metal server and turn off a couple of AWS servers.
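To illustrate what changing the reporting key means: in Graphite's plaintext protocol each data point is a line of the form `<metric path> <value> <timestamp>`, so changing only the leading part of the path keeps the two sets of metrics side by side. The sketch below assumes that format; the function name, the `aws` prefix and the metric name are hypothetical, and this is not the actual start script.

```python
def graphite_line(key_prefix, metric, value, timestamp):
    """Format one data point for Graphite's plaintext protocol.

    key_prefix is the part of the metric path that differs per environment
    (hypothetical values here: "aws" and "baremetal").
    """
    return f"{key_prefix}.{metric} {value} {timestamp}\n"


# Same test, same metric, reported under two different keys.
aws_point = graphite_line("aws", "firstPaint.median", 512, 1680000000)
baremetal_point = graphite_line("baremetal", "firstPaint.median", 512, 1680000000)
```

Because everything after the prefix is identical, the same dashboards and alert queries can be pointed at either environment by swapping the prefix.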
Change 907715 had a related patch set uploaded (by Phedenskog; author: Phedenskog):
[performance/synthetic-monitoring-tests@master] Remove CPU throttling on mobile (moving to bare metal).
Change 907715 merged by jenkins-bot:
[performance/synthetic-monitoring-tests@master] Remove CPU throttling on mobile (moving to bare metal).
Change 907722 had a related patch set uploaded (by Phedenskog; author: Phedenskog):
[performance/synthetic-monitoring-tests@master] Remove config that now is default and remove emulated CPU throttle.
Change 907722 merged by jenkins-bot:
[performance/synthetic-monitoring-tests@master] Remove config that now is default and remove emulated CPU throttle.