The beta cluster is currently unsuitable for catching many performance problems, because it runs entirely on VMs. It generally won't reveal cache-busting in several of our caching layers either.
Several people are working on Vagrant roles that will let developers test in more realistic development environments, with million-row databases and warm caches. But CPU and other resource constraints still won't be realistic.
So, as one can see in https://www.mediawiki.org/wiki/Performance_profiling_for_Wikimedia_code, right now we can only roughly predict many interactions until the code hits production.
It is not possible to have a testing cluster that exactly mirrors production. And of course it is the developer's responsibility to know the constraints of the production system and how her code exercises that system. And with heterogeneous deployment, it's *possible* to notice some problems while they're only affecting less-trafficked wikis. But some people have expressed interest in creating a more realistic environment to test/predict how efficient code will be when deployed to very-high-traffic wikis, especially those with unique configurations.
As a service developer I want to be more confident that my service will stand up to the load I expect to have placed upon it.
Currently this seems to be done in an ad-hoc manner and in some cases simply skipped. For example, we recently tested the termbox service with some custom Locust scripts. Without a place to run these from, the test ended up being run from a developer laptop, which made it hard to reproduce and raised the barrier to entry.
It would be awesome to have a service that:
- can generate traffic to my service
- can be configured with a range of request content/parameters to fully exercise it
- can be run near the service in question to make latency realistic
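To make the first two points concrete, here is a rough sketch (standard-library Python, not the Locust scripts mentioned above) of a generator that expands a range of parameters into request URLs and fires them with a small worker pool. The function names, the base URL and the parameter names are all made up for illustration; a real setup would more likely wrap an off-the-shelf tool.

```python
import itertools
import time
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def build_urls(base_url, param_ranges):
    """Expand every combination of parameter values into a request URL."""
    names = sorted(param_ranges)
    combos = itertools.product(*(param_ranges[n] for n in names))
    return [
        base_url + "?" + urllib.parse.urlencode(dict(zip(names, combo)))
        for combo in combos
    ]


def http_fetch(url, timeout=10):
    """Issue one GET and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def run_load(urls, fetch=http_fetch, concurrency=4):
    """Fetch every URL with a pool of workers; return (url, ok, seconds) tuples."""
    def timed(url):
        start = time.perf_counter()
        try:
            ok = 200 <= fetch(url) < 400
        except Exception:
            # Timeouts, refused connections and HTTP errors all count as failures.
            ok = False
        return url, ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed, urls))


if __name__ == "__main__":
    # Hypothetical endpoint and parameters, for illustration only.
    urls = build_urls(
        "http://localhost:8080/example",
        {"entity": ["Q1", "Q2"], "language": ["de", "en"]},
    )
    for url, ok, seconds in run_load(urls, concurrency=2):
        print(f"{'ok  ' if ok else 'FAIL'} {seconds:.3f}s {url}")
```

Passing `fetch` as a parameter keeps the sketch testable without a live service. Running it from a host near the service, rather than a laptop, is what would address the latency point.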
It would be even cooler but non-essential if it could:
- have a feature for recording and replaying prod traffic at various speeds
- show pretty statistics for the testing
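As a sketch of what these two extras could boil down to: replaying at a given speed is mostly a matter of rescaling the gaps between recorded request timestamps, and the statistics can start as a handful of latency summaries. Everything here (function names, input shapes, the choice of a nearest-rank 95th percentile) is an assumption for illustration, not an existing tool's behaviour.

```python
import math
import statistics


def replay_offsets(timestamps, speed=1.0):
    """Return seconds-from-start at which to re-send each recorded request.

    speed=2.0 replays twice as fast as recorded; speed=0.5 at half speed.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not timestamps:
        return []
    ordered = sorted(timestamps)
    first = ordered[0]
    return [(t - first) / speed for t in ordered]


def summarize(latencies):
    """Summarize request latencies: count, median, nearest-rank p95, max."""
    if not latencies:
        raise ValueError("no latencies recorded")
    ordered = sorted(latencies)
    p95_index = math.ceil(0.95 * len(ordered)) - 1
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Three requests recorded at t=100s, 101s and 103s, replayed at 2x speed.
    print(replay_offsets([100.0, 101.0, 103.0], speed=2.0))
    # Latencies in milliseconds.
    print(summarize([10, 20, 30, 40]))
```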
Perhaps this could be done by containerising an off-the-shelf load-testing solution and providing a chart for running it.