To get an idea of what to expect when we roll out learning to rank (LTR), we need to run a load test. This can be based on the previous load testing work in T117714.
Variations to run:
- (3) original speed, 150% playback speed and 200% playback speed
- (2) 100 tree and 500 tree models
- (2) LTR on enwiki and dewiki only, and LTR on the top 10 wikis by search volume
- (2) 1024 rescore window and 4096 rescore window
- (2) original retrieval query and simplified retrieval query using the all field
In total that's 3*2*2*2*2 = 48 tests to run. Each test replays 40 minutes of input data, so one set of three speeds takes roughly 40min + 27min + 20min, or about 90 min = 1.5 hours. This can be mostly automated, although someone should keep an eye on things to make sure we don't overload the cluster. Call it 2 hours per set of 3 speeds, and with 16 sets it's about 32 hours' worth of testing. Thankfully it's mostly hands-off testing.
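As a sanity check on the numbers above, here is a minimal sketch that enumerates the test matrix and redoes the time estimate. The variable names and the string labels for each variation are made up for illustration; they are not options of any existing load testing tool.

```python
from itertools import product

# Hypothetical labels for the variations listed above.
SPEEDS = [1.0, 1.5, 2.0]                     # playback speed multipliers
TREES = [100, 500]                           # model size in trees
WIKIS = ["enwiki+dewiki", "top10"]           # where LTR is enabled
RESCORE_WINDOWS = [1024, 4096]
RETRIEVAL = ["original", "simplified_all_field"]

INPUT_MINUTES = 40        # length of the recorded input data
BUDGET_HOURS_PER_SET = 2  # padded estimate per set of 3 speeds

# One "set" is a single configuration replayed at all three speeds.
configs = list(product(TREES, WIKIS, RESCORE_WINDOWS, RETRIEVAL))
total_tests = len(configs) * len(SPEEDS)

# Replaying 40 minutes of input at speed s takes 40 / s minutes.
minutes_per_set = sum(INPUT_MINUTES / s for s in SPEEDS)
total_hours = len(configs) * BUDGET_HOURS_PER_SET

print(f"{len(configs)} sets, {total_tests} tests")
print(f"{minutes_per_set:.0f} min of playback per set")
print(f"{total_hours} hours budgeted in total")
```

The same `configs` list could drive an automation loop, one entry per set, with the three speeds run back to back inside each iteration.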