To get an idea of what to expect when we roll out learning to rank, we need to run a load test. This can be based on the previous load-testing work, T117714.
Tests to run:
baseline 100%, 150%, 200%
100 tree model, 10 features, 1024 rescore window, enwiki and dewiki: 100% 150% 200%
100 tree model, 10 features, 4096 rescore window, enwiki and dewiki: 100% 150% 200%
100 tree model, 10 features, 1024 rescore window, top 10 wikis by volume: 100% 150% 200%
100 tree model, 10 features, 4096 rescore window, top 10 wikis by volume: 100% 150% 200%
500 tree model, 10 features, 1024 rescore window, enwiki and dewiki: 100% 150% 200%
500 tree model, 10 features, 4096 rescore window, enwiki and dewiki: 100% 150% 200%
500 tree model, 10 features, 1024 rescore window, top 10 wikis by volume: 100% 150% 200%
500 tree model, 10 features, 4096 rescore window, top 10 wikis by volume: 100% 150% 200%

Variations to run:
* (3) original speed, 150% playback speed and 200% playback speed
* (2) 100 tree and 500 tree models
* (2) ltr on enwiki and dewiki, and ltr on top 10 wikis by search volume
* (2) 1024 rescore window and 4096 rescore window
* (2) original retrieval query and simplified retrieval query using the all field

In total that's 3*2*2*2*2 = 48 tests to run. Since each test replays 40 minutes of input data, a set of three speeds takes roughly 40 min + 30 min + 20 min = 90 min = 1.5 hours. This can be mostly automated, although someone should keep an eye on things to make sure we don't overload the cluster. Call it 2 hours per set of 3 speeds, and it's about 32 hours worth of testing. Thankfully it's mostly hands-off testing.
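The test-count and time-budget arithmetic above can be sanity-checked with a short script. This is only a sketch: the dimension names are labels taken from the plan, not flags of any actual load-testing tool.

```python
from itertools import product

# Variation dimensions from the plan (labels only, not tool parameters).
speeds = ["100%", "150%", "200%"]           # playback speeds
models = ["100 tree", "500 tree"]           # model sizes (10 features each)
wikis = ["enwiki+dewiki", "top 10 wikis"]   # target wiki sets
windows = [1024, 4096]                      # rescore windows
queries = ["original", "simplified"]        # retrieval query variants

tests = list(product(speeds, models, wikis, windows, queries))
print(len(tests))  # 3*2*2*2*2 = 48 tests

# Each set of three speeds replays 40 minutes of traffic:
# 40 min at 100%, roughly 30 min at 150%, 20 min at 200% = ~90 min,
# call it 2 hours per set once setup/monitoring overhead is included.
sets_of_three = len(tests) // len(speeds)   # 16 sets
print(sets_of_three * 2)  # ~32 hours of testing at 2 h per set
```

Printing the `tests` list (or feeding it to whatever drives the replay) also gives an easy checklist for tracking which combinations have already been run.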