This could probably be broken up further, but here are a few ideas to investigate from our weekly relevance meeting. The goal is to get a better understanding of our hyperparameter search space, and what it should be. One additional difficulty is that this might need to be tested on both xgboost and lightgbm, as the two may behave differently.
* Experiment with varying the number of trees for small wikis: does the current setting of 500 help? Is it better to have more trees with fewer leaves, or fewer trees with more leaves? This also needs to be tried on different dataset sizes. Training params would be total_leafs and number of trees, with num_leafs = total_leafs / num_trees
* Train on the same data with the same hyperparameter space multiple times (3? 5?) to get an idea of the expected variance between runs
* Train with more hyperopt iterations (300? 500?) to see whether continued search is beneficial.
* Expand the space searched for individual hyperparameters until it is clear that quality degrades at the edges. Some parameters seem to be "bumping up" against their limits.
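
A rough sketch of the bookkeeping behind the ideas above, in plain Python with no xgboost/lightgbm dependency. The helper names (`leaf_budget_grid`, `run_spread`, `at_edge`), the 5% edge tolerance and the example budget of 10000 total leaves are all assumptions for illustration, not anything we currently have. Only num_leafs = total_leafs / num_trees and the 500-tree setting come from the notes above.

```python
import statistics


def leaf_budget_grid(total_leafs, tree_counts):
    """Pair each candidate num_trees with num_leafs = total_leafs / num_trees."""
    grid = []
    for num_trees in tree_counts:
        # Integer division, since a tree cannot have a fractional leaf count.
        num_leafs = total_leafs // num_trees
        # Skip combinations where a tree could not make even one split.
        if num_leafs >= 2:
            grid.append({"num_trees": num_trees, "num_leafs": num_leafs})
    return grid


def run_spread(scores):
    """Mean and sample standard deviation of one metric over repeated runs."""
    return statistics.mean(scores), statistics.stdev(scores)


def at_edge(best, low, high, tol=0.05):
    """True if the best value found sits within tol of either search bound."""
    margin = tol * (high - low)
    return best <= low + margin or best >= high - margin


# Hypothetical fixed budget of 10000 leaves split across different tree counts.
for point in leaf_budget_grid(10000, [100, 250, 500, 1000]):
    print(point)
```

Each grid point would then be trained several times on the same data, with the resulting metric values passed to `run_spread` to estimate run-to-run variance. `at_edge` is meant to be applied to the best value hyperopt reports for each parameter: a `True` result suggests that parameter's range should be widened before the next search.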