I don't think we can do a full end-to-end test. Getting the enwiki index along with the other language indexes into our hypothesis-testing cluster is probably a bit too much to ask of it. What we probably can do is detect the language of zero-result queries from enwiki and import only the top 2-4 most relevant indexes.
So basically:
- Extract some number of zero-result queries from the enwiki request logs (ideally a large enough sample that it contains a useful number of foreign-language queries)
- Run all those queries against the language detection plugin and come up with a list of the most relevant indexes to import
- Import the relevant indexes to the hypothesis-testing cluster
- Run the queries through the language detector again, this time also running each query against the index suggested for its detected language, and report the results.
Or something like that, adjust as needed.
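As a rough sketch of the second step, picking which indexes to import could look something like the Python below. Everything named here is an assumption for illustration: `detect` is a stand-in for whatever the language detection plugin returns, `top_indexes` is a made-up helper, and the `<lang>wiki_content` index naming is assumed rather than checked against the cluster.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional, Tuple


def top_indexes(
    queries: Iterable[str],
    detect: Callable[[str], Optional[str]],
    source_lang: str = "en",
    top_n: int = 4,
) -> List[Tuple[str, int]]:
    """Count detected languages and return the top_n candidate indexes.

    `detect` is a hypothetical wrapper around the language detector; it
    should return a language code, or None when it cannot decide.
    """
    counts: Counter = Counter()
    for query in queries:
        lang = detect(query)
        # Skip undetected queries and ones detected as the source wiki's
        # own language, since that index is the one that returned nothing.
        if lang is None or lang == source_lang:
            continue
        counts[lang] += 1
    # Index naming is an assumption; adjust to the real cluster layout.
    return [(f"{lang}wiki_content", n) for lang, n in counts.most_common(top_n)]


if __name__ == "__main__":
    # Stub detector standing in for the real plugin, for demonstration only.
    fake_results = {"q1": "ru", "q2": "ru", "q3": "es", "q4": "en", "q5": None}
    for index, hits in top_indexes(fake_results, fake_results.get, top_n=2):
        print(index, hits)
```

The counts returned alongside each index name also give a quick sanity check on whether the sample contained enough foreign-language queries to be worth importing that index at all.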