Create regression test sets for use with Relevance Lab: 1K, 5K, and 10K sets for each of the top 10 wikis by query volume (en, es, de, fr, pt, nl, ja, ru, it, and pl).
For a given wiki, each larger set is a superset of the smaller ones, and the naming format is Wiki.Size.YearMonth.
These corpora will also be available for quick sub-sampling for other Relevance Lab tests.
For now they will reside in stat1002:~tjones/rel_lab/corpora/regression
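
One way to guarantee the superset property described above is to draw the largest sample once and take prefixes of it for the smaller sets. The following is a minimal sketch, not the script actually used to build these corpora: the function names, the seed, and the exact spelling of each field in the Wiki.Size.YearMonth name (for example "1K" for 1000) are assumptions made for illustration.

```python
import random


def nested_samples(queries, sizes=(1000, 5000, 10000), seed=0):
    """Return {size: sample}; every larger sample contains each smaller one.

    Hypothetical helper: `queries` is assumed to be a list of query strings
    for a single wiki.
    """
    sizes = sorted(sizes)
    if sizes[-1] > len(queries):
        raise ValueError("not enough queries for the largest set")
    rng = random.Random(seed)  # fixed seed so the sets are reproducible
    # Draw the largest sample once; its prefixes form the smaller sets.
    largest = rng.sample(queries, sizes[-1])
    return {size: largest[:size] for size in sizes}


def set_name(wiki, size, year_month):
    """Build a Wiki.Size.YearMonth name (field spellings are assumed)."""
    return f"{wiki}.{size // 1000}K.{year_month}"
```

Because the smaller sets are prefixes of a single random draw, each is itself a uniform random sample. The same approach would serve for the quick sub-sampling mentioned above: sample from an existing corpus rather than from the full query log.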