We want to extend Language Identification (via TextCat) beyond Wikipedia to the other wiki projects.
Rather than do a time-consuming manual analysis for each wiki project, we could run an A/B test on some or all projects in a given language, using the default config of that language's Wikipedia (for which the analysis has already been done).
Such A/B tests would show whether the TextCat configs can be shared straightforwardly across projects in the same language. If they can, we could apply language detection to much more of the long tail of wiki projects.