This project is output 4.3 of the Audiences 2018–19 annual plan:
Instead of implementing programs that attempt to affect all wikis at the same time, a given Audiences program commonly focuses on a group of wikis, such as mid-size wikis or large wikis. Given that we focus our work on groups of wikis, we should be able to report out using those groupings. The outputs here are evolving sets of segmentations that classify wikis into groupings relevant to the Audiences department's work. These will be used to align strategic planning, program focus, and reporting on Audiences department impact, making it possible to report out using the same groupings as we use in our daily work.
See also the project brief [Wikimedia Foundation only].
Timeline / work plan
Time constraint: As this is an annual plan goal, it needs to be finished by June 2019 at the latest. However, it should be done much earlier than that: it's a strategic priority, and the sooner it's completed, the sooner people can incorporate it into their thinking.
Phase 1 ✓
Phase 2 (T221563)
Recommend a standard set of key dimensions, with standard classes for each (for example, monthly active editors might be a dimension, with low being 0–49, medium being 50–499, and high being 500+). We don't want too many dimensions (about 6 is right) or too many classes per dimension (3–4 is about right).
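As a rough illustration of what one dimension with standard classes could look like in practice, here is a minimal Python sketch. It uses only the example thresholds given above for monthly active editors (low 0–49, medium 50–499, high 500+). The function names, the wiki names, and the editor counts are hypothetical and exist only for illustration; the real dimensions and class boundaries are what this phase is meant to recommend.

```python
from bisect import bisect_right

# Lower bounds of the "medium" and "high" classes, taken from the
# monthly-active-editors example in the text (assumed, not final).
ACTIVE_EDITOR_BOUNDS = [50, 500]
ACTIVE_EDITOR_CLASSES = ["low", "medium", "high"]


def classify(value, bounds, labels):
    """Return the label of the class that `value` falls into.

    `bounds` holds the sorted lower bounds of every class except the
    first, so there must be exactly one more label than bounds.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    if len(labels) != len(bounds) + 1:
        raise ValueError("need exactly one more label than bounds")
    return labels[bisect_right(bounds, value)]


def classify_active_editors(count):
    """Classify a wiki by its monthly active editor count."""
    return classify(count, ACTIVE_EDITOR_BOUNDS, ACTIVE_EDITOR_CLASSES)


# Hypothetical wikis with made-up editor counts.
example_counts = {"wiki_a": 12, "wiki_b": 50, "wiki_c": 7300}
segments = {name: classify_active_editors(n) for name, n in example_counts.items()}
print(segments)
```

Keeping the boundaries in a plain list means another dimension only needs its own list of bounds and labels, which fits the aim of a small number of dimensions with 3–4 classes each.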
Phase 3 (T203033)
Use unsupervised learning to try to cluster the wikis into meaningful groups that we can name, describe, and adopt as the standard groups for understanding our wikis. The results are uncertain, because it's hard to predict whether unsupervised learning will produce meaningful results, but there's only one way to find out!
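To make the idea of clustering wikis concrete, the sketch below runs plain k-means (Lloyd's algorithm) in pure Python. The plan above does not name a clustering method, so k-means is an assumption chosen for simplicity, as are the two features, the log scaling, the choice of three clusters, and all of the wiki names and numbers, which are made up.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    # Deterministic start: take k points spread evenly through the input.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


# Hypothetical wikis: (monthly active editors, article count). Made-up data.
wikis = {
    "small_a": (5, 800),
    "small_b": (12, 2_000),
    "mid_a": (150, 60_000),
    "mid_b": (300, 120_000),
    "large_a": (4_000, 1_500_000),
    "large_b": (9_000, 2_500_000),
}

# Log-scale the features so the largest wikis do not dominate the distances.
features = [tuple(math.log10(v) for v in metrics) for metrics in wikis.values()]
labels = kmeans(features, k=3)

for name, label in zip(wikis, labels):
    print(name, "-> cluster", label)
```

The cluster indices carry no meaning of their own; naming and describing each group is still a manual step. On real data, the features chosen, their scaling, and the number of clusters would all need to be checked against whether the resulting groups make sense.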