Build an AI that categorizes pages into a basic set of categories.
We should be able to process an article draft and automatically suggest categories. This could be used for curating new pages (e.g. stubs and other new articles) and for routing new page creations toward the subject matter experts (WikiProjects) who would be most interested in reviewing them.
Given a version of an article, predict the likelihood that the article will eventually be tagged by a particular WikiProject within a mid-level category.
Output:
```
{
"medicine": 0.951,
"history": 0.342,
"biology": 0.522,
"chemistry": 0.233,
...
}
```
We'd need to set up a [multilabel classification model](http://scikit-learn.org/stable/modules/multiclass.html) that can predict a set of output classes for each article. It looks like sklearn's [RandomForestClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier) supports this by default.
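As a rough illustration of the idea, here is a minimal sketch of a multilabel setup with sklearn. Everything about the data is an assumption made for the example: the four toy drafts, their labels, the TF-IDF features, and the `suggest_categories` helper are hypothetical and are not part of any existing pipeline. Real training data would presumably come from articles already tagged by WikiProjects.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MultiLabelBinarizer

# Made-up toy drafts and labels, for illustration only.
drafts = [
    "The vaccine trial measured antibody response in patients.",
    "The empire collapsed after a long war in the fifth century.",
    "The enzyme catalyzes a reaction inside the cell membrane.",
    "The acid reacts with the base to form a salt and water.",
]
labels = [
    ["medicine", "biology"],
    ["history"],
    ["biology", "chemistry"],
    ["chemistry"],
]

# Assumption: bag-of-words TF-IDF features; any feature set could be used.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(drafts)

# Turn the label lists into a binary indicator matrix (one column per category).
binarizer = MultiLabelBinarizer()
Y = binarizer.fit_transform(labels)

# RandomForestClassifier accepts a 2D indicator matrix as a multi-output target.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, Y)


def suggest_categories(text):
    """Return a {category: probability} mapping for one draft (hypothetical helper)."""
    features = vectorizer.transform([text])
    # For multi-output targets, predict_proba returns one array per category.
    per_label = model.predict_proba(features)
    scores = {}
    for name, proba, classes in zip(binarizer.classes_, per_label, model.classes_):
        class_list = list(classes)
        # Guard against a category that never (or always) occurs in training.
        if 1 in class_list:
            scores[name] = float(proba[0, class_list.index(1)])
        else:
            scores[name] = 0.0
    return scores


scores = suggest_categories("A new drug was tested on patients with the disease.")
print(scores)
```

Each probability is estimated independently per category, so the values do not need to sum to 1, which matches the output format shown above.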
**This task is done when** an initial set of tasks scoping the first steps of this project has been added to the backlog.