The monthly article quality predictions have proven very useful. However, re-generating the data for new dumps is a time-consuming, highly manual process. There should be a job that runs periodically on the Analytics Cluster to keep this dataset up to date.

Here's the one-off dataset: https://figshare.com/articles/Monthly_Wikipedia_article_quality_predictions/3859800

Here's an example of some fun research based on this data: https://commons.wikimedia.org/wiki/File:Demonstrating_the_Keilana_Effect_(OpenSym%2717).pdf
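As a rough sketch of "runs periodically", the job could be as simple as a scheduled entry invoking a regeneration script once the new dump lands. Everything below is hypothetical: the script name, schedule, and output path are placeholders, not existing tooling on the cluster.

```shell
# Hypothetical crontab entry: regenerate article quality predictions on the
# 5th of each month, once the new monthly dump is expected to be available.
# generate_quality_predictions.sh and both paths are illustrative placeholders.
0 4 5 * * /srv/jobs/generate_quality_predictions.sh --dump latest \
    --output /srv/published-datasets/article-quality/$(date +\%Y-\%m).tsv
```

(In crontab syntax, `%` must be escaped as `\%`.) In practice the Analytics Cluster's own scheduling tooling would likely be preferable to plain cron, but the shape of the job is the same: one idempotent script, triggered monthly, publishing a dated output file.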