The monthly article quality predictions have proven very useful. However, regenerating the data for each new dump is a time-consuming and highly manual process. There should be a job that runs periodically on the Analytics Cluster to keep this dataset up to date.

Here's the one-off dataset:

Here's an example of some fun research that is based on this data: