Productionize monthly article quality prediction datasets

Authored By
Halfak
May 15 2018, 10:44 AM
Size
537 B

The monthly article quality predictions have proven very useful. However, re-generating the data for new dumps is a time-consuming and highly manual process. There should be a job that runs periodically on the Analytics Cluster to keep this dataset up to date.
Here's the one-off dataset:
https://figshare.com/articles/Monthly_Wikipedia_article_quality_predictions/3859800
Here's an example of some fun research that is based on this data:
https://commons.wikimedia.org/wiki/File:Demonstrating_the_Keilana_Effect_(OpenSym%2717).pdf
