Productionize monthly article quality prediction datasets
Open, Medium, Public

Description

The monthly article quality predictions have proven very useful. However, re-generating the data for new dumps is a time-consuming and highly manual process. There should be a job that runs periodically on the Analytics Cluster to keep this dataset up to date.

Here's the one-off dataset:
https://figshare.com/articles/Monthly_Wikipedia_article_quality_predictions/3859800

Here's an example of some fun research that is based on this data:
https://commons.wikimedia.org/wiki/File:Demonstrating_the_Keilana_Effect_(OpenSym%2717).pdf

Event Timeline

Restricted Application added a subscriber: Aklapper.

@JAllemandou was working on this. Maybe there's already another task in place for it?

fdans lowered the priority of this task from High to Medium.
fdans moved this task from Incoming to Wikistats on the Analytics board.
Vvjjkkii renamed this task from Productionize monthly article quality prediction datasets to uxcaaaaaaa. Jul 1 2018, 1:09 AM
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from uxcaaaaaaa to Productionize monthly article quality prediction datasets. Jul 2 2018, 4:36 PM
CommunityTechBot lowered the priority of this task from High to Medium.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.

Ping us if you need any help from our side.