Script to generate questions from poor readability scores
Closed, Resolved, Public

Description

From the articles in the backlog category for articles needing copy-edit, choose the ones having low Flesch-Kincaid (FK) readability scores and high page view counts (T140568). Create questions out of the chosen articles.

Event Timeline

prnk28 created this task. Jul 17 2016, 1:18 PM
Restricted Application added subscribers: Zppix, Aklapper. Jul 17 2016, 1:18 PM
prnk28 updated the task description. Jul 17 2016, 3:51 PM
prnk28 updated the task description. Jul 17 2016, 8:37 PM

The script extracts the articles in the category of articles needing copy edit. It then computes each article's Flesch-Kincaid readability score and fetches its page view count. Each of the two measures is standardized, and the standardized values are added to give the article's final combined score. The final ranking favors high page view counts and poor readability scores. The top 20% of articles are retrieved and made into questions.
Commit is here.
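
The scoring and ranking steps described above can be sketched roughly as follows. This is a minimal illustration and not the code from the commit: the function names, the input tuple layout, and the use of z-scores for "standardizes" are assumptions. It also assumes the readability measure is the Flesch reading-ease score (where a lower score means harder text), so the readability z-score is subtracted rather than added so that poor readability raises the combined score.

```python
from collections import OrderedDict
from statistics import mean, pstdev


def flesch_reading_ease(words, sentences, syllables):
    """Flesch reading-ease score from raw counts (lower = harder to read).

    Assumption: the task may use a different Flesch-Kincaid variant.
    """
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def standardize(values):
    """Return z-scores; all zeros if the values have no spread."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_articles(articles, top_fraction=0.2):
    """Rank articles and keep the top fraction.

    articles: list of (title, link, pageviews, fk_score) tuples
              (hypothetical layout chosen for this sketch).
    Returns an OrderedDict: title -> [link, pageviews, fk_score, combined].
    """
    if not articles:
        return OrderedDict()
    pv_z = standardize([a[2] for a in articles])
    fk_z = standardize([a[3] for a in articles])
    scored = []
    for (title, link, views, fk), zp, zf in zip(articles, pv_z, fk_z):
        # Assumption: low readability score = poor readability, so subtract it.
        combined = zp - zf
        scored.append((title, [link, views, fk, combined]))
    # Highest combined score first: many views and poor readability.
    scored.sort(key=lambda item: item[1][3], reverse=True)
    keep = max(1, round(len(scored) * top_fraction))
    return OrderedDict(scored[:keep])
```

Subtracting the readability z-score is only one way to make "poor readability" count in an article's favor. If the script uses a grade-level score, where higher means harder, the two z-scores would be added instead.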
Related Python scripts and generated files are:
copy_edit.py - main script that does the computations and generates questions
syllables_en.py - helper script to get syllables in a piece of text
utils.py - helper script to get words in a sentence, syllable count, sentence count, etc
copy_edit_ranking.pkl - generated pickle file of the final article rankings, stored as an ordered dict of the form:
dict[title] = [link, pageview, fk score, added standardized score]
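
For anyone who wants to inspect the rankings file, the sketch below shows one way to write and read an ordered dict with the layout above using the standard `pickle` module. The helper names `save_ranking`, `load_ranking`, and `top_titles` are hypothetical and are not taken from the commit.

```python
import pickle
from collections import OrderedDict


def save_ranking(ranking, path):
    """Write the ranking (title -> [link, pageview, fk score, combined]) to disk."""
    with open(path, "wb") as fh:
        pickle.dump(ranking, fh)


def load_ranking(path):
    """Read the ranking back; insertion order (best first) is preserved."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def top_titles(ranking, n):
    """Return the first n titles in ranking order."""
    return list(ranking)[:n]
```

Calling `load_ranking("copy_edit_ranking.pkl")` and then `top_titles(ranking, 10)` would list the ten highest-ranked articles, assuming the file was written in ranked order. Only unpickle files from a trusted source, since `pickle.load` can execute arbitrary code.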

prnk28 closed this task as Resolved. Jul 29 2016, 6:21 PM