Status update (September 28th, 2016)
September 28th, 2016
(This post was copied from https://lists.wikimedia.org/pipermail/ai/2016-September/000102.html)
Hey,
This is the 23rd weekly update from the Revision Scoring team that we have sent to this mailing list.
New development
- We implemented and demonstrated a linguistic/stylometric processing strategy that should give us more signal for finding vandalism and spam[1]. See the discussion on the AI list[2].
- As part of our support for the Collaboration Team, we've been producing tables of model statistics that correspond to a set of thresholds[3]. This helps their designers work on strategies for reporting prediction confidence in an intuitive way.
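For readers who want a feel for what a table of statistics per threshold looks like, here is a minimal sketch. It is not the code we used for [3]: the function name `threshold_stats`, the choice of columns (precision, recall, filter rate), and the sample data are all made up for illustration.

```python
def threshold_stats(scored, thresholds):
    """Return one row of statistics per threshold.

    scored: list of (probability, is_positive) pairs, e.g. the model's
    "damaging" probability and whether the edit really was damaging.
    """
    total = len(scored)
    positives = sum(1 for _, label in scored if label)
    rows = []
    for t in thresholds:
        # Everything at or above the threshold is flagged for review.
        flagged = [(p, label) for p, label in scored if p >= t]
        true_pos = sum(1 for _, label in flagged if label)
        rows.append({
            "threshold": t,
            # Share of flagged items that are truly positive.
            "precision": true_pos / len(flagged) if flagged else None,
            # Share of all positives that get flagged.
            "recall": true_pos / positives if positives else None,
            # Share of all items that reviewers can skip.
            "filter_rate": 1 - len(flagged) / total if total else None,
        })
    return rows


# Hypothetical sample data, not real ORES output.
sample = [(0.95, True), (0.80, True), (0.60, False),
          (0.40, True), (0.10, False), (0.05, False)]

for row in threshold_stats(sample, [0.25, 0.5, 0.75]):
    print(row)
```

Raising the threshold generally trades recall for precision, which is the trade-off a designer has to communicate when labeling a prediction as, say, "likely damaging" versus "very likely damaging".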
Maintenance and robustness
- We had a major downtime event that was caused by our logs being too verbose. We've recovered and turned down the log level[4].
- We made sure that halfak gets pings when ores.wikimedia.org goes down[5].
Datasets
- We created a database on Wikimedia Labs that provides access to a dataset containing a complete set of article quality predictions for English Wikipedia[6]. See our announcements[7,8,9].
1. https://phabricator.wikimedia.org/T146335 -- Implement a basic scoring strategy for PCFGs
2. https://lists.wikimedia.org/pipermail/ai/2016-September/000098.html
3. https://phabricator.wikimedia.org/T146280 -- Produce tables of stats for damaging and goodfaith models
4. https://phabricator.wikimedia.org/T146581 -- celery log level is INFO causing disruption on ORES service
5. https://phabricator.wikimedia.org/T146720 -- Ensure that halfak gets emails when ores.wikimedia.org goes down
6. https://phabricator.wikimedia.org/T106278 -- Setup a db on labsdb for article quality that is publicly accessible
7. https://phabricator.wikimedia.org/T146156 -- Announce article quality database in labsdb
8. https://lists.wikimedia.org/pipermail/ai/2016-September/000091.html
9. https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)/Archive_149#ORES_article_quality_data_as_a_database_table
Sincerely,
Aaron from the Revision Scoring team