Improve ORES articlequality feature extraction for images
Open, Normal, Public

Description

Currently, the ORES wp10 model uses an 'image_links' feature that counts an article's images only when they use the normal [[File:Something.jpg]] markup. Images within gallery tags are not counted, and neither are lead images in infoboxes.

In both cases, counting those images would give an image count that better reflects article quality.
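
As a rough illustration of the counting described above, here is a minimal sketch that counts images from three sources: inline [[File:...]] links, lines inside gallery tags, and template parameters whose value is a bare image file name (as infoboxes typically use). This is not the ORES or revscoring implementation; the function name count_images, the regular expressions, and the list of file extensions are all assumptions made for this example, and a real feature would more likely be built on a proper wikitext parser.

```python
import re

# File extensions treated as images (an assumption for this sketch).
IMAGE_EXT = r"(?:jpe?g|png|gif|svg|tiff?|webp)"

# Inline links such as [[File:Something.jpg|thumb]] or [[Image:Something.jpg]].
FILE_LINK_RE = re.compile(r"\[\[\s*(?:File|Image)\s*:", re.IGNORECASE)

# Whole <gallery>...</gallery> blocks; group 1 is the block body.
GALLERY_RE = re.compile(
    r"<gallery\b[^>]*>(.*?)</gallery>", re.IGNORECASE | re.DOTALL
)

# A gallery entry's file name, with or without a File:/Image: prefix.
GALLERY_LINE_RE = re.compile(
    r"^(?:(?:File|Image)\s*:\s*)?[^|\[\]{}]+\." + IMAGE_EXT + r"$",
    re.IGNORECASE,
)

# Template parameters like "| image = Foo.jpg" (typical of infoboxes).
PARAM_IMAGE_RE = re.compile(
    r"\|\s*[\w \-]+?\s*=\s*(?:(?:File|Image)\s*:\s*)?"
    r"[^|\[\]{}=\n]+?\." + IMAGE_EXT + r"\s*(?=\||\}\}|$)",
    re.IGNORECASE | re.MULTILINE,
)


def count_images(wikitext):
    """Return per-source image counts for a wikitext string (hypothetical helper)."""
    # Count gallery entries: one image per line whose first field is a file name.
    gallery_count = 0
    for body in GALLERY_RE.findall(wikitext):
        for line in body.splitlines():
            name = line.split("|", 1)[0].strip()
            if name and GALLERY_LINE_RE.match(name):
                gallery_count += 1

    # Drop gallery blocks so their entries are not counted a second time.
    rest = GALLERY_RE.sub("", wikitext)

    return {
        "file_links": len(FILE_LINK_RE.findall(rest)),
        "gallery": gallery_count,
        "template_params": len(PARAM_IMAGE_RE.findall(rest)),
    }


# Made-up sample wikitext, not taken from the article linked in this task.
sample = """{{Infobox artwork
| title = Example
| image_file = Example painting.jpg
}}
Lead text. [[File:Sketch one.png|thumb|A sketch]]
<gallery>
File:Study a.jpg|First study
File:Study b.jpg|Second study
Study c.jpg
</gallery>
"""

counts = count_images(sample)
total_images = sum(counts.values())
```

The sketch keeps the three sources separate so that each could be exposed as its own feature or summed into one count. Known limitations of the regex approach: a file name passed as a named parameter inside a [[File:...]] link (for example an alt text that ends in .png) would be miscounted as a template parameter, and images added by templates without a recognizable file name are missed.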

See for example this article: https://en.wikipedia.org/wiki/The_Appearance_of_Christ_Before_the_People

ORES thinks it has one image, instead of the six it actually has: https://ores.wmflabs.org/v3/scores/enwiki/?models=wp10&revids=782180234&features

Event Timeline

Ragesoss created this task. Nov 17 2017, 5:59 PM
Restricted Application added a subscriber: Aklapper.
awight renamed this task from Improve ORES wp10 feature extraction for images to Improve ORES articlequality feature extraction for images. Sep 26 2018, 6:41 PM
Harej triaged this task as Normal priority. Apr 9 2019, 9:11 PM
Harej moved this task from Research & analysis to New development on the Scoring-platform-team board.
Harej moved this task from New development to Ready to go on the Scoring-platform-team board.