
Design machine learning models to detect unsourced statements needing citation
Closed, Resolved (Public)

Description

Together with Besnik from L3S, create a machine learning model, trained on the data collected in T186279, that scores statements by the probability that they need a citation.

  • Compile a summary of the 'citation needed' rules
  • Derive multilingual NLP features from the rules above
  • Implement features
  • Train/test on positive/negative statements automatically extracted from articles
  • Train/test on larger scale data of different quality
  • Design an end-to-end framework based on deep learning on automatically collected data
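
To make the feature-based steps above concrete, here is a minimal sketch of scoring statements by the probability that they need a citation. It is not the model built for this task: it swaps in a plain logistic regression trained by stochastic gradient descent, and the cue words, the four features, and the toy training statements are all hypothetical, since the actual 'citation needed' rules and the T186279 data are not reproduced here.

```python
import math
import re

# Hypothetical cue words; the real 'citation needed' rules are not listed in this task.
REPORTING_CUES = {"said", "claimed", "according", "reported", "estimated"}


def extract_features(statement):
    """Map a statement to a small numeric feature vector (illustrative only)."""
    tokens = re.findall(r"\w+", statement.lower())
    return [
        1.0,                                               # bias term
        1.0 if any(t.isdigit() for t in tokens) else 0.0,  # contains a number
        1.0 if '"' in statement else 0.0,                  # contains a quotation
        1.0 if REPORTING_CUES & set(tokens) else 0.0,      # contains a reporting cue
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, statement):
    """Return the estimated probability that the statement needs a citation."""
    z = sum(w * x for w, x in zip(weights, extract_features(statement)))
    return sigmoid(z)


def train(examples, epochs=500, lr=0.5):
    """Fit logistic regression weights with per-example gradient steps."""
    weights = [0.0] * 4
    for _ in range(epochs):
        for statement, label in examples:
            feats = extract_features(statement)
            error = score(weights, statement) - label  # gradient of log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, feats)]
    return weights


# Made-up toy data: 1 = needs a citation, 0 = does not.
TRAIN = [
    ("The city had 52000 residents in 2010.", 1),
    ('Critics said the film was "a masterpiece".', 1),
    ("According to officials, the bridge is unsafe.", 1),
    ("The sky is blue.", 0),
    ("Paris is the capital of France.", 0),
    ("A triangle has three sides.", 0),
]

weights = train(TRAIN)
for text in ["The company reported 300 employees.", "Water is wet."]:
    print(f"{score(weights, text):.2f}  {text}")
```

The features here are English-only surface cues; the multilingual features and the end-to-end deep learning framework named in the list would replace `extract_features` and `train` respectively.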

Event Timeline

TODO: on Meta-Wiki, document the deep learning framework we designed