Allow TextCat to use multiple language model directories
Closed, Resolved · Public

Description

This lets us use wiki-text-based models and query-text-based models together without having to put them in one directory, which would require duplicating model files and would obscure their provenance. The change should generalize to any number of directories. The expected outcome is improved recall, and possibly a boost to precision, from identifying some languages for which we have no query-text-based models but for which we have, or can easily generate, wiki-text-based models.

Update Perl and PHP versions of TextCat.
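
As a rough illustration of the idea (not the actual Perl or PHP change), the directory lookup could be sketched in Python as below. The function name `collect_models`, the `.lm` file suffix, and the rule that the first directory listed wins when a language appears in more than one directory are all assumptions made for this sketch.

```python
from pathlib import Path


def collect_models(model_dirs, suffix=".lm"):
    """Map language code -> model file path, searching directories in order.

    Assumption for this sketch: if the same language has a model in more
    than one directory, the directory listed first takes precedence.
    """
    models = {}
    for directory in model_dirs:
        # Sort so the result does not depend on filesystem listing order.
        for path in sorted(Path(directory).glob(f"*{suffix}")):
            # path.stem is the file name without the suffix, e.g. "en".
            # setdefault keeps the entry from the earlier directory.
            models.setdefault(path.stem, path)
    return models
```

With hypothetical directories `LM-query` (query-text-based) and `LM` (wiki-text-based), calling `collect_models(["LM-query", "LM"])` would prefer a query-based model where one exists and fall back to the wiki-text-based model for the remaining languages, with each file staying in its original directory.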

Event Timeline

TJones created this task. Oct 27 2016, 3:47 PM
Restricted Application added a project: Discovery-Search. Oct 27 2016, 3:47 PM
Restricted Application added a subscriber: Aklapper.
debt moved this task from needs triage to Up Next on the Discovery-Search board. Oct 27 2016, 8:38 PM

Change 320852 had a related patch set uploaded (by Tjones):
Allow TextCat to use multiple language model directories

https://gerrit.wikimedia.org/r/320852

Change 320852 merged by jenkins-bot:
Allow TextCat to use multiple language model directories

https://gerrit.wikimedia.org/r/320852

Deskana closed this task as Resolved. Dec 8 2016, 6:52 PM