
Rewrite draft topic scripts to fetch linked pages and prepare training data
Open, Low, Public

Description

The scope should be kept minimal here. We need to rewrite two pieces of the pipeline: the code that fetches all pages with embedded WikiProject templates, and the code that generates the list of revisions to train on. The previous code was difficult to run repeatably because it depended on PAWS. It also seems we were training on a random talk page revision, when we should have been using the first revision of the content page linked to the talk page.
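As a rough illustration of the intended behavior, here is a minimal sketch that runs outside PAWS using only the Python standard library and the public MediaWiki Action API. It is not the existing draft topic code. The endpoint URL, the User-Agent string, and all function names (`talk_to_subject_title`, `embedded_in_params`, `first_revision_params`, `api_get`, `first_revision_id`) are assumptions made for this example. The sketch maps a talk page title to its content page and requests that page's oldest revision, rather than a revision of the talk page itself.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia endpoint; the task does not name a wiki.
API_URL = "https://en.wikipedia.org/w/api.php"


def talk_to_subject_title(talk_title):
    """Map a talk page title to its content page title.

    'Talk:Foo' -> 'Foo'; 'Wikipedia talk:Foo' -> 'Wikipedia:Foo'.
    Titles that are not in a talk namespace are returned unchanged.
    """
    namespace, sep, rest = talk_title.partition(":")
    if not sep:
        return talk_title
    if namespace.lower() == "talk":
        return rest
    if namespace.lower().endswith(" talk"):
        return namespace[: -len(" talk")] + ":" + rest
    return talk_title


def embedded_in_params(template_title, continue_token=None):
    """Query parameters listing talk pages that transclude a template."""
    params = {
        "action": "query",
        "list": "embeddedin",
        "eititle": template_title,
        "einamespace": "1",  # namespace 1 is Talk
        "eilimit": "max",
        "format": "json",
    }
    if continue_token is not None:
        params["eicontinue"] = continue_token
    return params


def first_revision_params(title):
    """Query parameters for the oldest revision of a single page."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvlimit": "1",
        "rvdir": "newer",  # enumerate oldest first
        "rvprop": "ids|timestamp",
        "format": "json",
    }


def api_get(params):
    """Perform one GET request against the API (network access required)."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent; replace with real contact details.
    request = urllib.request.Request(
        url, headers={"User-Agent": "drafttopic-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def first_revision_id(talk_title):
    """Return the first revision ID of the content page behind a talk page."""
    subject = talk_to_subject_title(talk_title)
    data = api_get(first_revision_params(subject))
    for page in data.get("query", {}).get("pages", {}).values():
        revisions = page.get("revisions")
        if revisions:
            return revisions[0]["revid"]
    return None  # page missing or has no revisions
```

The title mapping and parameter builders are kept separate from the network call so they can be checked offline. A full run would page through `embeddedin` results with `eicontinue` for each WikiProject template and call `first_revision_id` on every talk page title returned.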

Event Timeline

Vvjjkkii renamed this task from Rewrite draft topic scripts to fetch linked pages and prepare training data to ptdaaaaaaa. Jul 1 2018, 1:12 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description. (Show Details)
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot raised the priority of this task from High to Needs Triage. Jul 3 2018, 1:57 AM