
Pre-fill frequent section titles in the section mapping database
Closed, Resolved (Public)

Description

Section translation relies on identifying equivalent sections across articles in different languages. Currently, the database used for section mapping is built from articles previously translated with Content Translation. Machine Translation (MT) could help support more mappings (T276214). A first step could be to expand the current database with MT translations of frequent section titles that it may be missing.

This ticket proposes extracting frequently used section titles from the cx corpus and using MT to translate the top 200 titles.
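
As a rough illustration of the proposal, the extraction and pre-fill steps could look something like the sketch below. It is not the code used for this task: the function names, the `existing` mapping, and the `translate` callable are all hypothetical stand-ins, and the real cx corpus format and MT client are not modelled here.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List


def top_section_titles(titles: Iterable[str], n: int = 200) -> List[str]:
    """Return the n most frequent section titles, most frequent first."""
    # Ignore empty titles and surrounding whitespace before counting.
    counts = Counter(t.strip() for t in titles if t.strip())
    return [title for title, _ in counts.most_common(n)]


def prefill_mappings(
    titles: Iterable[str],
    existing: Dict[str, str],
    translate: Callable[[str], str],
    n: int = 200,
) -> Dict[str, str]:
    """Translate frequent titles that have no entry in the existing mapping.

    `existing` stands in for the section mapping database and `translate`
    for an MT call; both are assumptions made for this sketch.
    """
    new_mappings: Dict[str, str] = {}
    for title in top_section_titles(titles, n):
        if title in existing:
            continue  # Already mapped from a previous translation.
        new_mappings[title] = translate(title)
    return new_mappings


if __name__ == "__main__":
    # Toy data; a real run would read section titles from the cx corpus.
    corpus_titles = ["History", "References", "History", "Career", "History", "References"]
    known = {"History": "Historia"}
    # Placeholder "MT" that only upper-cases the title.
    print(prefill_mappings(corpus_titles, known, lambda t: t.upper(), n=2))
```

Skipping titles that already have a mapping keeps entries derived from human translations in place, so MT output only fills gaps.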

Event Timeline

Pginer-WMF assigned this task to santhosh.
Pginer-WMF created this task.

This was completed as part of the work in the parent task. More details in T276214#7032541

Pginer-WMF triaged this task as Medium priority. May 27 2021, 1:58 PM