
Pick a section: Arrange missing sections in the order of their appearance
Closed, Resolved, Public

Description

Currently, in the "Pick a section" step of the Section Translation application, the missing sections are presented in a list, in the same order in which we receive them from cxserver. However, this order is arbitrary and doesn't correspond to the order in which the sections appear in the source article.

More specifically, in the example of the Moon article, the first section in the missing sections list is "Formation", while the first section in the actual article is "Name and etymology". This can be confusing to users, who would expect the sections to be displayed in order of appearance.

image.png (665×377 px, 49 KB)
image.png (853×289 px, 80 KB)
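
The requested behavior can be sketched as follows. This is only an illustration, not the merged patch: the `SectionSuggestion` shape, its field names, and the `orderMissingSections` helper are hypothetical, and the real cxserver response may look different. It assumes the client has the full list of source section titles in article order.

```typescript
// Hypothetical shape; the actual cxserver response format may differ.
interface SectionSuggestion {
  sourceSections: string[]; // all source section titles, in article order
  missingSections: string[]; // titles missing from the target, arbitrary order
}

// Return the missing sections sorted by their position in the source article.
function orderMissingSections(suggestion: SectionSuggestion): string[] {
  const position = new Map<string, number>();
  suggestion.sourceSections.forEach((title, index) => {
    // Keep the first occurrence if a title appears more than once.
    if (!position.has(title)) {
      position.set(title, index);
    }
  });

  // Titles not found in the source list are pushed to the end.
  const rank = (title: string): number =>
    position.get(title) ?? Number.MAX_SAFE_INTEGER;

  // Copy before sorting so the input array is not mutated.
  return [...suggestion.missingSections].sort((a, b) => rank(a) - rank(b));
}

// Illustrative data based on the Moon example in this task.
const moon: SectionSuggestion = {
  sourceSections: ["Name and etymology", "Formation"],
  missingSections: ["Formation", "Name and etymology"],
};
console.log(orderMissingSections(moon)); // ["Name and etymology", "Formation"]
```

Sorting on the client keeps the cxserver response untouched; the same ordering could equally be applied on the server side.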

Event Timeline

Change 721512 had a related patch set uploaded (by Nik Gkountas; author: Nik Gkountas):

[mediawiki/extensions/ContentTranslation@master] Pick a section: Display sections in the same order as in article

https://gerrit.wikimedia.org/r/721512

Change 721512 merged by jenkins-bot:

[mediawiki/extensions/ContentTranslation@master] Pick a section: Display sections in the same order as in article

https://gerrit.wikimedia.org/r/721512

@Pginer-WMF most of the sections shown to translate already exist in the Spanish article.
Is there a way to fix this, or should we ignore it?

Screenshot 2021-10-18 at 17.06.37.png (1×1 px, 220 KB)

Screenshot 2021-10-18 at 17.06.47.png (1×824 px, 134 KB)

This is sadly a known issue with section mapping. We are aware of it, and there is even a plan to improve it using the section alignment API (T270485), but that is not yet ready for our purposes.

@Pginer-WMF most of the sections shown to translate already exist in the Spanish article.
Is there a way to fix this, or should we ignore it?

Section mapping is not a perfect process, but we have plans for improving it, and specific examples are very useful for that (thanks!). I'm capturing examples of issues with section mapping in this ticket: T283817: Compile examples of section mapping issues to use as a benchmark for future improvements.
I added the example of "Physical characteristics", which should map to "Características físicas" since it is the exact translation. For other cases, such as "Observation and exploration", which maps to two different sections ("La observación lunar" and "La exploración lunar"), it is not clear what the desired mapping would be. Feel free to add or suggest specific cases that you think are relevant (for this case, and for others you may encounter in the future).

In addition to the improvements @ngkountas pointed to, we also have T276214: Improve section mapping by integrating MT, which is still incomplete. @santhosh pre-filled the database with translations for some popular section titles. But the system is not yet checking the translation live when it is not present in the database (and adding the new translation to it). I think that would solve the above-mentioned case, since the translation that Google provides for the English section ("Physical characteristics") perfectly matches the section name in Spanish ("Características físicas"). So I expect these kinds of cases to be resolved when that task is completed.
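
The lookup-then-translate idea described above could look roughly like this. It is a minimal sketch under stated assumptions, not the design of T276214: `mapSection`, `Translator`, and the in-memory `knownTranslations` map (standing in for the database) are hypothetical names, and the machine translation call is injected as a plain synchronous function, whereas a real MT request would be asynchronous.

```typescript
// Hypothetical MT call, injected so the sketch does not depend on any real service.
type Translator = (sourceTitle: string) => string;

// Compare titles case-insensitively and ignore surrounding whitespace.
function normalize(title: string): string {
  return title.trim().toLocaleLowerCase();
}

// Find the target section that corresponds to a source section title.
// knownTranslations stands in for the pre-filled translation database.
function mapSection(
  sourceTitle: string,
  targetTitles: string[],
  knownTranslations: Map<string, string>,
  translate: Translator
): string | null {
  let translated = knownTranslations.get(sourceTitle);
  if (translated === undefined) {
    // Not in the database: translate live and store the result for next time.
    translated = translate(sourceTitle);
    knownTranslations.set(sourceTitle, translated);
  }
  const wanted = normalize(translated);
  // Return the matching target title, or null if the section is missing.
  return targetTitles.find((title) => normalize(title) === wanted) ?? null;
}
```

With an empty `knownTranslations` map and a translator that returns "Características físicas" for "Physical characteristics", the function would report the Spanish section as present rather than missing. One-to-many cases such as "Observation and exploration" are not handled by an exact-title comparison like this.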

Pginer-WMF triaged this task as Medium priority. Oct 26 2021, 2:44 PM