Improve MinT support for rich text
Open, High, Public

Description

The machine translation models used by MinT operate on plain text. However, Wikipedia content contains links, references, and styling such as bold or italics that we want to preserve when content is translated. As part of the Content Translation work, approaches to reapply styling after translation were developed, and these have been ported to MinT (T341478).

With the current approach, some rich text elements may be misplaced or may disappear. A reliable way to translate rich text content becomes more relevant as we consider exposing MinT to Wikipedia readers. This ticket captures the exploration of better approaches to rich text translation, and collects issues that can serve as test cases to check that support has improved.
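
To make the problem concrete, here is a minimal sketch of a strip-and-reapply strategy, written only for illustration. It is not MinT's actual implementation: `InlineMarkupExtractor`, `translate_rich_text`, and the dictionary-backed `fake_translate` are hypothetical names, and the matching rule (translate each annotated span on its own, then look for that string in the translated sentence) is an assumption chosen for simplicity. It shows both the happy path and how an annotation can silently disappear when the span translates differently in context.

```python
from html.parser import HTMLParser


class InlineMarkupExtractor(HTMLParser):
    """Collect plain text plus (start_tag_text, tag, start, end) spans."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.length = 0       # plain-text length seen so far
        self.open_tags = []   # stack of (tag, raw start tag, start offset)
        self.spans = []       # closed spans, in closing order

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() keeps attributes such as href intact.
        self.open_tags.append((tag, self.get_starttag_text(), self.length))

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1][0] == tag:
            _, raw, start = self.open_tags.pop()
            self.spans.append((raw, tag, start, self.length))

    def handle_data(self, data):
        self.text_parts.append(data)
        self.length += len(data)


def translate_rich_text(html, translate):
    """Translate plain text, then try to re-wrap each annotated span.

    Returns (translated_html, dropped_tags). A tag is dropped when the
    separately translated span text cannot be found in the translation.
    """
    extractor = InlineMarkupExtractor()
    extractor.feed(html)
    extractor.close()
    plain = "".join(extractor.text_parts)
    translated = translate(plain)
    dropped = []
    for raw, tag, start, end in extractor.spans:
        target = translate(plain[start:end])
        pos = translated.find(target)
        if pos == -1:
            dropped.append(tag)  # annotation lost: the failure mode at issue
            continue
        translated = (
            translated[:pos] + raw + target + f"</{tag}>"
            + translated[pos + len(target):]
        )
    return translated, dropped


# Hypothetical stand-in for a plain-text MT model (assumption, not a real API).
FAKE_MT = {
    "The quick fox jumps": "Le renard rapide saute",
    "quick": "rapide",
    "fox": "renard",
    "A bright idea": "Une idée brillante",
    "bright": "lumineux",  # differs from the in-context translation
}


def fake_translate(text):
    return FAKE_MT.get(text, text)


if __name__ == "__main__":
    print(translate_rich_text(
        'The <b>quick</b> <a href="/wiki/Fox">fox</a> jumps', fake_translate))
    print(translate_rich_text("A <i>bright</i> idea", fake_translate))
```

In the second call the word "bright" translates to a different string on its own than it does inside the sentence, so the italic annotation cannot be placed and is reported as dropped. A naive string search like this can also match the wrong occurrence of a repeated word, which is one way an annotation ends up misplaced.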


Related tickets:

Event Timeline

Change 987786 had a related patch set uploaded (by Brouberol; author: Btullis):

[operations/deployment-charts@master] Add helmfile deployments for Superset

https://gerrit.wikimedia.org/r/987786

Change 987786 merged by Brouberol:

[operations/deployment-charts@master] Add helmfile deployments for Superset

https://gerrit.wikimedia.org/r/987786

Change 1007307 had a related patch set uploaded (by Santhosh; author: Santhosh):

[mediawiki/services/machinetranslation@master] Improvements for rich text adaptation - references

https://gerrit.wikimedia.org/r/1007307

Change 1007568 had a related patch set uploaded (by Santhosh; author: Santhosh):

[mediawiki/services/machinetranslation@master] Improvements for rich text adaptation - repeated annotations

https://gerrit.wikimedia.org/r/1007568

Change 1007307 merged by jenkins-bot:

[mediawiki/services/machinetranslation@master] Improvements for rich text adaptation - references

https://gerrit.wikimedia.org/r/1007307

Change 1007568 merged by jenkins-bot:

[mediawiki/services/machinetranslation@master] Improvements for rich text adaptation - repeated annotations

https://gerrit.wikimedia.org/r/1007568

Change 1012988 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update MinT to 2024-03-20-072303-production

https://gerrit.wikimedia.org/r/1012988

Change 1012988 merged by jenkins-bot:

[operations/deployment-charts@master] Update MinT to 2024-03-20-072303-production

https://gerrit.wikimedia.org/r/1012988

Mentioned in SAL (#wikimedia-operations) [2024-03-20T08:59:30Z] <kart_> Update MinT to 2024-03-20-072303-production (T353791, T340956)