
Add sentence segmenter feature
Closed, Resolved (Public)

Description

As the NMT models work on sentences, translating all sentences in a paragraph in parallel gives better performance. The APIs of the translation models are defined this way. CXServer currently performs this segmentation before sending the content to MinT. However, for use cases where CXServer is not needed, the segmentation should happen inside the MinT API, so that consumers of the translation API can send paragraphs directly.
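
To illustrate the requested behaviour, here is a minimal sketch of segmenting a paragraph and translating its sentences as one batch. It is not the code from the patch: `segment`, `translate_paragraph`, and the `translate_batch` callable are hypothetical names, and the punctuation-based split is a naive stand-in for a real, language-aware sentence segmenter.

```python
import re
from typing import Callable, List

# Assumption: a naive boundary rule (split on whitespace that follows ., ! or ?).
# A real segmenter needs language-specific rules for abbreviations, numbers, etc.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def segment(paragraph: str) -> List[str]:
    """Split a paragraph into sentences, dropping empty pieces."""
    return [s for s in _SENTENCE_BOUNDARY.split(paragraph.strip()) if s]


def translate_paragraph(
    paragraph: str,
    translate_batch: Callable[[List[str]], List[str]],
) -> str:
    """Segment the paragraph, translate all sentences in one batch, and rejoin.

    `translate_batch` is a hypothetical stand-in for a model call that
    accepts a list of sentences and returns their translations in order.
    """
    sentences = segment(paragraph)
    if not sentences:
        return ""
    translated = translate_batch(sentences)
    return " ".join(translated)


if __name__ == "__main__":
    # Dummy "model" that upper-cases each sentence, for demonstration only.
    def fake_model(batch: List[str]) -> List[str]:
        return [s.upper() for s in batch]

    print(translate_paragraph("Hello world. How are you? Fine!", fake_model))
```

With segmentation done inside the service like this, a caller can pass a whole paragraph and does not need CXServer in front of it to split the text first.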

Event Timeline

Change 929682 had a related patch set uploaded (by Santhosh; author: Santhosh):

[mediawiki/services/machinetranslation@master] Sentence segmenter

https://gerrit.wikimedia.org/r/929682

Change 929682 merged by jenkins-bot:

[mediawiki/services/machinetranslation@master] Sentence segmenter

https://gerrit.wikimedia.org/r/929682

Change 945036 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update MinT to 2023-08-02-142037-production

https://gerrit.wikimedia.org/r/945036

Change 945036 merged by jenkins-bot:

[operations/deployment-charts@master] Update MinT to 2023-08-02-142037-production

https://gerrit.wikimedia.org/r/945036

Mentioned in SAL (#wikimedia-operations) [2023-08-03T06:31:50Z] <kart_> Updated MinT to 2023-08-02-142037-production (T338292)