Disable MinT translation tool on meta
Open, Needs Triage, Public

Description

https://meta.wikimedia.org/wiki/Meta:Requests_for_comment/Temporarily_stop_MinT_translation

TL;DR
The quality of its translations is extremely poor, even compared with common machine translation tools. What's worse is that this tool is available to any registered user and can be used to spam translations that are difficult to remove.

We suggest temporarily disabling this tool on Meta, especially for translation.

Where are MinT suggestions shown on Meta?
The dialog to localize a message has a sidebar (on the right in the image) where suggestions are provided. MinT suggestions are listed at the end of the list, after suggestions from previous translations by other users (Translation Memory). You can see the suggestion labelled "MinT" in the screenshot below:

MinT suggestions on meta.png (424×957 px, 85 KB)

Event Timeline

Here is a case to examine:
(en) Original English string: ''[[$1|Add your name]], with $code, i.e. 4 tildes, as well as any comment you want to make.''
(ja) MinT en-ja output: "$1 の 名前 を 追加 し,$ コード,すなわち 4 の 字符,および 意見 を 提出 する こと が でき ます. " ほら.
In the output:

  • when the MinT engine fails to analyze the structure of the English text, it splits the Japanese terms (名前/追加/コード/すなわち/字符/および/意見/提出/する/こと/でき/ます) by inserting half-width spaces;
  • no Japanese punctuation is used anywhere; English half-width punctuation is applied instead;
  • wiki markup is ignored:
    • (a) the italics markup _''_ is turned into a half-width quotation mark _"_;
    • (b) the square brackets and the pipe are dropped, so the link is destroyed;
    • (c) a code such as _$code_ is changed into _$ コード_, again splitting the dollar sign from what follows, whereas $1 is output as $1;
    • (d) the letters following the dollar sign (code) are passed to translation and output as a double-byte ja word (コード);
  • it adds a ja word (ほら) that does not correspond to anything in the original.
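The defects listed above can be checked mechanically. The following is a minimal Python sketch, not part of MinT or the Translate extension: the function name `find_defects`, the regular expressions, and the defect labels are all hypothetical, and the Japanese character range is a rough approximation (hiragana, katakana, and the main CJK ideograph block only).

```python
import re

# Hypothetical patterns for illustration; not taken from MinT or MediaWiki.
PLACEHOLDER = re.compile(r"\$\w+")           # $1, $code, ...
LINK_OPEN = re.compile(r"\[\[[^\]|]+")       # start of a [[target|label]] link
JA = r"[\u3040-\u30ff\u4e00-\u9fff]"         # hiragana, katakana, common kanji
SPACED_JA = re.compile(JA + r" +" + JA)      # space between two Japanese characters
ASCII_PUNCT = re.compile(JA + r" ?[,.]")     # half-width comma or period after Japanese


def find_defects(source: str, translation: str) -> list:
    """Return labels for markup and typography defects found in a translation."""
    defects = []
    # Every placeholder in the source must survive unchanged.
    for token in PLACEHOLDER.findall(source):
        if token not in translation:
            defects.append(f"placeholder {token} lost")
    # Wiki link and italic markup must still be present.
    if LINK_OPEN.search(source) and "[[" not in translation:
        defects.append("wiki link lost")
    if "''" in source and "''" not in translation:
        defects.append("italic markup lost")
    # Typography checks specific to Japanese output.
    if SPACED_JA.search(translation):
        defects.append("spaces inserted between Japanese words")
    if ASCII_PUNCT.search(translation):
        defects.append("half-width punctuation after Japanese text")
    return defects
```

Run against the English string and the MinT output quoted above, this sketch reports all five defect labels; a translation that keeps `$1`, `$code`, the link brackets, and the italics markup, and that uses Japanese punctuation without inserted spaces, yields an empty list.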

I largely agree. I tried https://translate.wmcloud.org/ and encountered the same pattern of errors, including odd punctuation and spacing in Japanese. (All English commas and periods seem to be left untranslated and would need to be revised by a human.) Here is the result for the pre-filled text about Jazz:

ジャズは19世紀後半から20世紀初頭にかけて,ニューオーリンズ州,ルイジアナ州,アメリカ合衆国のアフリカ系アメリカ人のコミュニティから生まれ,ブルースとラグタイムの根源を 持っている音楽ジャンルです. 1920年代から,それは伝統的な音楽と人気音楽における主要な音楽表現として認められ,アフリカ系アメリカ人とヨーロッパ系アメリカ人音楽の共同絆によって結びついています. ジャズはスイングとブルーノート,複雑なコード,コール&レスポンスのボカル,ポリリズムと即興で特徴づけられています. ジャズは西アフリカの文化や音楽表現と アフリカ系アメリカ人音楽伝統に根ざしています.

Lemonaka renamed this task from "Disable MinT translation tool for en-ja on meta" to "Disable MinT translation tool on meta". Oct 9 2023, 11:19 PM

There are further problems when translating into other languages, so we now request that this translation tool be disabled, at least on Meta.

Following a message to the Commons mailing list (details at the end of this post; subscribers/translators at commons-l-owner@lists.wikimedia.org), the page "Commons:WMF support for Commons" is offered for translation, and well-intentioned users who are not translators could be led into producing mistakes via MinT.

Case: language confusion, together with the other common errors.

Common errors:

  • wiki markup is stripped, as in the output "$ 1" (a half-width space is inserted after the dollar sign);
  • punctuation appears in Latin script, not in double-byte ja symbols.

Peculiar errors:

  • non-ja characters are inserted (such as লোড, set in bold in the original post): Google Translate says it means 負荷 (ja), a term that occurs neither in the original text nor in the ja target.

MinT output:

  • $ 1 アップলোড ウィザード 改善: アップロード ウィザードのユーザー体験を向上させ,デザインの改善に焦点を当て,削除要求につながるメディアアップロードのリスクを最小限に抑えるプロジェクト.

Desirable output by human:

  • <nowiki> [[</nowiki>Commons:WMF_support_for_Commons/Upload_Wizard_Improvements|アップロード・ウィザード改善<nowiki>]]</nowiki>:アップロード・ウィザードのユーザー体験向上を目指すプロジェクトで、削除提案のリスクを招くメディアのアップロードを最小限に抑えるように設計改善に注力。
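Stray characters from another script, like those in the MinT output above, can be flagged with the Python standard library. This is only a sketch under assumptions: the function name `unexpected_letters` and the list of allowed Unicode name prefixes are hypothetical choices for a Japanese target, and ASCII text is deliberately ignored so that placeholders and Latin product names do not raise alarms.

```python
import unicodedata

# Assumed set of Unicode name prefixes acceptable in Japanese text.
ALLOWED_PREFIXES = ("CJK", "HIRAGANA", "KATAKANA", "IDEOGRAPHIC",
                    "FULLWIDTH", "HALFWIDTH")


def unexpected_letters(text):
    """Return (character, Unicode name) pairs for letters outside the allowed scripts."""
    found = []
    for ch in text:
        if ch.isascii():
            continue  # placeholders, digits, Latin words
        is_mark = unicodedata.category(ch).startswith("M")  # e.g. vowel signs
        if not (ch.isalpha() or is_mark):
            continue  # punctuation and symbols are not checked here
        name = unicodedata.name(ch, "")
        if not name.startswith(ALLOWED_PREFIXES):
            found.append((ch, name))
    return found
```

Applied to the MinT output line above, the function returns the three inserted characters, each with a Unicode name beginning with BENGALI, while the human translation returns an empty list.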

mailing list details: (subscription)

  • Date: Tue, 10 Oct 2023 11:34:00 +0200
  • From: "Luca Martinelli [Sannita@WMF]" <sannita-ctr@wikimedia.org>
  • Subject: [Commons-l] Your feedback needed: Upcoming design improvements to UploadWizard
  • To: commons-l@lists.wikimedia.org
  • Message-ID: <CAF0svn2cZ8ms7nsg+L1c39H+7B_UYvo-s-Q2L9bd+63P+weSTg@mail.gmail.com>

Pginer-WMF subscribed.

For context, I added to the description a screenshot illustrating where MinT suggestions are shown.

Can the MinT suggestion be converted into something like a beta feature, in which MinT results are shown only to users who have explicitly agreed to test it?

There's another interesting bug.

The sentence to be translated is:

[[Affiliations_Committee/Resolutions/Recognition_of_Wikimedians_of_Japan_User_Group|Affiliations Committee/Resolutions/Recognition of Wikimedians of Japan User Group]]

The translation returned by MinT is:

The following pages link to Wikipedia:

In fact, this sentence did not need any translation, since it is just English, not Japanese.

I don't know why MinT gives us such a result.

I've reviewed the problem again and found that even this translation is completely wrong.
The translation given by MinT means "Meta is what?"
However, the text we need to translate is "What Meta is not".