
Reduce or eliminate the need for the user to touch <translate> tags and unit markers
Open, HighPublic

Description

Lots of pages on mediawiki.org have "<translate>" and "<!--T:123-->" bits strewn about, making them hard to read and edit in source, and hard to edit in Visual Editor.

Translation should not rely on making large parts of pages uneditable!

Per discussion it looks like some small tweaks to VE's treatment of the extension tag may simplify things without introducing too much breakage, until a more VE-native solution is available.

To-do: add detail bugs and replace this one.


See also: use cases for which <translate> tags are considered better than structured translation: T116235: [Epic] CentralNotice translation should move closer to MediaWiki i18n standards and the code cleaned up

Event Timeline

brion created this task.Apr 1 2016, 3:48 PM
Restricted Application added a subscriber: Aklapper.Apr 1 2016, 3:48 PM

As far as I can predict, that markup is going to remain as long as wikitext is the primary format for content.

See T55974: Create a VisualEditor plugin tool to add/edit translations and translation variables (Translate extension) for making VisualEditor support translatable pages in a user-friendly manner.

Once HTML content is better supported, we can consider using some heuristics to automatically detect translatable parts (with an option for manual user intervention where it goes wrong). A blocker that needs to be resolved for that is that the Translate extension does not currently support translation of anything but plain text (and SVG files with TranslateSVG).

brion added a comment.Apr 1 2016, 4:22 PM

As far as I can predict, that markup is going to remain as long as wikitext is the primary format for content.

That seems unfortunate, as it's very difficult to use both in wikitext editing and VE. :(

See T55974: Create a VisualEditor plugin tool to add/edit translations and translation variables (Translate extension) for making VisualEditor support translatable pages in a user-friendly manner.
Once HTML content is better supported, we can consider using some heuristics to automatically detect translatable parts (with an option for manual user intervention where it goes wrong). A blocker that needs to be resolved for that is that the Translate extension does not currently support translation of anything but plain text (and SVG files with TranslateSVG).

_nod_

One particularly bad example I found was a translate tag that spanned a section boundary, like this:

...
<translate>
<!--T:55-->
This usage is deprecated, and developers of existing extensions and skins should start [[#Migration for extension developers|migrating to the new format]].

== Migration for system administrators == <!--T:56-->

<!--T:5-->
Previously your <tt>LocalSettings.php</tt> would include something like:
</translate>
...

How it looks in VE:

It seems to me that a translation block shouldn't be able to span across multiple paragraphs, and certainly shouldn't be able to span from the end of one section across a section header into another section.

However as an editor I find I have little insight into what will break if I just rip out the misplaced tags and move them to reasonable locations around individual paragraphs.

What can I do as an editor to help, other than removing all the <translate> and <!--T:123--> bits and recommending people use a different translation tool that's more editor-friendly?

I wonder which different translation tool is more editor-friendly? Content Translation is not available on mediawiki.org yet and only supports one-time translation.

You can play with the tags, see existing documentation: https://www.mediawiki.org/wiki/Help:Extension:Translate

I think that perhaps VE could just treat those tags as plaintext for now if that would cause less problems. I do not understand why tags spanning sections would be a problem though, especially when they never appear in rendered output.

brion added a comment.Apr 1 2016, 4:38 PM

You can play with the tags, see existing documentation: https://www.mediawiki.org/wiki/Help:Extension:Translate

Thanks, will try to decipher the code...

I think that perhaps VE could just treat those tags as plaintext for now if that would cause less problems.

That would probably help a great deal.

I do not understand why tags spanning sections would be a problem though, especially when they never appear in rendered output.

It's confusing as heck when trying to edit, and can easily lead to mismatched tag pairs during page refactoring.

I do not understand why tags spanning sections would be a problem though, especially when they never appear in rendered output.

It's confusing as heck when trying to edit, and can easily lead to mismatched tag pairs during page refactoring.

Translate should complain and prevent saving if it detects mismatched translate tags.
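To make the idea of rejecting mismatched tags concrete, here is a minimal Python sketch of such a check. It is not the Translate extension's actual validation code; the function name `find_translate_tag_errors` is hypothetical, and the sketch assumes that `<translate>` tags carry no attributes, may not nest, and that `<nowiki>` sections can be ignored.

```python
import re

# Matches an opening or closing translate tag; group 1 is "/" for a closer.
TAG_RE = re.compile(r"<(/?)translate>")


def find_translate_tag_errors(wikitext):
    """Return a list of human-readable pairing problems (empty if balanced)."""
    errors = []
    open_line = None  # line number of the currently open <translate>, if any
    for lineno, line in enumerate(wikitext.splitlines(), start=1):
        for match in TAG_RE.finditer(line):
            if match.group(1) == "/":
                if open_line is None:
                    errors.append(f"line {lineno}: </translate> without opener")
                open_line = None
            else:
                if open_line is not None:
                    errors.append(
                        f"line {lineno}: nested <translate> "
                        f"(previous one opened on line {open_line})"
                    )
                open_line = lineno
    if open_line is not None:
        errors.append(f"line {open_line}: <translate> never closed")
    return errors
```

A save hook could refuse the edit whenever the returned list is non-empty and show the messages to the editor.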

brion added a comment.Apr 1 2016, 4:43 PM

@Nikerabbit I see no information on that page about the <!--T:\d+--> items. Is it safe to remove them?

Is it safe to refactor large multi-paragraph <translate>...</translate> chunks into multiple single-paragraph or sub-paragraph <translate>...</translate> chunks?

Ok... https://www.mediawiki.org/wiki/Help:Extension:Translate/Page_translation_administration#markup seems to indicate that any change whatsoever will break stuff. Which makes it very ....... very.... fragile.

greg added a subscriber: greg.Apr 1 2016, 4:48 PM
matmarex added a subscriber: matmarex.

@Nikerabbit I see no information on that page about the <!--T:\d+--> items. Is it safe to remove them?

That would lose connection to all existing translations.

Is it safe to refactor large multi-paragraph <translate>...</translate> chunks into multiple single-paragraph or sub-paragraph <translate>...</translate> chunks?

Yes to multi to single, as long as the T-comments are preserved. For sub-paragraphs you would again disconnect all existing translations.

Ok... https://www.mediawiki.org/wiki/Help:Extension:Translate/Page_translation_administration#markup seems to indicate that any change whatsoever will break stuff. Which makes it very ....... very.... fragile.

Not any change, the page should explain how it works. The stuff is split into units delimited by translate tags or empty lines. The T-comments identify the units so that they can be changed or moved without losing connection to existing translations.
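As an illustration of the splitting rule described above, the following Python sketch extracts units from a page: each `<translate>` block is split on empty lines, and the `<!--T:n-->` comment found in a chunk is taken as that unit's identifier. This is a simplified model, not the extension's parser; `extract_units` is a hypothetical name, and the sketch assumes tags without attributes and at most one marker per unit.

```python
import re
from typing import List, Optional, Tuple

TRANSLATE_RE = re.compile(r"<translate>(.*?)</translate>", re.DOTALL)
MARKER_RE = re.compile(r"<!--T:(\d+)-->")


def extract_units(wikitext: str) -> List[Tuple[Optional[str], str]]:
    """Return (unit_id, text) pairs; unit_id is None for unmarked units."""
    units = []
    for block in TRANSLATE_RE.finditer(wikitext):
        # Inside a block, units are separated by one or more empty lines.
        for chunk in re.split(r"\n\s*\n", block.group(1).strip()):
            marker = MARKER_RE.search(chunk)
            unit_id = marker.group(1) if marker else None
            text = MARKER_RE.sub("", chunk).strip()
            if text:
                units.append((unit_id, text))
    return units
```

Because the identifier travels with the chunk, moving a paragraph (marker included) to another place on the page leaves its `unit_id` unchanged, which is what keeps existing translations attached.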

brion added a comment.Apr 1 2016, 5:01 PM

Hmm, there's a lot of weird recommendations in there too, such as recommending to put multiple levels of markup together. For instance this looks so obviously wrong:

Wrong:

== <translate>Culture</translate> ==

Wrong:

<translate>== Culture ==</translate>

Suggested segmentation:

<translate>
== Culture ==

Lorem ipsum dolor.
</translate>

The first "wrong" recommendation looks clearly right.

The second looks right according to the doc next to it ("Headers can in principle be tied to the following paragraph, but it is better to have them separated.") but is labeled as wrong.

The "right" one looks clearly wrong, leaving a stray "<translate>" at the top of the previous section that won't appear within the section during section editing in the source editor. (This might contribute to the weird spanning problem, in that it encourages people to add more stuff at the end of the previous section, after the <translate> opener but before the == line.)

The suggested one is based on the practical reason that it is the only one which does not break wikitext section editing completely.

brion added a comment.Apr 1 2016, 5:26 PM

The suggested one is based on the practical reason that it is the only one which does not break wikitext section editing completely.

https://www.mediawiki.org/w/index.php?title=Testpage12354&diff=2089226&oldid=2089225 seemed to work fine using the first "Wrong" option. Can you tell me what doesn't work with this?

I marked the page for translation, now you can see it doesn't work anymore.

brion added a comment.Apr 1 2016, 5:34 PM

I marked the page for translation, now you can see it doesn't work anymore.

You broke the markup:

== <translate><!--T:3-->
Test2</translate> ==

It does, indeed, now not work. Why did you add the newline?

Note I can't even remove the incorrect newlines manually because the extension claims "Translation unit markers in unexpected position." Seems like some pretty bad breakage in the translate code:

  1. should not add newline
  2. should not require a newline

matmarex removed a subscriber: matmarex.Apr 1 2016, 5:39 PM
brion added a comment.Apr 1 2016, 5:59 PM

I'd like to apologize to @Nikerabbit, my tone's been not cool on this thread. Taking out my frustration on you is not OK.

I think we can make some improvements in the short term to get VE and Translate to play a little nicer until we have something more VE-native to migrate to... Will break out into smaller bugs, probably on the VE end. If we basically treat a <Translate> like a <div> I think it should mostly not explode... I hope. :)

brion renamed this task from <translate> extension is a usability nightmare for editing to <translate> extension usability issues for editing.Apr 1 2016, 6:01 PM
brion updated the task description.
Krenair added a subscriber: Krenair.Apr 1 2016, 6:02 PM
Nikerabbit added a comment.EditedApr 1 2016, 6:05 PM

Brion, I understand very well why you feel frustrated about the markup and it might be actually helpful that you raise awareness of this issue.

All help is much appreciated as our team is very small and trying hard to find a balance between new feature development and maintaining and supporting existing features.

I marked the page for translation, now you can see it doesn't work anymore.

To clarify, the edit you saw was done automatically by the tool available to translation admins when I registered the page for translation. The tool automatically adds the T-comments and whitespace change you can see in the diff.

I'd like to apologize to @Nikerabbit, my tone's been not cool on this thread. Taking out my frustration on you is not OK.

I don't think an apology is warranted. The Translate extension's syntax and general behavior is rage-inducing. Highlighting this frustration is completely legitimate, in my opinion.

These various usability issues contribute to why I'm so wary of seeing the Translate extension enabled on additional Wikimedia wikis. Wikitext is already scary and painful enough without this extension.

Danny_B added a subscriber: Danny_B.

I mentioned this in-person to Brion, but sharing for a wider audience:

Ultimately, I don't think a string-based translation annotation system is ever going to be fully compatible with a DOM-based editor. We can add more and more hacks to try to paper over the differences, but there will always be instances and edge cases which are not plausibly fixable.

Niklas is entirely right; we need to work at full-speed on the proper DOM-level fragment concept in MediaWiki so that we can make translation a first-tier feature of MediaWiki. In the mean-time, we're left with a very unsatisfactory situation.

I don't think an apology is warranted. The Translate extension's syntax and general behavior is rage-inducing. Highlighting this frustration is completely legitimate, in my opinion.

This contribution is not useful, Max. If you do not wish to contribute in a positive way to this discussion, I suggest you spend your time on other things.

I think that perhaps VE could just treat those tags as plaintext for now if that would cause less problems.

That would probably help a great deal.

Can a VisualEditor/Parsoid person split this actionable item to its own task, please?

I think that perhaps VE could just treat those tags as plaintext for now if that would cause less problems.

That would probably help a great deal.

Can a VisualEditor/Parsoid person split this actionable item to its own task, please?

It's not actionable because it's already the case.

brion added a comment.Apr 2 2016, 8:51 AM

I think that perhaps VE could just treat those tags as plaintext for now if that would cause less problems.

That would probably help a great deal.

Can a VisualEditor/Parsoid person split this actionable item to its own task, please?

It's not actionable because it's already the case.

Well, they're treated like extension blocks currently; you can edit the text within them (one block at a time) through a popup dialog but it's awkward. :)

I'm a bit unsure whether we need a full interface on T55974 or if we just need the <translate>...</translate> blocks to be treated as editable DOM spans rather than aliens that block editing.

Isarra added a subscriber: Isarra.Apr 3 2016, 8:00 AM
Isarra added a comment.Apr 3 2016, 8:05 AM

These various usability issues contribute to why I'm so wary of seeing the Translate extension enabled on additional Wikimedia wikis. Wikitext is already scary and painful enough without this extension.

This. I get that there are technical issues too, but it's a serious problem. Would it be possible to get some Design input here on possible other approaches?

In T130567#2201177, I raised the notion that our issues with translation extension markup may be symptomatic of a much deeper issue to resolve, where we have a lot of tradeoffs to balance in pursuit of a more ideal solution.

@Nikerabbit defends the tradeoffs in the existing solution quite well, and more to the point, it's deployed, solving a problem and hasn't killed us yet. It's quite possible to make an argument that it is working well enough for us now, and that we have bigger problems we should solve.

I hope this doesn't introduce stop energy toward good incremental improvements that make <translate> markup more robust and simultaneously make editing support smoother and easier. Is this an issue that has solutions close to the surface, or do useful changes in this area require thinking about deeper parts of the system?

In the email thread I suggested to start prototyping an alternative solution that does not require explicit <translate> tags.

From Translate's end, what is needed:

  • Array of source text units to translate
  • A method that can take the source page (or information extracted from it) and translated array, to create a translation page composed of the translations.
  • A continuity in the array keys, so that if the page is edited, the keys do not change when paragraphs are moved or changed slightly in content.

The current implementation does the above with <translate> tags and T-comments, pre-parsing the wikitext before the parser gets to it. The new system should ideally use heuristics with human augmentation. It can be based on either wikitext or Parsoid output, but ideally, at least in the beginning, the translatable parts would be converted to wikitext as the primary storage format. Support for visual translation can be added later incrementally, in my opinion.
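The three requirements listed above can be sketched in Python as follows. This is only one possible reading of them, not how Translate works today: `carry_over_keys` and `build_translation_page` are hypothetical names, fuzzy matching with `difflib` stands in for whatever heuristic is eventually chosen, the 0.8 threshold is arbitrary, and falling back to the source text for untranslated units is an assumption of the sketch.

```python
from difflib import SequenceMatcher


def carry_over_keys(old_units, new_texts, threshold=0.8):
    """Assign keys to new_texts, reusing the key of a similar old unit.

    old_units: dict mapping numeric string keys to source text.
    Returns an ordered list of (key, text) pairs.
    """
    unused = dict(old_units)
    next_key = max((int(k) for k in old_units), default=0) + 1
    result = []
    for text in new_texts:
        best_key, best_ratio = None, 0.0
        for key, old_text in unused.items():
            ratio = SequenceMatcher(None, old_text, text).ratio()
            if ratio > best_ratio:
                best_key, best_ratio = key, ratio
        if best_key is not None and best_ratio >= threshold:
            del unused[best_key]  # each old key is reused at most once
            result.append((best_key, text))
        else:
            result.append((str(next_key), text))  # genuinely new unit
            next_key += 1
    return result


def build_translation_page(units, translations):
    """Compose a translation page from ordered (key, source_text) units.

    Units without a translation fall back to the source text.
    """
    return "\n\n".join(translations.get(key, src) for key, src in units)
```

With this shape, a moved or lightly edited paragraph keeps its key, while a rewritten one gets a fresh key and starts out untranslated.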

Anyone trying to build the heuristics should look at Special:PagePreparation and the documentation at https://www.mediawiki.org/wiki/Help:Extension:Translate/Page_translation_administration. To get started, something like this should do:

  1. One section for each heading
  2. One section for each paragraph
  3. One section for each image caption
  4. One section for each list item (this would be an improvement over the current system actually)

This could then be augmented by the user in VisualEditor and wikitext by marking some elements as not translatable or translatable (e.g. making it possible to change an image name for localisation). Parsoid would likely be used to keep track of each part, and Special:PageTranslation might need additional UI to correct mappings (old to new) when heuristics fail to detect changes properly.
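A first pass at the four starting heuristics (headings, paragraphs, image captions, list items) might look like the Python sketch below. It works line by line on raw wikitext purely for illustration; a real implementation would more likely walk Parsoid output. The function name `segment`, the regular expressions, and the set of image options treated as non-captions are all assumptions of this sketch, and captions containing links are not handled.

```python
import re

HEADING_RE = re.compile(r"^(=+)\s*(.+?)\s*\1\s*$")
LIST_ITEM_RE = re.compile(r"^[*#]+\s*(.+)$")
FILE_RE = re.compile(r"^\[\[(?:File|Image):([^\]]+)\]\]$")
# Incomplete, illustrative list of image options that are not captions.
IMAGE_OPTIONS = {"thumb", "thumbnail", "frame", "frameless", "border",
                 "left", "right", "center", "none"}


def segment(wikitext):
    """Split wikitext into (kind, text) units using simple heuristics."""
    units = []
    paragraph = []  # lines of the paragraph currently being collected

    def flush():
        if paragraph:
            units.append(("paragraph", " ".join(paragraph)))
            paragraph.clear()

    for line in wikitext.splitlines():
        stripped = line.strip()
        if not stripped:
            flush()  # an empty line ends the current paragraph
            continue
        heading = HEADING_RE.match(stripped)
        image = FILE_RE.match(stripped)
        item = LIST_ITEM_RE.match(stripped)
        if heading:
            flush()
            units.append(("heading", heading.group(2)))
        elif image:
            flush()
            params = image.group(1).split("|")[1:]
            caption = params[-1].strip() if params else ""
            if (caption and caption not in IMAGE_OPTIONS
                    and not caption.endswith("px") and "=" not in caption):
                units.append(("caption", caption))
        elif item:
            flush()
            units.append(("list_item", item.group(1)))
        else:
            paragraph.append(stripped)
    flush()
    return units
```

Marking an element as not translatable would then amount to dropping or flagging the corresponding entry before keys are assigned.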

Nemo_bis updated the task description.Apr 20 2016, 6:15 AM

Given the shift in focus of this report, I propose to change the summary to "Invent automatic segmentation in translation units without <translate> tags" aka the dark magic solution.

Well, they're treated like extension blocks currently; you can edit the text within them (one block at a time) through a popup dialog but it's awkward. :)

I'd still like a separate report for this. There seems to be some low hanging fruit.

Given the shift in focus of this report, I propose to change the summary to "Invent automatic segmentation in translation units without <translate> tags" aka the dark magic solution.

That can be one of the blockers for this task. It alone is not sufficient to solve the problem this task is about. Keep this task as tracker.

Nikerabbit triaged this task as High priority.Apr 20 2016, 1:07 PM

As a workaround, wouldn't it be possible to have an option to automatically enable translation on all pages in a namespace, without the need for <translate> tags?
I know this would not be suitable for a lot of wikis, but it can help small wikis (including mine) that want to translate all of their content.

(I have no idea what is on topic in this report, so please forgive me if I'm going off topic.)

As a workaround, wouldn't it be possible to have an option to automatically enable translation on all pages in a namespace, without the need for <translate> tags?

The translate extension would still need to add unit markers.

I know this would not be suitable for a lot of wikis, but it can help small wikis (including mine) that want to translate all of their content.

We certainly need to facilitate the mass-addition of pages for translation. Our first step was with page migration tools.

On a wiki where translation is needed only into a handful of languages or fewer, I can imagine a need to reduce the work of the translation admins even at the cost of additional work for translators. I think a first step doesn't need to be especially complex: we could add a special page which mass-marks pages for translation in a given namespace, adding <languages/>\n<translate> at the beginning and </translate> at the end of each of them, with the option to confirm or not.
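The per-page transformation such a special page would apply is small. A hedged Python sketch, with the hypothetical name `wrap_for_translation` and the added assumptions that the page body goes on its own lines inside the tags and that pages which already contain translation markup are left untouched:

```python
def wrap_for_translation(wikitext):
    """Wrap a whole page in <languages/> and <translate> tags.

    Pages that already contain translation markup are returned unchanged,
    so the function is safe to run more than once.
    """
    if "<translate>" in wikitext or "<languages" in wikitext:
        return wikitext
    body = wikitext.strip("\n")
    return "<languages/>\n<translate>\n" + body + "\n</translate>\n"
```

The confirmation step mentioned above could show the admin the result of this function for each page in the namespace before anything is saved. Unit markers would still have to be added afterwards when the page is marked for translation.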

jhsoby added a subscriber: jhsoby.Apr 13 2017, 7:59 PM
Nemo_bis renamed this task from <translate> extension usability issues for editing to Reduce or eliminate the need for the user to touch <translate> tags and unit markers.Jun 24 2017, 8:55 PM
planetenxin added a comment.EditedAug 17 2018, 7:39 AM

I added this issue to MediaWiki Stakeholders' Group/TechConf Input:

https://www.mediawiki.org/wiki/MediaWiki_Stakeholders%27_Group/TechConf_Input#Improved_Translate_-_VisualEditor_integration

Please add your endorsement if you would like to make this issue more visible.