
Non-finalised input is added to the document when using the Anthy Japanese IME with VisualEditor
Closed, Declined | Public | 8 Estimated Story Points

Description

Non-finalised input is being added to the document when typing in Japanese using the Anthy IME in VisualEditor. This is causing two problems:

  1. You need to click undo many times to undo the input of one word, when instead you should be able to undo the input with one click.
  2. If you click somewhere else in the document before finalising the input, the unfinalised version becomes finalised. Instead, it should be erased and not appear in the undo history.

Please see this screencast for demonstrations of both of these issues. (Choose the original file rather than the preview, as the preview has synchronisation problems.)

My setup:
Ubuntu Linux 15.04
Firefox 39.0.3
IBus-Anthy 1.5.6
Anthy input settings: default (input mode "Hiragana", typing method "Romaji", conversion mode "Multiple segment")
The system language is set to English (US)
VisualEditor tested at https://wikimedia.github.io/VisualEditor/demos/ve/desktop-dist.html#!pages/simple.html

Event Timeline

MrStradivarius raised the priority of this task from to Needs Triage.
MrStradivarius updated the task description.
MrStradivarius added a project: VisualEditor.
MrStradivarius subscribed.

More about issue #2: I've noticed that this isn't the same across all Ubuntu apps. If you click somewhere else before finalising some input in LibreOffice and in the Ubuntu system menus, your input will be erased. However, if you do the same thing in the Firefox search bar or address bar, the unfinalised input will be finalised (the same behaviour as VisualEditor has currently). I'm guessing that LibreOffice and Ubuntu are correct here, but I'm not 100% sure.

Jdforrester-WMF edited a custom field.
Jdforrester-WMF moved this task from To Triage to TR3: Language support on the VisualEditor board.
Jdforrester-WMF added a subscriber: dchan.

I agree fixing 1 and 2 would be great. However, I think it will be difficult or maybe even impossible, because from JavaScript:

  1. There is no way to detect which IME is in use
  2. Candidates appear in the DOM and there is no way to distinguish them from committed text
  3. There is no reliable (cross-IME) event pattern that signals a commit (in particular, compositionend has no consistent meaning other than "something's happening").
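To make the third point concrete, here is a minimal sketch of the naive heuristic one might try: treat the text reported by compositionend as the commit and ignore every intermediate candidate. The `LoggedEvent` type, the `committedText` function and the sample trace are hypothetical and not part of VisualEditor. The trace is an idealised one from a well-behaved IME, which is exactly the assumption that does not hold across IMEs.

```typescript
// Hypothetical record of a DOM event logged from an editable surface.
interface LoggedEvent {
  type: "compositionstart" | "compositionupdate" | "compositionend" | "input";
  data: string; // text reported by the event (candidate or committed)
}

// Naive heuristic: whatever compositionend reports is the committed text;
// candidates seen during the composition are ignored.
// ASSUMPTION: the IME fires compositionend exactly once per commit.
function committedText(events: LoggedEvent[]): string[] {
  const commits: string[] = [];
  let composing = false;
  for (const ev of events) {
    if (ev.type === "compositionstart") {
      composing = true;
    } else if (ev.type === "compositionend") {
      // An empty string is treated as an abandoned composition.
      if (composing && ev.data !== "") commits.push(ev.data);
      composing = false;
    } else if (ev.type === "input" && !composing) {
      // Plain typing outside any composition counts as committed.
      commits.push(ev.data);
    }
  }
  return commits;
}

// Idealised trace: typing "nihongo" and converting it to 日本語.
const trace: LoggedEvent[] = [
  { type: "compositionstart", data: "" },
  { type: "compositionupdate", data: "に" },
  { type: "compositionupdate", data: "にほんご" },
  { type: "compositionupdate", data: "日本語" },
  { type: "compositionend", data: "日本語" },
];

console.log(committedText(trace)); // [ '日本語' ]
```

An IME that fires compositionend at other moments (for example between segments, or when focus moves) would make this function report candidates as commits, which is the failure described in this task.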

One hole through which we might conceivably escape this prison is to allow browser native undo in certain circumstances. (However, browser native undo can do crazy things at other times, e.g. restoring link text as blue underlined non-link text, so we'd have to be very careful.)

At the moment our undo grouping is completely time based. Other editors appear to be at least partially based on wordbreaks, e.g. undo deletes words at a time. That could help with 1.

I agree that 2 sounds basically impossible with the information we get from the browser.
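As a rough illustration of why purely time-based grouping produces many undo steps for a single IME word, here is a minimal sketch. The `Edit` type, the `groupByTime` function and the 500 ms threshold are assumptions made for the example; they are not VisualEditor's actual code or values.

```typescript
// Hypothetical shape of one observed document change.
interface Edit {
  timeMs: number; // when the change was observed
  text: string;   // text of the word in progress after the change
}

// Start a new undo step whenever the gap since the previous edit exceeds maxGapMs.
function groupByTime(edits: Edit[], maxGapMs: number): Edit[][] {
  const groups: Edit[][] = [];
  for (const edit of edits) {
    const last = groups[groups.length - 1];
    if (last && edit.timeMs - last[last.length - 1].timeMs <= maxGapMs) {
      last.push(edit);
    } else {
      groups.push([edit]);
    }
  }
  return groups;
}

// One Japanese word, entered with pauses while typing and choosing a candidate.
const edits: Edit[] = [
  { timeMs: 0, text: "に" },
  { timeMs: 400, text: "にほ" },
  { timeMs: 2000, text: "にほんご" },
  { timeMs: 4500, text: "日本語" },
];

console.log(groupByTime(edits, 500).length); // 3
```

Under these assumptions the single word ends up as three undo steps, because the pauses that are natural during candidate selection look the same as pauses between unrelated edits.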

> At the moment our undo grouping is completely time based. Other editors appear to be at least partially based on wordbreaks, e.g. undo deletes words at a time. That could help with 1.

That would be good if wordbreaks were easy to define for Japanese, but unfortunately they aren't. (That's T101917.)

> I agree that 2 sounds basically impossible with the information we get from the browser.

That's not such a big problem; #1 is the more important issue from a usability perspective. #1 is not a showstopper either, but having it fixed would definitely save users some frustration.

Jdforrester-WMF claimed this task.
Jdforrester-WMF subscribed.

Regretfully, I'm going to decline this one, as I don't think there's anything we can rationally do without breaking other language support. Merging transactions after the fact to fix #1 will fundamentally break our aspirations of providing real-time collaborative editing, and #2 is, as said, impossible.

In the spirit of #1, but in fact going the other way (more transactions, not fewer), I've created T120855: Consider splitting observed transactions on word boundaries.