
VisualEditor: Cannot move cursor backwards on ligature cluster on Chrome
Closed, Resolved (Public)

Description

Browser: Chrome any version.

  1. Open VE
  2. Type or copy-paste സന്തോഷ്
  3. Move the cursor to the end of that word.
  4. Press left arrow to move the caret backwards.
  5. From the second keypress onwards, the cursor is stuck (e.g. സന്തോ|ഷ്).

Expected: the cursor keeps moving left.

Observed: the cursor is stuck.


Version: unspecified
Severity: normal

Details

Reference
bz51596

Event Timeline

Debug Information:

Grapheme breaks as per unicodejs:
0: "സ"
1: "ന്"
2: "തോ"
3: "ഷ്"

The break between items 1 and 2 is logically wrong because together they form a single cluster, ന്തോ, which is also a single syllable. You cannot place a cursor inside this cluster. Firefox allows you to place the cursor at all of these breaks anyway. In fact, Firefox even allows you to place the cursor inside a ligature programmatically. Chrome allows cursor positioning only at logical positions.
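To make the two views of the word concrete, here is a minimal Python sketch. It assumes the word is encoded as the seven code points listed in WORD. The first function uses a simplified "base character plus following combining marks" rule, which is not the full UAX #29 algorithm and is not unicodejs's actual code, but it yields the same four breaks as listed above. The second function is a hypothetical rule that joins a cluster ending in the virama (U+0D4D) to the next one, approximating the single cluster ന്തോ. Both function names are illustrative only.

```python
import unicodedata

# Assumed encoding of the test word: seven BMP code points.
WORD = "\u0d38\u0d28\u0d4d\u0d24\u0d4b\u0d37\u0d4d"
VIRAMA = "\u0d4d"  # MALAYALAM SIGN VIRAMA


def split_base_plus_marks(text):
    """Start a new cluster at every character that is not a combining mark."""
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if clusters and is_mark:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def merge_across_virama(clusters):
    """Hypothetical rule: a cluster ending in a virama joins the next cluster."""
    merged = []
    for cluster in clusters:
        if merged and merged[-1].endswith(VIRAMA):
            merged[-1] += cluster
        else:
            merged.append(cluster)
    return merged


simple = split_base_plus_marks(WORD)
print([len(c) for c in simple])                             # [1, 2, 2, 2]
print([len(c) for c in merge_across_virama(simple)])        # [1, 4, 2]
```

With the simplified split, the cluster boundaries fall at offsets 0, 1, 3, 5 and 7. With the virama merge, offset 3 disappears, which is the position the cursor cannot reach in Chrome.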

In this specific case, VE asks Chrome to place the cursor at the end of grapheme break #1 (ന്), but Chrome does not obey and places it at the end of #2 (തോ) instead. This repeats on every left-arrow keypress, so the cursor appears to be stuck at the end of തോ.
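The loop described above can be reproduced with a toy model. This is only a sketch under stated assumptions: offsets are counted in code units from the start of the word, the model's boundaries are 0, 1, 3, 5, 7, the browser accepts only 0, 1, 5, 7, and a rejected offset is moved forward to the next accepted one (as observed for offset 3). The names model_left, browser_snap and press_left are hypothetical and do not correspond to VE or Chrome APIs.

```python
from bisect import bisect_left

# Offsets where the data model believes the cursor may sit.
MODEL_BOUNDARIES = [0, 1, 3, 5, 7]
# Offsets the browser actually accepts (assumption: 3 is inside a cluster).
BROWSER_POSITIONS = [0, 1, 5, 7]


def model_left(offset):
    """Return the model boundary immediately before the given offset."""
    i = bisect_left(MODEL_BOUNDARIES, offset)
    return MODEL_BOUNDARIES[max(i - 1, 0)]


def browser_snap(offset):
    """Return the first accepted position at or after the requested offset."""
    i = bisect_left(BROWSER_POSITIONS, offset)
    return BROWSER_POSITIONS[i]


def press_left(times, start=7):
    """Simulate repeated left-arrow presses; return the cursor position after each."""
    position = start
    trail = []
    for _ in range(times):
        requested = model_left(position)    # what the model asks for
        position = browser_snap(requested)  # where the browser really puts it
        trail.append(position)
    return trail


print(press_left(4))  # [5, 5, 5, 5]
```

The first press moves the cursor from 7 to 5. Every later press requests 3, is snapped back to 5, and so makes no progress.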

From a quick look, those logical positions do not match the grapheme cluster boundary specification of Unicode (www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries).

This causes a lot of inconsistency between the offset known to VE's data model and the actual offset in the browser. It will lead to many unexpected behaviors in cursor positioning and text selection.

There's code to address this bug in the following patch, which is due to go live on mediawiki.org by 13 September 2013:

https://gerrit.wikimedia.org/r/#/c/82858/

It does not fix this completely, but it does make it possible to cursor left through the cluster with repeated consecutive keypresses.

Code to fix this is in gerrit 80689, which is currently a work in progress.

Change 80689 had a related patch set uploaded by Divec:
DONTMERGE:Revert model to use simple UTF-16 code units

https://gerrit.wikimedia.org/r/80689

Change 80689 merged by jenkins-bot:
Revert model to use simple UTF-16 code units

https://gerrit.wikimedia.org/r/80689