
Double-clicking a word selects several non-Latin words
Closed, Resolved, Public

Description

In most web browsers and text editors, double-clicking a word selects just that word. In the Visual Editor this works for words in the Latin script, but not for words in other scripts. I tried:

  • Russian: Мой дядя самых честных правил
  • Devanagari: कोणीही घडवू शकेल असा हा मुक्त ज्ञानकोश आहे
  • Georgian: რომლის განმარტებითაც იგი ბავშვების

In all cases double-clicking a word selected the whole sentence.

I didn't check the code, but my guess is that the word-boundary algorithm only understands the Latin script.


Version: unspecified
Severity: normal

Details

Reference
bz33127

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 22 2014, 12:02 AM)
bzimport set Reference to bz33127.
bzimport added a subscriber: Unknown Object (MLST).
Amire80 created this task. (Dec 14 2011, 7:53 PM)

Bug 33128 may be related.

The word-boundary detection is done using a regular expression. Currently it is:

/([ \-\t\r\n\f])/g

We will need this to be more sophisticated for different scripts, and also need some help from people more familiar with how word breaks can be programmatically detected in these scripts.
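
To make the idea of script-aware boundary detection concrete, here is a minimal sketch of what it could look like. It is not VisualEditor code: the helper name `wordRangeAt` and the choice of character classes are assumptions made for illustration. It treats any run of Unicode letters, combining marks, and digits as a word, which covers space-separated scripts such as Cyrillic, Devanagari, and Georgian, but not scripts written without spaces.

```javascript
// Hypothetical helper for illustration only; not taken from VisualEditor.
// A "word" here is a run of letters (\p{L}), combining marks (\p{M}) and
// digits (\p{N}). Including \p{M} keeps Devanagari vowel signs attached
// to the consonants they modify.
const WORD_RUN = /[\p{L}\p{M}\p{N}]+/gu;

// Returns the word containing the given UTF-16 offset, or null when the
// offset falls on whitespace or punctuation.
function wordRangeAt(text, offset) {
  for (const match of text.matchAll(WORD_RUN)) {
    const start = match.index;
    const end = start + match[0].length;
    if (offset >= start && offset < end) {
      return { start, end, word: match[0] };
    }
  }
  return null;
}
```

For example, `wordRangeAt("Мой дядя самых честных правил", 5)` returns `{ start: 4, end: 8, word: "дядя" }`. Joiner characters and apostrophes inside words are not handled by this sketch.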

I am curious - is it really impossible to use the host browser's word-boundary algorithm? Or better yet, to use the host browser's behavior for double-clicking a word?

Adding Santhosh, who may have advanced knowledge about word boundaries.

(In reply to comment #2)

> We will need this to be more sophisticated for different scripts, and also need
> some help from people more familiar with how word breaks can be
> programmatically detected in these scripts.

I wish this could be implemented for Chinese as well (just a wish). Few programs can do this (though some exist, even open-source ones), since Chinese has no clear word boundaries.

Fixed in the new version of VE by using the native implementation (which is much easier than re-implementing it, as discussed!).
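
The ticket does not say which native facility the fix relies on. As a rough illustration of delegating word boundaries to the platform instead of a hand-written regular expression, the sketch below uses `Intl.Segmenter`, a standard API that is newer than this ticket and so cannot be what VE used at the time. The helper name `nativeWordAt` and the `locale` parameter are assumptions.

```javascript
// Hypothetical helper for illustration only; not the code VE shipped.
// Intl.Segmenter applies the runtime's own word-break rules.
function nativeWordAt(text, offset, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  // containing() returns the segment covering the given UTF-16 offset,
  // or undefined when the offset is out of range.
  const segment = segmenter.segment(text).containing(offset);
  // isWordLike is false for whitespace and punctuation segments.
  if (!segment || !segment.isWordLike) {
    return null;
  }
  return {
    start: segment.index,
    end: segment.index + segment.segment.length,
    word: segment.segment,
  };
}
```

For example, `nativeWordAt("Мой дядя самых", 5, "ru").word` is `"дядя"`. For scripts written without spaces, such as Chinese, the result depends on the segmentation data that the runtime ships with.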

Mass-moving old VisualEditor tickets to the VE product. Search for this message to mass-delete bugmail.

Noting bugs closed in the 2012-10-15 release.