
Search box loads wrong suggestions when CJK characters are typed in several steps
Open, Needs Triage | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Install a Chinese IME (pinyin is used below)
  • Open Chinese Wikipedia (this also happens on Commons, at least)
  • Click the search box near the logo, and type some Chinese words
    • e.g. "上海中心大厦", first type "上海" (shanghai), then "中心" (zhongxin), then "大厦" (dasha)
    • This bug only happens when the words are typed (not pasted).

What happens?:
When the Chinese IME is active, only the current candidate text is sent in /w/rest.php/v1/search/title?q=%s requests. The text before it is ignored.

For "上海", "中心", "大厦": when "中心" is typed, the suggestions are based on the word "中心" alone.
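
To make the difference concrete, here is a minimal sketch of the two request URLs for the example above. The path comes from this report; the helper name `buildSuggestionUrl` is hypothetical, and the real request may carry additional parameters.

```typescript
// Path as quoted in this report; everything else here is illustrative.
const SEARCH_PATH = "/w/rest.php/v1/search/title";

// Hypothetical helper: builds the suggestion URL for a given query string.
function buildSuggestionUrl(query: string): string {
  return `${SEARCH_PATH}?q=${encodeURIComponent(query)}`;
}

const committed = "上海"; // text already accepted from the IME
const candidate = "中心"; // text the IME is currently composing

// What the report describes: only the candidate is sent.
const observedUrl = buildSuggestionUrl(candidate);
// What would be expected: the whole content of the search box.
const expectedUrl = buildSuggestionUrl(committed + candidate);

console.log(observedUrl); // /w/rest.php/v1/search/title?q=%E4%B8%AD%E5%BF%83
console.log(expectedUrl); // /w/rest.php/v1/search/title?q=%E4%B8%8A%E6%B5%B7%E4%B8%AD%E5%BF%83
```

Comparing the `q` parameter in the browser's network panel against these two forms is a quick way to confirm whether a given browser and skin are affected.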

Both Vector 2022 and Minerva are affected. The issue mainly affects Chrome. With Firefox, only the candidate text is sent during IME composition, but the full query is correctly sent after the IME exits.

What should have happened instead?:

The full input box content should be used for search suggestions regardless of IME usage. A change in TypeaheadSearch (gerrit #1217251) might have caused this regression.
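
As a rough sketch of the expected behaviour (not the actual TypeaheadSearch code), a handler could derive the query from the field's value instead of from the event payload. The interfaces below are hypothetical stand-ins for the browser's `InputEvent` and `HTMLInputElement`, and the sketch assumes the field's value already includes the text being composed.

```typescript
// Hypothetical minimal shapes so the sketch runs outside a browser.
interface InputLike {
  value: string; // full content of the search box
}
interface InputEventLike {
  data: string | null; // text contributed by this event (the IME candidate)
  target: InputLike;
}

// Always use the whole field content as the query, never event.data.
function queryFromInputEvent(event: InputEventLike): string {
  return event.target.value;
}

// Assumed state while "中心" is being composed after "上海" was accepted.
const duringComposition: InputEventLike = {
  data: "中心",
  target: { value: "上海中心" },
};

console.log(queryFromInputEvent(duringComposition)); // 上海中心
```

Whether suggestions should also be held back until composition ends is a separate design choice that this sketch does not address.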

https://domeventviewer.com/key-event-viewer.html might be of help in visualizing the DOM events.

Software version: 1.46.0-wmf.14

Other information:

  • Browser: Chrome 145 / Windows 10
  • IME: Microsoft Pinyin

Event Timeline

TJones removed the point value 2 for this task. (Mon, Feb 9, 5:57 PM)

For non-Chinese speakers, you can see this on English Wikipedia if you enable an IME as your input method. I'm using Pinyin on a Mac.

As I type "George", the IME is trying to give me suggestions, including the Latin text "George", and autocomplete is making suggestions based on "George".

Screenshot 2026-02-09 at 1.05.29 PM.png (370×1 px, 114 KB)

When I go on to the next word, "Washington", the IME gives me suggestions, including "Washington", but autocomplete is also giving suggestions only based on the word "washington":

Screenshot 2026-02-09 at 1.05.45 PM.png (378×1 px, 118 KB)

If I select "Washington" and type another space I get the expected suggestions based on the whole query:

Screenshot 2026-02-09 at 1.05.52 PM.png (370×1 px, 96 KB)

There's no plausible way for the autocomplete backend to search on only part of the query string, so it seems that the UI is sending partial queries to the backend.

Based on a little research, IMEs generate additional JavaScript events: compositionstart, compositionupdate, and compositionend.

I found this random online tool helpful for understanding what's going on. While typing "george" (in the IME), the UI event data contains the partially complete word typed so far:

Screenshot 2026-02-09 at 1.17.36 PM.png (752×2 px, 277 KB)

My uneducated guess is that one or more of the composition... events are triggering the code to send the event data field (which only has the current word the IME is working on) instead of the full input field, but that is way outside my area of expertise.
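
The guess above can be illustrated with a small model of the event stream. The trace is hand-written to match the "George Washington" example, not captured from a browser, and `TraceEntry` is a hypothetical simplification of the DOM composition events. It shows why the first word looks fine and the second does not: the event data and the field content only diverge once earlier text has been committed.

```typescript
// Simplified, hypothetical stand-in for a composition event plus the field state.
interface TraceEntry {
  type: "compositionupdate" | "compositionend";
  data: string;       // text the IME is working on
  fieldValue: string; // full content of the search box at that moment
}

// Illustrative trace (assumed values, not a real capture).
const trace: TraceEntry[] = [
  { type: "compositionupdate", data: "george", fieldValue: "george" },
  { type: "compositionend", data: "George", fieldValue: "George" },
  { type: "compositionupdate", data: "washington", fieldValue: "George washington" },
  { type: "compositionend", data: "Washington", fieldValue: "George Washington" },
];

// Entries where a query built from `data` differs from one built from the field.
function divergentEntries(entries: TraceEntry[]): TraceEntry[] {
  return entries.filter((entry) => entry.data !== entry.fieldValue);
}

for (const entry of divergentEntries(trace)) {
  console.log(`${entry.type}: data="${entry.data}" field="${entry.fieldValue}"`);
}
```

With this trace, only the two "washington" entries are reported, which is consistent with suggestions being based on the second word alone.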

Interestingly, the Minerva skin has the same behavior, but the old Vector skin doesn't give any intermediate suggestions before you accept a word from the IME, and you have to type a space after the word finishes to get any suggestions.

I'll add Codex and OOUI to the tags. If they aren't the right people, they have a better chance of knowing who is.

Oops, T417294 should be merged into this ticket.

Keep this task as it's earlier and contains more information.

I thought the other ticket had more information on how to potentially solve the problem and had the right projects tagged since @Jdlrobson-WMF knows more about the issue. I'll update the tags here.

That task's description was written by me. I kept the project tags he set, though.

I appreciate the IME test steps you posted (for reproduction by non-Chinese users), therefore I thought this task should be kept.

Bewfip updated the task description.