
Language converter can't work on the results of Special:Search
Closed, Resolved · Public

Description

The Language Converter is not applied to search results on Chinese wikis. A query typed in zh-hans returns only zh-hans results, and a query typed in zh-hant returns only zh-hant results.
For example:
https://zh.wikipedia.org/w/index.php?title=Special%3A搜索&profile=default&search=2015年希腊纾困公投&fulltext=Search - no results
https://zh.wikipedia.org/w/index.php?title=Special%3A搜索&profile=default&search=2015年希臘紓困公投&fulltext=Search - 3 results

Event Timeline

SFSQ2012 created this task. Dec 9 2014, 8:41 AM
SFSQ2012 updated the task description.
SFSQ2012 raised the priority of this task from to Needs Triage.
SFSQ2012 changed Security from none to None.
SFSQ2012 added a subscriber: SFSQ2012.
SFSQ2012 updated the task description. Dec 9 2014, 9:08 AM

I guess you're asking: The LanguageConverter can't support search results in Chinese wikis. It shows results only in hans when typing hans, and only hant when typing hant.

Aklapper renamed this task from 中文版维基的搜索系统不支持繁简转换 to Language converter only shows results in zh-hans when typing hans, and only zh-hant when typing hant.
Aklapper triaged this task as Lowest priority.
Liuxinyu970226 added a comment. Edited Dec 10 2014, 12:26 AM

Also for zh projects, we can cc an expert: @liangent.

Liuxinyu970226 renamed this task from Language converter only shows results in zh-hans when typing hans, and only zh-hant when typing hant to Language converter can't work on Special:Search. Jan 2 2015, 2:02 AM

@Aklapper: the reporter is asking about a problem with the search page.

Liuxinyu970226 renamed this task from Language converter can't work on Special:Search to Language converter can't work on the results of Special:Search. Jan 2 2015, 2:03 AM
Aklapper changed the task status from Open to Stalled. Jan 2 2015, 1:41 PM

Sorry, but it is still unclear what the steps and settings to reproduce the problem are, what is expected, and what actually happens. Feel free to edit the initial task description and title of this ticket.
Setting to "stalled" for the time being.

Byfserag updated the task description. Jul 10 2015, 6:58 AM
Byfserag updated the task description. Jul 10 2015, 7:15 AM
Restricted Application added a project: Discovery. Jul 11 2015, 9:55 PM
Restricted Application added a project: Discovery-Search. Jul 24 2016, 12:07 AM
debt added a subscriber: debt.

We will need to investigate the Language Converter, which might take some time. Moving to the "later" bucket for now.

Liuxinyu970226 added a comment. Edited Aug 19 2016, 1:02 AM

Note that Serbian also uses the Language Converter. Might the same problem also affect Kazakh, Kurdish, Tajik, and Uzbek?

TJones added a subscriber: TJones. Mar 23 2017, 2:44 PM

This should be helped by the new language analyzer being tested in T158203.

Both 2015年希腊纾困公投 and 2015年希臘紓困公投 get 14 results, and 2015年希臘紓困公投 is the top result in both cases.

Please feel free to try out the new language analysis and compare it to the current Chinese Wikipedia. (The content is a week-old copy of Chinese Wikipedia; it has the search index and snippets, but not the articles.)

Cwek added a subscriber: Cwek. Mar 31 2017, 1:35 PM
Cwek added a subscriber: Arthur2e5. Mar 31 2017, 1:38 PM

You may be interested in this.

Cwek removed a subscriber: Cwek. Apr 1 2017, 1:56 AM

I believe this is fixed. The examples given now return the same number of results (15), though in a slightly different order.

Chinese-language wikis now convert Traditional characters to Simplified characters for indexing and searching. It isn't perfect, but it generally works well. The slight difference in the order of the results comes from slight changes in the scoring for exact matches between query and article. So, for example, a query with Traditional characters matches an article with Traditional characters a bit better, and the same holds for Simplified characters in query and article. It usually does not make a huge difference, but it is definitely enough to reorder results that already have similar scores.
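The behaviour described above can be illustrated with a toy sketch. This is not the actual search configuration: the two-character mapping table, the function names `to_simplified`, `score`, and `search`, and the score values are all hypothetical and chosen only for illustration. It shows both query variants matching the same titles once everything is normalized to Simplified, with a small bonus for an exact-script match that changes the order.

```python
# Hypothetical, tiny Traditional -> Simplified table (illustration only);
# a real converter covers thousands of characters and multi-character phrases.
T2S = {"臘": "腊", "紓": "纾"}


def to_simplified(text: str) -> str:
    """Map each Traditional character to its Simplified form, if known."""
    return "".join(T2S.get(ch, ch) for ch in text)


def score(query: str, title: str) -> float:
    """Toy score: 1.0 for a match after normalization,
    plus an assumed 0.1 bonus when the original scripts also match."""
    if to_simplified(query) not in to_simplified(title):
        return 0.0
    bonus = 0.1 if query in title else 0.0
    return 1.0 + bonus


TITLES = ["2015年希臘紓困公投", "2015年希腊纾困公投"]


def search(query: str) -> list[str]:
    """Return matching titles, best score first."""
    scored = [(score(query, title), title) for title in TITLES]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [title for s, title in ranked if s > 0]


hans_results = search("2015年希腊纾困公投")
hant_results = search("2015年希臘紓困公投")
```

In this sketch both queries return the same two titles, but each query ranks the title written in its own script first, which mirrors the "same number of results, slightly different order" observation.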

I suggest closing this ticket.

As for Serbian and other languages that use the Language Converter to change the writing system used to display articles: unfortunately, this requires a different kind of software, specifically an Elasticsearch plugin, that can do more or less the same job but do it inside the search engine. For Chinese, I found an external plugin that did the job. There may be something available for Serbian (see my soon-to-come comments in T138857).

debt closed this task as Resolved. Jun 29 2017, 6:13 PM
debt claimed this task.

Thanks for the insights, @TJones :)