
[Bug] deepcategory created empty result set for umlauts
Closed, Resolved (Public)


While testing deepcategory more thoroughly, we found what looks like a problem related to umlauts.
See the examples:

working:
deepcategory:"Friedhof in Hamburg"

not working:
deepcategory:"Friedhof in Köln"
deepcategory:"Friedhof in Münster"
deepcategory:"Sakralbau in Gießen"

Each example category contains at most one subcategory, with no further subcategories below it, and every one of them contains articles.

This bug is blocking the inclusion of deepcategory search in advancedsearch.

Event Timeline

Lea_WMDE triaged this task as Medium priority. Apr 16 2018, 11:29 AM
Lea_WMDE created this task.
Lea_WMDE updated the task description.
Lea_WMDE raised the priority of this task from Medium to High. Apr 16 2018, 11:31 AM

It seems that other non-ASCII characters, such as Cyrillic letters, don't work either.

@Smalyshev do you know when you will have time to look into this bug and T188350#4133189?

deepcategory:"Friedhof in Hamburg" does search the category Friedhof in Hamburg but not its subcategories Kriegsgräberstätte in Hamburg and Jüdischer Friedhof in Hamburg, presumably due to the umlauts in their names.

Looks like a problem with decoding results: the query dump shows it searching for Friedhof_in_K%C3%B6ln and J%C3%BCdischer_Friedhof_in_K%C3%B6ln, which is obviously wrong, since the names are still percent-encoded instead of containing the actual characters.
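
To illustrate the decoding step, here is a minimal Python sketch of what turning those percent-encoded names back into readable titles looks like. It is not the CirrusSearch code (the extension is written in PHP, and the actual fix is in the patch below); the helper name decode_category_name is hypothetical, and the sketch assumes the names are UTF-8 percent-encoded, as the two values from the query dump suggest.

```python
from urllib.parse import unquote


def decode_category_name(raw: str) -> str:
    """Hypothetical helper: percent-decode a category name (UTF-8 assumed)."""
    # unquote() decodes %XX escapes and interprets the bytes as UTF-8 by default.
    return unquote(raw)


# The two encoded names quoted from the query dump.
encoded_names = [
    "Friedhof_in_K%C3%B6ln",
    "J%C3%BCdischer_Friedhof_in_K%C3%B6ln",
]

decoded_names = [decode_category_name(name) for name in encoded_names]
print(decoded_names)
# prints ['Friedhof_in_Köln', 'Jüdischer_Friedhof_in_Köln']
```

Names that contain only ASCII, such as Friedhof_in_Hamburg, pass through this step unchanged, which would be consistent with the Hamburg parent category working while categories with umlauts in their names do not.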

Change 427260 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/CirrusSearch@master] Decode category names received form the DB

Change 427260 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Decode category names received form the DB