
Hebrew Chupchik/Geresh letter-agnostic ULS
Open, Needs Triage, Public

Description

Different keyboards produce different characters for the chupchik:

On most keyboards there is usually no chupchik key, so users type an apostrophe: '
On most mobile keyboards (iPhone, Android) a dedicated chupchik character is available: ׳

ULS should be agnostic to the kind of chupchik being used, so whether the user types צ׳כית or צ'כית, they should get the same results.
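To make the requirement concrete, here is a minimal sketch in Python (not ULS code). It assumes the two characters involved are U+05F3 HEBREW PUNCTUATION GERESH and the ASCII apostrophe U+0027; the function name `normalize_chupchik` is hypothetical. The Hebrew strings are written as escapes to avoid bidi rendering confusion.

```python
GERESH = "\u05F3"      # HEBREW PUNCTUATION GERESH (typical on mobile keyboards)
APOSTROPHE = "\u0027"  # ASCII apostrophe (typical on desktop keyboards)


def normalize_chupchik(text: str) -> str:
    """Fold geresh into apostrophe so both spellings compare equal."""
    return text.replace(GERESH, APOSTROPHE)


# The example word from the description, spelled both ways:
# tsadi, chupchik, kaf, yod, tav.
with_geresh = "\u05E6\u05F3\u05DB\u05D9\u05EA"
with_apostrophe = "\u05E6\u0027\u05DB\u05D9\u05EA"

# The raw strings differ, but they match after normalization.
print(with_geresh == with_apostrophe)
print(normalize_chupchik(with_geresh) == normalize_chupchik(with_apostrophe))
```

Folding towards the apostrophe is an arbitrary choice for this sketch; folding towards geresh would satisfy the same requirement.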

Event Timeline

eranroz raised the priority of this task from to Needs Triage.
eranroz updated the task description.
eranroz added subscribers: eranroz, Amire80.
Aklapper renamed this task from Chupchik-agnostic ULS to Hebrew Chupchik/Geresh letter-agnostic ULS. Oct 4 2015, 9:29 AM
Aklapper set Security to None.

I'm not sure the bug description is clear enough (even for me, as its creator), so I will try to re-describe the issue.
[Originally described on Amir's talk page on hewiki: https://he.wikipedia.org/wiki/%D7%A0%D7%95%D7%A9%D7%90:Sq12w6spwc9wlo3d ]

Some keyboards produce the geresh character by default in Hebrew, while others produce the apostrophe character.
To make linking and searching agnostic to which character is used (so users with different keyboards get the same behavior), there are several possible approaches:

  1. Client side - Fix the IME to use geresh/apostrophe. This is simple, but it will not work for most users, since they usually type with the native keyboard rather than the JS IME.
  2. Bot - For each page with a Hebrew title containing the apostrophe character, create a redirect from the same title with the geresh character instead. This can be automated with a bot, but it will create a lot of redirect pages and requires running the bot regularly.
  3. MediaWiki core / backend - (ab)use Language.php normalize/normalizeForSearch/convertForSearchResult and replace geresh with apostrophe. This means that pages containing geresh (not apostrophe) in their title will be inaccessible; however, it is an infrastructure fix that doesn't require maintenance. As far as I checked in hewiki, there are no pages containing geresh in their title, so the limitation is theoretical.
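As a rough illustration of option 2, the following Python sketch computes which redirects a bot would need to create. It is not an actual bot: the function name `redirect_pairs` is hypothetical, the title list is assumed to come from elsewhere, and "Hebrew title" is approximated here as "contains at least one letter in the range alef (U+05D0) to tav (U+05EA)".

```python
GERESH = "\u05F3"      # HEBREW PUNCTUATION GERESH
APOSTROPHE = "\u0027"  # ASCII apostrophe


def has_hebrew_letter(title: str) -> bool:
    """Assumption: a title counts as Hebrew if it has any letter alef..tav."""
    return any("\u05D0" <= ch <= "\u05EA" for ch in title)


def redirect_pairs(titles):
    """Yield (redirect_title, target_title) for Hebrew titles with an apostrophe."""
    for title in titles:
        if APOSTROPHE in title and has_hebrew_letter(title):
            yield title.replace(APOSTROPHE, GERESH), title


# Hypothetical sample: one Hebrew title with an apostrophe,
# one Latin title with apostrophes, one Hebrew title without any.
sample = ["\u05E6'\u05DB\u05D9\u05EA", "Rock 'n' Roll", "\u05E9\u05DC\u05D5\u05DD"]
pairs = list(redirect_pairs(sample))
print(len(pairs))
```

Only the first sample title produces a redirect, which shows why the number of redirect pages grows with every apostrophe-containing Hebrew title and why the bot would have to be rerun as new pages are created.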

Though 3 is the most "dangerous" (as it adds a new limitation on valid titles), I think it is the nicest solution. Any objections? Other ideas?
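The risk in option 3 could be checked before deployment with something like the sketch below. It is plain Python over a list of titles, not MediaWiki code; `audit_geresh_titles` is a hypothetical helper, and how the titles are obtained (dump, API, database) is left open.

```python
GERESH = "\u05F3"      # HEBREW PUNCTUATION GERESH
APOSTROPHE = "\u0027"  # ASCII apostrophe


def audit_geresh_titles(titles):
    """List titles that geresh-to-apostrophe folding would make unreachable.

    Returns tuples of (original_title, folded_title, folded_title_exists).
    """
    existing = set(titles)
    report = []
    for title in titles:
        if GERESH in title:
            folded = title.replace(GERESH, APOSTROPHE)
            report.append((title, folded, folded in existing))
    return report


# Hypothetical sample: the same word exists under both spellings,
# so the geresh title would be shadowed by the apostrophe title.
sample = ["\u05E6\u05F3\u05DB\u05D9\u05EA", "\u05E6'\u05DB\u05D9\u05EA"]
report = audit_geresh_titles(sample)
print(len(report))
```

An empty report on a wiki's full title list would support the claim that the limitation is only theoretical there; a non-empty one lists the pages that would need to be moved first.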