
Add Sylheti Nagari characterset to MediaWiki's special character list as used in WikiEditor
Open, Needs Triage, Public

Description

Bangla Wikisource is receiving some Bangla-language books in Sylheti Nagari script from the British Library. If the Sylheti Nagari character set were made available in WikiEditor 2010, the community could transcribe and proofread these books more easily than it can now.

The recently added Sylheti Wikipedia could make use of it too.

Event Timeline

Change #1271800 had a related patch set uploaded (by Bodhisattwa; author: Bodhisattwa):

[mediawiki/extensions/WikiEditor@master] Add Sylheti Nagari characterset to MediaWiki's special character list as used in WikiEditor

https://gerrit.wikimedia.org/r/1271800

Change #1271803 had a related patch set uploaded (by Bodhisattwa; author: Bodhisattwa):

[mediawiki/core@master] Add Sylheti Nagari characterset to MediaWiki's special character list as used in WikiEditor

https://gerrit.wikimedia.org/r/1271803

I understand the desire to add this.

But we are relatively conservative with additions to this logic. We cannot send all of Unicode on every single request to the edit page (and effectively twice); that doesn't make sense.

To me, it seems this would be used very selectively, only on very specific wikis. And I don't think this script has enough day-to-day use outside those wikis to burden all editors of all wikis (Wikimedia and third parties) with this extra category and the extra bytes of page load that go with it (or with the door it would open for other less-used scripts).

Maybe it is better to add it to specific wikis with gadget overrides? Or simply use the “often used” override on each of those wikis, which is designed to be wiki-specific.

> To me, it seems this would be used very selectively, only on very specific wikis. And I don't think this script has enough day-to-day use outside those wikis to burden all editors of all wikis (Wikimedia and third parties) with this extra category and the extra bytes of page load that go with it (or with the door it would open for other less-used scripts).

I fully understand your concerns and your reasoning. It is true that this script would currently serve only two wikis, so there is little point in adding it for other wikis, and I am totally fine with withdrawing this ticket. However, I am not the first to open the door to less-used scripts; that was already done before. For example, see Canadian aboriginals, Runes, etc. in the character sets, which are almost never used anywhere, serve very specific purposes, or are perhaps used on one or two wikis. To stay consistent with the discussion in this ticket, we need to assess the utility of those scripts and remove them too, if needed.

Having said that, personally, I see value in adding these under-represented scripts to every wiki. Their presence can help them gain more visibility and usage. Most of the time, the communities that use them have no technical infrastructure, within Wikimedia platforms or anywhere else, to support them. A few bytes of code can give them some hope. Coming from such a language community, which was digitally under-represented 15-20 years ago, I understand the reality of the struggle these communities face to get even minimal support for language technology.

> Maybe it is better to add it to specific wikis with gadget overrides? Or simply use the “often used” override on each of those wikis, which is designed to be wiki-specific.

I have added the character set to a gadget.
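For readers who want to see what a per-wiki gadget of this kind might contain, here is a minimal sketch. It is not the gadget referred to above. The names `CharacterPage`, `charactersInRange`, and `buildSylhetiNagariPage` are hypothetical, and the code point range U+A800 to U+A82C is an assumption about which code points in the Unicode Syloti Nagri block are assigned. The object shape follows the `characters` page layout described in the WikiEditor toolbar customization documentation.

```typescript
// Shape of a character page as used by WikiEditor's "characters" section.
// The interface name is hypothetical; the field names follow the
// toolbar customization documentation.
interface CharacterPage {
  layout: "characters";
  label: string;
  characters: string[];
}

// Build a list of single-character strings for an inclusive code point range.
// String.fromCharCode is sufficient here because the range is in the BMP.
function charactersInRange(first: number, last: number): string[] {
  const chars: string[] = [];
  for (let cp = first; cp <= last; cp++) {
    chars.push(String.fromCharCode(cp));
  }
  return chars;
}

// Assumption: assigned Syloti Nagri code points run from U+A800 to U+A82C.
const SYLOTI_NAGRI_FIRST = 0xa800;
const SYLOTI_NAGRI_LAST = 0xa82c;

function buildSylhetiNagariPage(): CharacterPage {
  return {
    layout: "characters",
    label: "Sylheti Nagari",
    characters: charactersInRange(SYLOTI_NAGRI_FIRST, SYLOTI_NAGRI_LAST),
  };
}

// In a gadget, the page would then be registered roughly like this
// (not executed here, since it needs the MediaWiki runtime):
//
//   $textarea.wikiEditor("addToToolbar", {
//     section: "characters",
//     pages: { sylhetinagari: buildSylhetiNagariPage() },
//   });
```

Because this runs only on wikis that enable the gadget, the extra characters are not shipped to editors of other wikis.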

> Having said that, personally, I see value in adding these under-represented scripts to every wiki. Their presence can help them gain more visibility and usage. Most of the time, the communities that use them have no technical infrastructure, within Wikimedia platforms or anywhere else, to support them. A few bytes of code can give them some hope. Coming from such a language community, which was digitally under-represented 15-20 years ago, I understand the reality of the struggle these communities face to get even minimal support for language technology.

I don't object to that in principle, but I do object to it under the current technical architecture that provides this functionality. It would be different if this were all lazy-loaded and dynamic, and strongly prioritized what people use locally over everything that is available throughout the world. But that is not the architecture that is in place right now.
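To make the "lazy-loaded, local first" idea concrete, here is a rough sketch of what such a design could look like. It does not describe existing MediaWiki or WikiEditor code. The class `LazyCharacterGroups` and all of its members are hypothetical. Each group's character list is built only when it is first requested, and groups marked as local to the wiki are listed first.

```typescript
// A loader produces the characters for one group on demand.
type Loader = () => string[];

// Hypothetical registry: nothing is built until a group is actually opened.
class LazyCharacterGroups {
  private loaders: Record<string, Loader> = {};
  private cache: Record<string, string[]> = {};
  // Counts how many times a loader actually ran (for illustration).
  loadCount = 0;

  // localGroups: names of groups this wiki considers its own, in display order.
  constructor(private localGroups: string[] = []) {}

  register(name: string, loader: Loader): void {
    this.loaders[name] = loader;
  }

  private has(table: object, name: string): boolean {
    return Object.prototype.hasOwnProperty.call(table, name);
  }

  // Group names with locally relevant groups first, then everything else.
  listGroups(): string[] {
    const local = this.localGroups.filter((n) => this.has(this.loaders, n));
    const rest = Object.keys(this.loaders).filter((n) => local.indexOf(n) === -1);
    return local.concat(rest);
  }

  // Build the group on first access and reuse the cached result afterwards.
  get(name: string): string[] {
    if (this.has(this.cache, name)) {
      return this.cache[name];
    }
    if (!this.has(this.loaders, name)) {
      throw new Error(`Unknown character group: ${name}`);
    }
    this.loadCount++;
    const chars = this.loaders[name]();
    this.cache[name] = chars;
    return chars;
  }
}

// Usage: a wiki that treats Sylheti Nagari as its local script.
const groups = new LazyCharacterGroups(["sylheti-nagari"]);
groups.register("runes", () => ["\u16A0", "\u16A2", "\u16A6"]);
groups.register("sylheti-nagari", () => ["\uA800", "\uA801", "\uA803"]);
```

In a real deployment the loaders would fetch data over the network rather than return inline arrays; the synchronous form is used here only to keep the sketch short.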