
Implement Tatar language LanguageConverter
Open, MediumPublic

Description

The tt converter classes were made from the Kazakh classes by replacing kk with tt and adding some letters.

This is code I made from the Kazakh converter by replacing kk with tt, etc.

(I made this several months ago, but have not worked on it further since then.)

I will attach 6 files: 3 of them in the messages folder and 3 in the classes folder. A readme file is also attached.

(I have also added some letters that are not in the Kazakh language.)


Version: unspecified
Severity: enhancement
URL: http://mediawiki.tmf.org.ru/wiki/%D0%91%D0%B0%D1%88_%D0%B1%D0%B8%D1%82

Details

Reference
bz25537

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:19 PM
bzimport set Reference to bz25537.
bzimport added a subscriber: Unknown Object (MLST).
qdinar created this task. Oct 16 2010, 8:42 AM

Please submit these as an SVN diff against trunk.

kaldari renamed this task from "imperfect but useful converter code for tatar language" to "imperfect but useful LanguageConverter code for tatar language". Jan 14 2015, 11:38 PM
kaldari set Security to None.
gerritbot added a subscriber: gerritbot.

Change 185090 had a related patch set uploaded (by Kaldari):
Adding LanguageConverter files for Tatar Language

https://gerrit.wikimedia.org/r/185090

Patch-For-Review

qdinar added a subscriber: qdinar. Mar 13 2015, 4:33 PM

Hi. I have made a new converter and uploaded it to Gerrit:
https://gerrit.wikimedia.org/r/#/c/164049/

3 texts that I tested the new converter with

Elitre added a subscriber: Elitre. Mar 29 2015, 7:31 PM

Change 185090 abandoned by Kaldari:
Adding LanguageConverter files for Tatar Language

Reason:
Replaced by change I18768eb1b13

https://gerrit.wikimedia.org/r/185090

new test

Change 164049 had a related patch set uploaded (by Nikerabbit):
Add Tatar LanguageConverter

https://gerrit.wikimedia.org/r/164049

@Arrbee, @Amire80, can review of this feature please be put on the Language Engineering team's workboard?

Reedy renamed this task from "imperfect but useful LanguageConverter code for tatar language" to "Implement Tatar language LanguageConverter". Fri, Nov 22, 3:32 PM
Reedy removed a subscriber: wikibugs-l-list.

Is there community consensus for this code? There were many discussions, so it must be wanted. Links to the discussions are here: https://tt.wikipedia.org/wiki/Кулланучы:Qdinar#википедиядагы_сөйләшүләр . A standalone version of this converter is referred to at https://tt.wikipedia.org/wiki/Татар_Википедиясе#TATLAT .

Direct links to the standalone version (cyr->lat and lat->cyr) applied to tt.wikipedia.org:
http://https.tt.wikipedia.org.ttcysuttlart1999.aylandirow.tmf.org.ru/wiki/Баш_бит
http://https.tt.wikipedia.org.ttlart2012ttcysu.aylandirow.tmf.org.ru/wiki/Baş_bit

I personally do not "push" this project hard, because I generally dislike how both this Latin alphabet and the Cyrillic one are designed. For example, the Cyrillic/Latin letter e is used for the "i/e" sound, while there is also a real "e" sound in words like "electron". This causes confusion with European languages and with Turkish. I am a programmer here, and the Wikipedians decided to use an authoritative alphabet, just as all of Wikipedia is built on authoritative sources, so I programmed it using governmental Latin alphabet projects.

Comment from the code; I am going to delete most of it from the code:
2017-02-18, author dinar qurbanov: By making this converter, I may look like I support this alphabet, but that is not so. *I think this alphabet has many disadvantages, and I do not want to make it popular.* I regard it as a historical museum showpiece. I think it should be OK to put it into Tatar Wikipedia, into the conversion system of MediaWiki. As far as I know, the converted pages are blocked from being indexed by search engines.

The exact version of the Latin orthography (and alphabet) was not chosen by a vote of Wikipedians, and Wikipedians have not voted to edit the rules of the Tatar Latin orthography to be used in Wikipedia, so I decided to make this exactly as prescribed by resolution #882 of 2000 of the Cabinet of Ministers of Tatarstan. I use scans published by user Kitap ( https://tt.wikipedia.org/wiki/Татарстанда_татар_телен_дәүләт_теле_буларак_куллану_кануны#Татар_теленең_латин_язулы_орфографиясенең_гамәлдән_чыккан,_хәзерге_вакытта_рәсми_булмаган_кайбер_кагыйдәләре ), but I am not sure whether they are of resolution #882 or #618. Resolution #882 of 2000 has been cancelled by Russian law and by resolution #38 of 2013 of the Cabinet of Ministers of the Republic of Tatarstan, and a new alphabet was adopted by Tatarstan's law 1-ЗРТ of 2013, but that new alphabet is even less usable: there are no rules, there is no character for palatalisation in Russian words, and the alphabet table does not show all use cases of the Cyrillic letters.

I was going to mark this script as tt-latn-2000, but I have learned from a Gerrit comment that this is not OK (the variant subtag "2000" is not registered with IANA yet, but it would have to be; see https://en.wikipedia.org/wiki/IETF_language_tag ). So maybe I will mark it as tt-latn-x-2000, where "2000" is not a variant but a private-use subtag.
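
To illustrate the difference between tt-latn-2000 and tt-latn-x-2000, here is a rough Python sketch of how the subtags of a language tag fall into categories by shape. It is only a simplified illustration, not code from the converter: the function name `classify_tag` is hypothetical, and it ignores regions, extended language subtags, extensions, and the IANA registry lookup itself.

```python
import re

def classify_tag(tag):
    """Roughly label the subtags of a language tag (simplified BCP 47 subset)."""
    parts = tag.lower().split("-")
    labels = [("language", parts[0])]
    private = False
    for sub in parts[1:]:
        if private:
            # Everything after the "x" singleton is private use.
            labels.append(("private-use", sub))
        elif sub == "x":
            private = True
        elif re.fullmatch(r"[a-z]{4}", sub):
            # Four letters: script subtag, e.g. "latn".
            labels.append(("script", sub))
        elif re.fullmatch(r"[a-z0-9]{5,8}|[0-9][a-z0-9]{3}", sub):
            # Variant shape: 5-8 alphanumerics, or a digit plus 3 alphanumerics.
            labels.append(("variant", sub))
        else:
            # Regions, extensions, etc. are not handled in this sketch.
            labels.append(("other", sub))
    return labels

print(classify_tag("tt-latn-2000"))
print(classify_tag("tt-latn-x-2000"))
```

In the first tag "2000" has the shape of a variant, so it would have to be registered to be valid; in the second it comes after x and is private use, which needs no registration.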

Renamed 2000 to 2013, because Wikipedians would not like it being named 2000: the laws of 2000 are cancelled, and there is now the law of 2013. There are several letter differences, such as ɵ -> ö, though ö was also admitted to some extent for computer usage. This converter uses ö. The 2013 law has no letter for hamza and palatalisation, and gives no rules/orthography. This converter uses the apostrophe for hamza and palatalisation, as in the 2000 law, and follows the rules/orthography given in the 2000 law.
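
As a small illustration of the ɵ -> ö difference only, Latin text written with the barred o could be brought to the ö form roughly like this. This is a sketch in Python, not the converter's actual code; the name `normalize_barred_o` is hypothetical, and the uppercase pair Ɵ -> Ö is my assumption by analogy with the lowercase one.

```python
# Hypothetical mapping: barred o of the 2000 alphabet -> o with diaeresis.
BARRED_O_TO_UMLAUT = str.maketrans({
    "\u0275": "\u00f6",  # ɵ -> ö (mentioned in the comment above)
    "\u019f": "\u00d6",  # Ɵ -> Ö (assumed uppercase counterpart)
})

def normalize_barred_o(text):
    """Replace barred o letters with ö/Ö and leave everything else as is."""
    return text.translate(BARRED_O_TO_UMLAUT)

print(normalize_barred_o("k\u0275n"))
```

The apostrophe rules for hamza and palatalisation are not covered here, because they depend on the orthography rules of the 2000 law.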