Sun, May 6
Is it possible to use the language converter with a specific conversion group on most messages, and correct them only when the language converter gets it wrong?
Feb 24 2018
Dec 11 2017
If I understand correctly, this API is meant to be internal to applications and not used directly by users, but it would still be used by things like mobile clients, the visual editor, content scrapers, and such to obtain information for users to view, so it might still be subject to some of the limitations I mentioned above?
Removing the Japanese Kyujitai request, since using a variant subtag instead of a script subtag might be a better idea, although using a variant subtag has problems too.
Dec 10 2017
Removing Nushu, as the use cases related to the language and script can be covered by the monolingual code mis, given the lack of a language code for Tuhua.
Dec 9 2017
As mentioned by others, it has been almost a decade since the issue was raised
Those "raising" actions are illegal, please see Bug management/Phabricator etiquette, especially:
Report status and priority fields summarize and reflect reality and do not cause it. Read about the meaning of the Priority field values and, when in doubt, do not change them, but add a comment suggesting the change and convincing reasons for it.
Sorry, a better term would be, "since the issue was submitted".
Is it possible to do the renaming for those wikis in their current state first, and then deal with whatever bugs appear after the renaming is done? As mentioned by others, it has been almost a decade since the issue was <del>raised</del>submitted, and the longer it drags on, the more legacy issues there will be to deal with (CX didn't even exist back in the day). Things like CX would break, but those seem less important.
The Accept-Language header seems to be a bad idea in situations where the language or variant is not usually selectable in browser or device settings. (For example, you can't pick anything yue on the Chrome language settings page, nor in the Microsoft Windows settings, which IE seems to read from, nor in smartphone settings, which mobile browsers and apps read from.) So there is no way for users to configure this client software to send the server an Accept-Language header for the language or variant they would like to use.
[Note: This is relevant as there are requests to implement Hans-Hant conversion for yue.wp too]
[Note 2: It can be a way to detect which variant the user initially wants, but it is probably not a good way to fix the variant selection]
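To illustrate the point in Note 2, here is a minimal sketch in which the Accept-Language header is used only as an initial hint and a saved user choice always overrides it. The function names, the `saved_choice` parameter and the list of supported variants are hypothetical and not taken from MediaWiki.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position.
        weighted.append((-quality, position, tag.strip()))
    return [tag for _, _, tag in sorted(weighted)]


def choose_variant(header, supported, saved_choice=None):
    """Pick a variant: a saved user choice wins, the header is only a hint."""
    if saved_choice in supported:
        return saved_choice
    by_lowercase = {code.lower(): code for code in supported}
    for tag in parse_accept_language(header):
        match = by_lowercase.get(tag.lower())
        if match:
            return match
    # Nothing in the header matched (e.g. yue cannot be selected in the
    # client), so fall back to the first supported variant.
    return supported[0]
```

With this shape, a user whose browser cannot send a yue tag still gets a usable default, and can then change it through an explicit setting instead of the header.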
- The linked discussion was about Hanmun = Classical Chinese documents, not documents written in Hanja-Hangul mixed script.
- The task is closed because the current wording does not seem to be a good way to make this request. I will probably post on the community discussion page once I have worded it better.
Dec 5 2017
Dec 2 2017
Nov 27 2017
Hmm, edited the task description accordingly.
Nov 26 2017
Almost all the Hani text currently being discussed and used in relation to the nan.wp project is Hant. Disregarding Hans for now and using Hani instead of Hant would probably do the job in the current setting, but what about when Hans users from mainland China start visiting and editing the site?
Nov 25 2017
Nov 23 2017
It's now working on my end.
Because the language converter would need to cater to exceptions, the better way to do this is probably to just open up a language, translate only the terms that differ, and let the other terms fall back to Tagalog, just like how zh_HK<>zh-Hant has been handled.
The existing syntax for special conversion rules is documented at https://www.mediawiki.org/wiki/Writing_systems/Syntax . Some of the concerns may already be addressed there, and there is no need to invent new syntax instead of using syntax that is already established.
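The fallback approach described above can be sketched roughly as follows. The fallback chain, the message keys and the message texts are illustrative data only, and `get_message` is a hypothetical helper, not MediaWiki's actual implementation.

```python
# Hypothetical fallback chain: each code points at the code it falls back to.
FALLBACKS = {"zh-hk": "zh-hant"}

# Only the terms that differ are translated in the more specific code.
MESSAGES = {
    "zh-hk": {"taxi": "的士"},
    "zh-hant": {"taxi": "計程車", "save": "儲存"},
}


def get_message(key, code):
    """Look up a message, walking the fallback chain until one is found."""
    seen = set()
    while code is not None and code not in seen:
        seen.add(code)  # guards against an accidental cycle in FALLBACKS
        text = MESSAGES.get(code, {}).get(key)
        if text is not None:
            return text
        code = FALLBACKS.get(code)
    return None
```

Here `get_message("taxi", "zh-hk")` uses the specific translation, while `get_message("save", "zh-hk")` falls through to the zh-hant text because zh-hk does not override it.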
- British<>American English, Portuguese variants conversion
- Different options to enable/disable the system in various way, with additional user settings that allow custom rules and different separated set of rules
- Some words on language converter in editing mode
- Sentence-based conversion tool
- Classical Chinese Kanbun conversion
- Multiple parallel conversions
Instead of "always Simplified Chinese", a more accurate description would be "always the language variant in the article source". Conversion between the Simplified and Traditional Chinese variants on Chinese Wikipedia is done by the Language Converter. The Language Converter does not work in source editing mode and does not work in Content Translation either.
Nov 17 2017
What is the rationale for macrolanguage codes not being usable to identify text?
Are you implying that the monolingual language codes I'm submitting do not represent anything useful? nan/cdo/hak-Hant/Hans are language-script combinations used to write Wikipedia articles, and vi-Hani, ko-Kore and ja-Kyujitai are used to name people and things in the respective countries. How do you write the name of "Ho Chi Minh City" in Vietnamese Han-Nom? The only place that provides this information in Wikidata for now is the Japanese alias for the entry name. How about "Kim Jong-Il" in ko-Kore? Look at the Slovak alias. Is that better than having labels for each of these script variants?
Then mon, mon is ISO 639-3
Nov 13 2017
Nov 12 2017
mvf only refers to the Mongolian spoken in the central part of Inner Mongolia, while mn-Mong is written by all mn users.
Apr 7 2017
According to some pages I found through Google, it seems that in the US only the compilation of data is protected while the data itself is not, and the creation of the database also needs to involve some creativity for the database to qualify under copyright law, while in the EU there is extra protection for the investment put into collecting, arranging and presenting data. So it seems there should be no problem under US law in most cases, although it might be better to let a legal expert answer the question.
Mar 2 2017
Is it within the scope of this task that an ordinary Wikipedia with multiple pages for a single concept, each written in a different script, cannot link them all to the same Wikidata concept entry?
Dec 21 2016
Adding the tag because there is an intention to make lzh Wikipedia text run vertically.
Oct 8 2016
Oct 5 2016
@GerardM But traditional Mongolian script is like Literary Chinese: it is universal to every language that used it as its written form, and thus it is invalid to say which language it belongs to. Just as you can say the Nihon Shoki is written in Chinese but can't say it is written in Mandarin or Hakka, the situation with traditional Mongolian script is the same. Also, it would be incorrect [despite being a convention] to call those Mongolian texts Middle/Classical Mongolian, just as you can't equate Literary Chinese with Old/Middle Chinese, since changes made to the written language over its continued use set it apart from the old language of that time.
Oct 4 2016
0. According to the "Requirements for a new language code" linked above, the WIP requirement for a new language code is a valid IETF tag not a valid ISO code
- Macrolanguages in ISO 639-3 are still individual languages in ISO 639-2, and the definition of a macrolanguage in ISO 639-3 is "clusters of closely-related language varieties that [...] can be considered distinct individual languages, yet in certain usage contexts a single language identity for all is needed". Thus a macrolanguage should be treated as a language with a valid language code. mn is a valid code and is currently used by Mongolian Wikipedia, which also contains several articles written in traditional Mongolian script.
- See BCP 47 section 2.1.1 for details about uppercasing. https://tools.ietf.org/html/bcp47
- khk, mvf, bua and xal can all be written in Latn, Cyrl and Mong.
- mn-Mong is not only used for mvf.
- BCP 47 also states that a macrolanguage code can still be used instead of the code for an encompassed language.
- You can see that mn_Mong_CN is a likely subtag in http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/likely_subtags.html
- You can see mn-Mong listed in the IANA language subtag registry http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry (listed as redundant, as it has the correct form and format defined by RFC4646 and all the subtags it uses are defined in the document. See RFC4645 for details.)
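The casing convention referred to in the BCP 47 bullet above can be sketched as follows: tags are case-insensitive, but by convention two-letter subtags after the first are uppercased and four-letter subtags are titlecased, except directly after a singleton. `format_language_tag` is a hypothetical helper, and treating the underscore in CLDR-style identifiers such as mn_Mong_CN as equivalent to a hyphen is an assumption made here for convenience. The function only adjusts case and does not validate the tag against the registry.

```python
def format_language_tag(tag):
    """Apply the conventional BCP 47 casing to a language tag."""
    # Assumption: accept CLDR-style underscores as separators too.
    subtags = tag.replace("_", "-").split("-")
    formatted = []
    for index, subtag in enumerate(subtags):
        after_singleton = index > 0 and len(subtags[index - 1]) == 1
        if index == 0 or after_singleton:
            formatted.append(subtag.lower())
        elif len(subtag) == 2:
            formatted.append(subtag.upper())       # region, e.g. CN
        elif len(subtag) == 4:
            formatted.append(subtag.capitalize())  # script, e.g. Mong
        else:
            formatted.append(subtag.lower())
    return "-".join(formatted)
```

For example, `format_language_tag("mn_Mong_CN")` and `format_language_tag("MN-mong-cn")` both give `mn-Mong-CN`.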
Jul 8 2016
@Roytam1 There is no such thing as a unified font in the world, at least as of now, and fonts are not supposed to be unified. The alternative to a region-based font would be a font designed according to a particular regional standard or according to the font developer's habits. See Unicode's FAQ about CJK for further details. I am not familiar with the server environment or the software's settings, but in a home environment, when an application uses a multi-region font without specifying a region, the result often defaults to China's standard.
May 25 2016
From what I was told, many articles on Cebuano Wikipedia, as well as on some other Wikipedias with a very high article-per-speaker ratio, were created by bots from databases. For instance [just as an example], such bots could automatically create a million articles for the 1st to the 1 millionth asteroid just by copying from a database according to a user-defined format. See https://ceb.wikipedia.org/w/index.php?limit=50&title=Espesyal%3AMga+Tampo&contribs=user&target=Lsjbot&namespace=&tagfilter=&newOnly=1&year=2016&month=-1 for an example. I don't think this should be taken into consideration when judging which languages would be useful to visitors.
Mar 26 2016
Mar 14 2016
Is T353 a subtask or a duplicate of this task?
Mar 2 2016
Ah, the official Wikipedia Android app 2.1.141-r-2016-02-10, as well as 2.0-r-2015-04-23.
Feb 23 2016
CSS3 vertical writing mode is now supported by 90%+ of browsers around the world, and by basically every browser other than Opera Mini, as per http://caniuse.com/#feat=css-writing-mode . Also note that, due to the limited support MediaWiki provides for vertical scripts, some people have already created their own non-Wikimedia MediaWiki sites using their own hacks/methods.
Aug 6 2015
Jul 17 2015
bz9123000 should be part of the list too.
Jul 15 2015
I have just come across notations like bz19986, bug #19986 or bug 19986 in some older issues, which refer to issue numbers in Bugzilla. Is it possible for Phabricator to automatically redirect all of these to their issue number in Phabricator, like T21986 in this case?
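As a rough illustration of the kind of rewriting being asked for, the sketch below maps the three notations mentioned above to Phabricator task IDs. The offset of 2000 is inferred only from the 19986 → T21986 example in this comment, and the names `BUGZILLA_OFFSET` and `rewrite_bug_references` are hypothetical; this is not how Phabricator itself handles such references.

```python
import re

# Assumption: task number = Bugzilla number + 2000, inferred from the
# example above (bug 19986 -> T21986).
BUGZILLA_OFFSET = 2000

# Matches "bz19986", "bug #19986" and "bug 19986" (case-insensitive).
BUG_REFERENCE = re.compile(r"\b(?:bz|bug\s*#?\s*)(\d+)\b", re.IGNORECASE)


def rewrite_bug_references(text):
    """Replace legacy Bugzilla references with Phabricator task IDs."""
    def to_task(match):
        return "T%d" % (int(match.group(1)) + BUGZILLA_OFFSET)
    return BUG_REFERENCE.sub(to_task, text)
```

The leading `\b` keeps words such as "debug 5" from being treated as bug references.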
Jun 5 2015
While the problem has not been resolved yet and EasyTimelines are still displayed without text, I'd like to mention that, according to http://wenq.org/wqy2/index.cgi?HanziStyles , the wqy font uses the China version of the glyphs. Once the issue is fixed, should another font with the Taiwan version of the glyphs also be installed for users browsing Wikipedia in zh-tw/hk/mo, and is it technically viable to display a different font to people requesting different editions of the page? (Actually, should I file another bug for this?)
I don't think unifont should be used instead, as according to http://wiki.debian.org.hk/w/Fonts it looks like the wqy font is an improvement over unifont in terms of Chinese support.
Feb 14 2015
Some other links on Wikipedia in general also behave this way, like links generated via Template:link on English Wikipedia.