Render interlanguage links in SkinTemplate.php sidebar ucfirst per context language rules
Closed, ResolvedPublic

Description

The sidebar links are in sentence case, so let's use ucfirst() of the context Language object.


Version: 1.20.x
Severity: normal

bzimport added a subscriber: Unknown Object (MLST).
bzimport set Reference to bz37705.
Krinkle created this task.Via LegacyJun 18 2012, 9:17 PM
SPQRobin added a comment.Via ConduitJun 18 2012, 9:34 PM

Done in https://gerrit.wikimedia.org/r/#/c/11949/ though I used wgContLang->ucfirst()

(Also removing "regression", since it was a change I was aware of and not necessarily a regression.)
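
For illustration, a minimal sketch of what a multibyte-aware ucfirst does. The helper name ucfirstMultibyte is hypothetical and only stands in for the Language object's ucfirst() mentioned above; it assumes UTF-8 input and ignores any language-specific casing rules. PHP's built-in ucfirst() works on single bytes, so it would leave a first letter such as "б" or "ž" untouched.

```php
<?php
// Hypothetical stand-in for a Language object's ucfirst(); not MediaWiki code.
// Assumes UTF-8 input and applies no language-specific casing rules.
function ucfirstMultibyte(string $text): string {
    if ($text === '') {
        return $text;
    }
    // Split on the first character (not the first byte).
    $first = mb_substr($text, 0, 1, 'UTF-8');
    $rest  = mb_substr($text, 1, null, 'UTF-8');
    return mb_strtoupper($first, 'UTF-8') . $rest;
}

// Language names as they might appear in the sidebar.
foreach (['français', 'polski', 'беларуская', 'English'] as $name) {
    echo ucfirstMultibyte($name), "\n";
}
```

Only the first character is changed; the rest of the name is passed through as-is, so names that already start with a capital are unaffected.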

siebrand added a comment.Via ConduitJun 18 2012, 9:50 PM

I don't th

siebrand added a comment.Via ConduitJun 18 2012, 9:53 PM

I don't think this is a valid request. An autonym is an autonym. Making the first letter a capital that should be lower case makes it incorrect. So it's either left alone, or a translation is provided for the language name through, for example, the CLDR extension.

IIRC there is an open issue for providing localized names for interlanguage links.

Nikerabbit added a comment.Via ConduitJun 19 2012, 6:22 AM

How about :first-letter {text-transform: capitalize} ?

Krinkle added a comment.Via ConduitJun 19 2012, 9:19 PM

(In reply to comment #3)

I don't think this is a valid request. An autonym is an autonym. Making the
first letter a capital that should be lower case makes it incorrect.

@siebrand: I disagree. The language name value is set correctly (previously they all started with a capital, and recently the languageNames registry was fixed so that languages that write their own name in lower case have it in lower case). So when the language name is embedded into a message it will now start with lowercase or uppercase as appropriate.

But just because the default display should be lower case doesn't mean it should always be lower case. For instance, in French the language name is written in lower case by default. But I'm pretty sure that if a newspaper has a headline, paragraph or sentence that starts with the language name, it has the first character in uppercase (e.g. "French foo does bar"), because that context requires it. Now list items may not be sentences, but they are still a certain context.

The language name doesn't change significantly just because it starts with a lower case letter; let's not create non-issues.

@Niklas: That could be an alternative solution, it depends.

  • If the context in this list grammatically requires uppercase, then it should be done on the server side regardless of the currently selected skin and/or CSS.
  • If the grammatical context does not require it, then it could be output as-is on the server side, and the style applied in the CSS of one or more skins per design choice (just like a few years ago with the German words that should be lower case: they were output correctly in lower case, some skins enforced all uppercase or all lowercase, and others left them as-is).

Hugo.arg added a comment.Via ConduitJun 21 2012, 7:46 AM

I think you made a problem out of nowhere. Who was that "language expert" who decided which link should start with a capital letter and which with lowercase? Look at "беларуская" and "Беларуская (тарашкевіца)". The same language, the same grammar, only different writing forms. Why are they in different case? Or "lietuvių" and "Žemaitėška". Samogitian language follow all syntax rules of Lithuanian why should it be from capital letter? "SiSwati" then is total nonsense: "si" is a prefix and it should be "siSwati". It seems that all these "unknown" languages which have small Wikipedias were left as they were at random. I strongly suggest returning to the normal order with capital letters, as the sidebar now looks like a mess and it's more difficult to find any link there when your eyes must switch between capital and lowercase letters again and again. That was a poor attempt at some kind of internationalization where it was not necessary at all. As I said before, now you have problems for nothing from nowhere. Why to change things who are already working well?

Nemo_bis added a comment.Via ConduitJun 21 2012, 11:48 AM

(In reply to comment #6)

Why to change
things who are already working well?

Hugo, I'm sorry but I have to say that your comment is entirely unhelpful (bordering on personal attacks) and doesn't belong on this bug.
The issue is very clear and is being worked on; please don't mess up this bug or you'll make it hard to resolve. General discussions on the capitalization feature should be on wikitech-l, and individual reports of errors should go in separate bugs or through the other channels suggested at https://translatewiki.net/wiki/Language_support_team. Thanks.

Nouill added a comment.Via ConduitJun 21 2012, 5:24 PM

The change is unaesthetic, useless and non-consensual.
The rule that puts some lowercase and some uppercase initial letters in one list, depending on the typographical rules of each native language, is very strange. I think that in many languages an item in a list that is not a sentence almost always starts with an uppercase letter (at least in French). So in fact the change doesn't internationalize the interface.

saper added a comment.Via ConduitJun 21 2012, 7:17 PM

Speaking for myself only and not for the community of any language, seeing "Polski" on the sidebar always reminds me that somebody invented this from an English-centric point of view (it's just like saying "English", just translated). In reality we don't speak this way, although now this has unfortunately become commonplace on the Internet.

I stumbled upon this problem one day on Commons, which has a JavaScript feature to greet users in their language and offer a switch. It was so clunky that I proposed this change:

https://commons.wikimedia.org/w/index.php?title=MediaWiki:AnonymousI18N.js&diff=54275686&oldid=54256563

Of course, we can't fit a sentence like "Ten artykuł po polsku" ("This article in Polish") but we can probably say something like

  • "wersja polska" ("Polish version") or
  • "w języku polskim" ("in the Polish language") or
  • "po polsku" ("in Polish")

Only the first one could probably be capitalized, as it makes a good title as well (none of them forms a proper sequence).

So probably we should really have "en français" and "po polsku". I guess lowercase "на русском языке" would be fine as well. I think some translations already include the word "language" ("Bahasa Indonesia" if I understand correctly).

But for many languages (certainly many Slavic ones) using some phrase (a preposition + language name, or three words) will be much better than just some grammatical form of the language name.

Maybe something similar to the abovementioned change on Commons should be attempted. Without it we can discuss capitalization of the first letter forever.

matmarex added a comment.Via ConduitJun 21 2012, 8:10 PM

I wouldn't agree with Marcin.

While it's true that "polski" is very rarely used as a noun in Polish (it usually functions as an adjective or an adverb, e.g. in expressions like "język polski" /the Polish language/ or "po polsku" /in Polish/), the usage in the interwiki list does not bother me.

Also, the interwiki list is "language nautral", using the simplest possible name of language in that language. Forms like "en français" or "po polsku" could be confusing to people who do not speak those languages. Forms like "français" or "polski" shouldn't be confusing to anyone.

Of course, if it was ever decided that this list should use translated language names, it could make sense to use forms such as "po polsku" or "po angielsku" in the Polish translation, and "Polish" and "English" in the English one.

(Disclaimer: I only speak Polish and English at a conversational level.)


And back on topic, I think that making the language names in the interwiki list variable-case is wrong. All other links in the sidebar start with uppercase letters; the fact that these are names of languages shouldn't matter here. This change is breaking the design.

saper added a comment.Via ConduitJun 21 2012, 9:26 PM

I think there is no such thing as "language nautrality" (neutrality?).

There is a reason why the list is not delivered in the current user interface language: people who don't know the current language should be able to switch to a language they understand. So the first and foremost goal is to reach out to people who can understand the target language.

ქართული, ไทย or русский are not there to be understood by non-speakers.

saper added a comment.Via ConduitJun 21 2012, 11:03 PM

(In reply to comment #6)

Look at "беларуская" and "Беларуская (тарашкевіца)".

Or "lietuvių" and "Žemaitėška". Samogitian language follow all syntax rules of
Lithuanian why should it be from capital letter?

I have attempted to fix this in gerrit change 12553. Comments welcome.

And a more general remark - I hope this bug is applicable only to SkinTemplate.php skins, since lowercase language names look better when squashed horizontally, as in Classic a.k.a. Standard or Nostalgia skins.

TheDJ added a comment.Via ConduitJul 2 2012, 10:06 PM

This doesn't work for "norsk (nynorsk)" and "norsk (bokmål)" btw, because that includes the LRE mark as the first character.
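
To make the failure concrete, here is a small sketch, not the code from any of the Gerrit changes in this thread. It assumes the name is stored as LRE (U+202A) + text + PDF (U+202C); only the leading LRE is confirmed above, the trailing PDF is an assumption. Both function names are hypothetical. The naive version uppercases the LRE mark, which has no uppercase form, so the name comes back unchanged; the second version skips leading directional controls first.

```php
<?php
const LRE = "\u{202A}"; // LEFT-TO-RIGHT EMBEDDING
const PDF = "\u{202C}"; // POP DIRECTIONAL FORMATTING (assumed to close the name)

// Naive approach: uppercase whatever the first character happens to be.
function naiveUcfirst(string $s): string {
    return mb_strtoupper(mb_substr($s, 0, 1, 'UTF-8'), 'UTF-8')
        . mb_substr($s, 1, null, 'UTF-8');
}

// Hypothetical variant: keep leading directional controls in place and
// uppercase the first character that follows them.
function ucfirstSkippingBidi(string $s): string {
    $pattern = '/^([\x{200E}\x{200F}\x{202A}-\x{202E}]*)(.)(.*)$/us';
    if (!preg_match($pattern, $s, $m)) {
        return $s; // empty string or invalid UTF-8: leave untouched
    }
    return $m[1] . mb_strtoupper($m[2], 'UTF-8') . $m[3];
}

$name = LRE . 'norsk (nynorsk)' . PDF;
var_dump(naiveUcfirst($name) === $name);        // the naive call changes nothing
var_dump(ucfirstSkippingBidi($name) !== $name); // the "n" is now uppercased
```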

SPQRobin added a comment.Via ConduitJul 5 2012, 1:09 AM

(In reply to comment #13)

This doesn't work for "norsk (nynorsk)" and "norsk (bokmål)" btw, because that
includes the LRE mark as the first character.

I think this wouldn't be a problem if we used CSS text-transform: capitalize;

matmarex added a comment.Via ConduitJul 5 2012, 10:04 AM

(In reply to comment #14)

I think this wouldn't be a problem if we used CSS text-transform: capitalize;

I don't think we can do it, since it uppercases the first character of every word, not just the very first one.

SPQRobin added a comment.Via ConduitJul 6 2012, 12:52 AM

(In reply to comment #15)

(In reply to comment #14)
> I think this wouldn't be a problem if we used CSS text-transform: capitalize;

I don't think we can do it, since it uppercases the first character of every
word, not just the very first one.

Not when using :first-letter

SPQRobin added a comment.Via ConduitJul 14 2012, 4:36 PM

*** Bug 38232 has been marked as a duplicate of this bug. ***

Nikerabbit added a comment.Via ConduitJul 14 2012, 4:56 PM

Can someone make a patch for the CSS fix proposal, so that we can check that it works in all common browsers?

matmarex added a comment.Via ConduitJul 14 2012, 5:24 PM

(In reply to comment #16)

(In reply to comment #15)
> (In reply to comment #14)
> > I think this wouldn't be a problem if we used CSS text-transform: capitalize;
>
> I don't think we can do it, since it uppercases the first character of every
> word, not just the very first one.

Not when using :first-letter

But the :first-letter is the LRE mark. I just tested and this doesn't work (at least on Opera 12).

saper added a comment.Via ConduitJul 14 2012, 5:34 PM

I think that the proper fix is to display HTML direction tags instead of Unicode characters; this way CSS has a chance to refer to the proper first character.

There are two possible solutions:

  1. Encapsulate language name output in some generic method (similar to wfMsgHtml) that generates HTML tags (when HTML is expected) instead of directional marks. This will probably be difficult as language names are not encapsulated as objects and we have things like (taken from Xml.php):

     foreach( $languages as $code => $name ) {
         $options .= Xml::option( "$code - $name", $code, ($code == $selected) ) . "\n";
     }

  2. Extend $coreLanguageNames to provide directionality as additional metadata and hack in the use of it whenever the name is used.
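
As a rough illustration of the first option, the sketch below turns a name wrapped in directional embedding characters into an HTML element with a dir attribute, so that the first character of the element's text is the first real letter and CSS such as :first-letter has something to act on. The function name languageNameToHtml is hypothetical and not part of MediaWiki, and the sketch assumes the name is closed by a PDF character (U+202C), which this thread does not confirm.

```php
<?php
// Hypothetical helper, not a MediaWiki API: replace a leading LRE/RLE
// embedding (and the assumed trailing PDF) with an HTML dir attribute.
function languageNameToHtml(string $name): string {
    $embeddings = [
        "\u{202A}" => 'ltr', // LEFT-TO-RIGHT EMBEDDING
        "\u{202B}" => 'rtl', // RIGHT-TO-LEFT EMBEDDING
    ];
    $first = mb_substr($name, 0, 1, 'UTF-8');
    if (!isset($embeddings[$first])) {
        // No directional mark: just escape the name for HTML output.
        return htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
    }
    $text = mb_substr($name, 1, null, 'UTF-8');
    // Drop the closing POP DIRECTIONAL FORMATTING character if present.
    if (mb_substr($text, -1, 1, 'UTF-8') === "\u{202C}") {
        $text = mb_substr($text, 0, -1, 'UTF-8');
    }
    return '<span dir="' . $embeddings[$first] . '">'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8')
        . '</span>';
}

echo languageNameToHtml("\u{202A}norsk (bokmål)\u{202C}"), "\n";
echo languageNameToHtml('polski'), "\n";
```

The second option would instead keep the stored names free of directional characters and carry the direction as separate metadata; the markup produced at output time could look the same.
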
Matanya added a comment.Via ConduitAug 12 2012, 10:04 AM

robin, since https://gerrit.wikimedia.org/r/#/c/11949/ was merged, can this be closed?

SPQRobin added a comment.Via ConduitAug 15 2012, 10:14 PM

(In reply to comment #21)

robin, since https://gerrit.wikimedia.org/r/#/c/11949/ was merged, can this be
closed?

Maybe, though we should still find a solution for those that start with an LRE mark.

David_Levy added a comment.Via ConduitSep 24 2012, 3:09 PM

lifeisunfair wrote:

Is the problem with the "norsk (nynorsk)" and "norsk (bokmål)" links (related to the use of the LRE mark as the first character) being worked on?

Amire80 added a comment.Via ConduitSep 24 2012, 8:20 PM

(In reply to comment #23)

Is the problem with the "norsk (nynorsk)" and "norsk (bokmål)" links (related
to the use of the LRE mark as the first character) being worked on?

I fixed this in https://gerrit.wikimedia.org/r/#/c/24888/ .

matmarex added a comment.Via ConduitOct 7 2012, 11:44 AM

I submitted a different, and I believe better, solution for this bug as https://gerrit.wikimedia.org/r/#/c/27039/, reverting the fix by Robin.

(It depends on Marcin's and Amir's fixes.)
