
Punctuation is often language dependent
Open, Medium, Public

Description

Hello. On Russian Wikipedia we use an em dash (—) rather than an en dash (–) for ranges of values. Is it possible to change this for page ranges when the source is autogenerated?

Also, French and Russian use a different kind of quote: «

Event Timeline

This is kind of a tough one. We don't really have multilingual support in the backend; we automatically request things using the language code of the wiki, but often that language isn't available and the result comes back in English. One workaround would be to avoid punctuation altogether by having a startpage and an endpage field instead, but a lot of templates don't have those parameters.

It's worth thinking about how we could make citoid more multilingual.
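To make the startpage/endpage workaround mentioned above concrete, here is a minimal sketch. It is not Citoid code: `split_pages` and the separator pattern are hypothetical names made up for illustration, and it assumes the page range arrives as a single string.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper (not part of Citoid): split a pages string into
# separate start and end values so that no dash has to be stored at all.
# The pattern accepts a hyphen, an en dash (U+2013) or an em dash (U+2014).
_RANGE_SEPARATOR = re.compile(r"\s*[-\u2013\u2014]\s*")


def split_pages(pages: str) -> Tuple[str, Optional[str]]:
    """Return (startpage, endpage); endpage is None for a single page."""
    cleaned = pages.strip()
    parts = _RANGE_SEPARATOR.split(cleaned, maxsplit=1)
    if len(parts) == 2 and parts[0] and parts[1]:
        return parts[0], parts[1]
    # Not a recognisable range: keep the value untouched as the start page.
    return cleaned, None
```

A template that has both parameters could then join the two values with whatever dash its own wiki prefers; templates without them would still need the single pages string.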

Mvolz renamed this task from "Change a type of dash in Russian Wikipedia" to "Punctuation is often language dependent". Apr 3 2017, 10:37 AM
Mvolz updated the task description.

Another possibility is to use the "accept-language" parameter and try to apply some rules of our own in the back end, so if we get accept-language = 'ru', replace the en dash with em dash, replace " with « if accept-language = 'fr', etc.
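A rough sketch of what such back-end rules might look like, under stated assumptions: the rule tables, `primary_language`, `localise_pages` and `localise_quotes` are all hypothetical names, the table contents come only from this task, and the quote replacement naively alternates opening and closing marks.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"

# Hypothetical per-language rules taken from this task, not from Citoid.
RANGE_DASH = {"ru": EM_DASH}
QUOTES = {"fr": ("\u00ab", "\u00bb"), "ru": ("\u00ab", "\u00bb")}  # « and »


def primary_language(accept_language: str) -> str:
    """Reduce a value like 'ru-RU,ru;q=0.9,en;q=0.8' to 'ru'."""
    first = accept_language.split(",")[0].split(";")[0].strip()
    return first.split("-")[0].lower()


def localise_pages(pages: str, accept_language: str) -> str:
    """Swap the en dash in a page range for the language's preferred dash."""
    dash = RANGE_DASH.get(primary_language(accept_language), EN_DASH)
    return pages.replace(EN_DASH, dash)


def localise_quotes(text: str, accept_language: str) -> str:
    """Replace straight double quotes, alternating opening and closing marks."""
    pair = QUOTES.get(primary_language(accept_language))
    if pair is None:
        return text
    open_q, close_q = pair
    out, opening = [], True
    for ch in text:
        if ch == '"':
            out.append(open_q if opening else close_q)
            opening = not opening
        else:
            out.append(ch)
    return "".join(out)
```

Languages without an entry fall through unchanged, so English output stays as it is today. The obvious cost is that every new language means another hand-maintained rule.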

Mvolz triaged this task as Medium priority. Apr 3 2017, 10:40 AM

In RESTBase we already set the language based on the domain the request is coming from before calling Citoid, which means that the language is always known. It follows that the translators don't fully obey it.

> Another possibility is to use the "accept-language" parameter and try to apply some rules of our own in the back end, so if we get accept-language = 'ru', replace the en dash with em dash, replace " with « if accept-language = 'fr', etc.

We could go down that rabbit hole. An alternative would be to see if the Zotero translators could support this (and we would need to support it in html-metadata as well then).

I have no idea what this task is about as I do not use VE, but Ukrainian also uses — instead of – in all situations, and «» for quotations («“”» for two levels). In MediaWiki, localisation of things like this is done on TWN in non-compulsory messages; I guess you should use that rather than hardcoding typography for specific languages.

TWN is not suitable in this situation because each citation is unique,
generated on the fly, and it's unnecessary because it's inserted into
wikitext. Users can simply fix the incorrect dash just by editing the
wikitext. But this is slightly inconvenient, which we are trying to
alleviate :).

Worth noting that on en wiki, the citation template renders the – in pages
as —. This might be a solution that the citation templates on ru wiki can
use: just render the correct dash instead of the incorrect one that's in
the wikitext?

Each citation is unique, OK, but I do not see why that should prevent having something like "Mediawiki:Visualeditorwhatever-citoid-dashbetwixtnums" as a non-compulsory localisable message, which would then be used in those situations instead of a hardcoded ndash somewhere.

> Each citation is unique, OK, but I do not see why that should prevent having something like "Mediawiki:Visualeditorwhatever-citoid-dashbetwixtnums" as a non-compulsory localisable message, which would then be used in those situations instead of a hardcoded ndash somewhere.

Yep, I think it is a really good idea.
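
For illustration, the localisable-message idea could be sketched roughly as below. This is only a stand-in: the dictionaries imitate a message store with optional per-language overrides, and the key `citoid-page-range-dash`, `message` and `format_page_range` are made-up names, not real MediaWiki or Citoid identifiers.

```python
# Hypothetical stand-in for non-compulsory localisable messages.
# The default applies whenever a language has not overridden the key.
DEFAULT_MESSAGES = {"citoid-page-range-dash": "\u2013"}  # en dash

OVERRIDES = {
    "ru": {"citoid-page-range-dash": "\u2014"},  # em dash
    "uk": {"citoid-page-range-dash": "\u2014"},  # em dash
}


def message(key: str, lang: str) -> str:
    """Look up a message for lang, falling back to the default when absent."""
    return OVERRIDES.get(lang, {}).get(key, DEFAULT_MESSAGES[key])


def format_page_range(start: str, end: str, lang: str) -> str:
    """Join two page numbers with the dash configured for lang."""
    return f"{start}{message('citoid-page-range-dash', lang)}{end}"
```

The point of this shape is that the dash lives in data each community can override, so nothing about a specific language is hardcoded in the citation code itself.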