
Improve language-territory information for Norway (better use of GeoIP)
Closed, Invalid (Public)

Description

The set of languages is not optimal for Norway and should be adjusted, both to the official languages of Norway and to the languages typically learned in schools. Requests for updates have been sent upstream, as I'm told that the Compact Language List uses CLDR: Territory-Language Information. If they can't provide better results, then we should add our own additional data.

The requests sent upstream are available as

  • #9646 Southern Sami (sma) [OR]
  • #9647 Lule Sami (smj) [OR]
  • #9648 Kven (fkv) [OR]
  • #9649 Romanes (rom, romani language) [OR, lacks number of speakers]
  • #9650 Romani rakripa (rom, scandoromani language) [OR, lacks source for number of speakers]
  • #9651 English
  • #9652 German [no data on language proficiency; Eurostat for language learning]
  • #9653 French [no data on language proficiency; Eurostat for language learning]
  • #9654 Spanish [no data on language proficiency; no official data on language learning yet]

Languages from nearby countries are added as preferred languages for the project in question. That is, Nynorsk, Danish, and Swedish will be on the language preference list; see T138973: Show languages that a wiki community added to the top of the interlanguage list in the Compact list. There will be a request for additional languages for Norwegian (bokmål) Wikipedia. There is already one for Norwegian (nynorsk) Wikipedia.
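As a rough illustration of the combination described above, here is a minimal Python sketch. It is not the Compact Language List implementation: the function name `compact_language_list`, the `limit` parameter, the ordering (community-preferred languages first), and the sample territory data are all assumptions made for this example. Only the language codes themselves come from this task.

```python
def compact_language_list(territory, territory_languages, community_languages, limit=9):
    """Combine community-preferred languages with territory-based ones.

    Hypothetical helper: the ordering and the limit are assumptions,
    not the behaviour of the real Compact Language List.
    """
    ordered = []
    candidates = list(community_languages) + territory_languages.get(territory, [])
    for code in candidates:
        # Keep the first occurrence of each code so the list has no duplicates.
        if code not in ordered:
            ordered.append(code)
    return ordered[:limit]


# Hypothetical territory-language data for Norway, for illustration only;
# this is NOT the content of the CLDR supplemental data.
territory_languages = {"NO": ["nb", "nn", "en", "sma", "smj", "fkv"]}

# Languages a wiki community might add as preferred (Nynorsk, Danish, Swedish).
community_languages = ["nn", "da", "sv"]

result = compact_language_list("NO", territory_languages, community_languages)
print(result)
```

With better territory data upstream, only the `territory_languages` input would change; the community-preferred languages stay a separate, per-wiki input.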

Event Timeline

Restricted Application added a subscriber: Aklapper.
Nemo_bis renamed this task from "Make result for geo-ip more sane for Norwegian" to "Improve language-territory information for Norway (better use of GeoIP)". Aug 2 2016, 6:30 AM
Nemo_bis assigned this task to jeblad.
Nemo_bis triaged this task as Medium priority.
Nemo_bis added projects: Upstream, I18n.
Nemo_bis updated the task description.

Thanks for looking into language-territory data. Whenever possible, try to use official data such as Eurostat. There is an overview at http://ec.europa.eu/eurostat/statistics-explained/index.php/Foreign_language_learning_statistics , a very useful special at http://ec.europa.eu/public_opinion/archives/ebs/ebs_386_en.pdf and Eurydice http://eacea.ec.europa.eu/education/Eurydice/documents/key_data_series/143EN.pdf , and quite a bit of detailed data (though not always what we actually need) around http://ec.europa.eu/eurostat/web/education-and-training/data/database .

You may also want to move the reports to their correct component i.e. "supplemental".

P.s.: For now I created http://unicode.org/cldr/trac/ticket/9680 which should address the top 3 languages for each country, it took a while...

Norway isn't part of EC, and as such statistics from EC would not be official for Norway.

> Norway isn't part of EC, and as such statistics from EC would not be official for Norway.

(I suppose you mean EU, not EC.) I think you're confused about Eurostat: it covers Norway too, because the ESS includes all EU and EFTA states. See http://ec.europa.eu/eurostat/web/european-statistical-system/overview

EEC, EC, EU, same thing.

Use official statistics for Norway; don't insist on using statistics from other external providers. Eurostat is an external provider of statistics.

This holds for other countries too: use the primary official provider unless there is some specific reason not to.

> EEC, EC, EU, same thing.

Not quite.

> Use official statistics for Norway

The sources you mentioned so far include only private websites and https://www.norden.org (which is not a statistics entity, as far as I know). I just gave some advice to improve the sources or add some source where you didn't find any.

You are free to follow the advice or ignore it, as well as to provide sources which you feel are better or "more official" or to just leave your upstream tickets unsourced, but it's pointless to lecture me on your personal beliefs about the status of various entities.

"Samisk, kvensk, romanes og romani er definert som minoritetsspråk i Norge, og dermed beskyttet av Minoritetsspråkpakten." (Sami, Kven, Romanes, and Romani are defined as minority languages in Norway, and are thereby protected by the Minority Language Charter.) Statement from the Norwegian Government, from Regjeringen.no: Minoritetsspråkpakten.

Some links for Sami languages

On number of Kven people

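"Det er usikkert hvor mange kvener/norskfinner det er i Norge i dag, siden vi ikke registrerer personers etnisitet. 10 000-15 000 er et anslag som ofte benyttes i offentlige dokumenter, men det kan være altfor lavt. Antallet som snakker kvensk, er mye lavere og har sunket dramatisk de siste generasjonene." (It is uncertain how many Kvens/Norwegian Finns there are in Norway today, since we do not register people's ethnicity. 10,000-15,000 is an estimate often used in public documents, but it may be far too low. The number who speak Kven is much lower and has fallen dramatically over the last generations.)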

Utdanningsdirektoratet: Kvener/norskfinner - Kvenene/norskfinnene i dag

On the language situation

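"Høsten 2013 var det 594 elever som valgte finsk som 2. språk, men det er en stor nedgang siden skoleåret 2001/2002, da det var 1073 elever." (In autumn 2013 there were 594 pupils who chose Finnish as a second language, but that is a large decline since the 2001/2002 school year, when there were 1,073 pupils.)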

Utdanningsdirektoratet: Kvener/norskfinner - Språk

(CLDR is still waiting for better sources.)

When someone doesn't even recognize the Norwegian and Sami governments as valid sources, … ;p

It is my understanding that some kind of update is coming…

Not sure why this has been assigned to me. I am not in a position to implement any of this; I have only said this should be fixed.

Seems like unicode.org doesn't accept or even recognize the Norwegian and Sami governments as valid sources, so I guess this won't happen.

Ok, closing then per lack of reliable and comparable sources (http://sami-statistics.info/ is even offline now btw). Can be reopened when we find suitable sources (http://cldr.unicode.org/index/lang-pop-data-updates says «Recent government or NGO-sponsored census data are typically better sources»).