Based on the investigation in T192700: Investigate an optimized language-fallback system for Maps internationalization, and on continued testing, we've implemented a per-label language fallback system for maps. The fallback system is designed for maps that are embedded in informational articles on wikis in different languages. This purpose has been the main driver for how language fallback behaves for each label, and the behavior will be tweaked if needed.
The algorithm
The fallback stages are as follows, where the process stops when a value is found:
- Look for value in the requested language
- Look for value in the language (or languages) specifically defined as fallbacks for the requested language
- Look for a transliterated value
- Look for label in the local language
If no value is found, display no label.
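The stages above can be sketched as a single ordered lookup. This is a minimal illustration, not the kartotherian/babel implementation: the function name pick_label, its parameters, and the assumption that labels arrive as a mapping keyed like name:<code> (with plain name holding the local value) are all hypothetical.

```python
from typing import List, Mapping, Optional, Sequence


def pick_label(
    tags: Mapping[str, str],
    requested: str,
    fallbacks: Sequence[str] = (),
    transliterated_keys: Sequence[str] = (),
) -> Optional[str]:
    """Walk the four stages in order and stop at the first value found.

    All names here are hypothetical; `tags` is assumed to look like
    {"name": "...", "name:en": "...", "name:ja_rm": "..."}.
    """
    candidates: List[str] = [f"name:{requested}"]          # 1. requested language
    candidates += [f"name:{code}" for code in fallbacks]   # 2. declared fallbacks
    candidates += list(transliterated_keys)                # 3. transliterated values
    candidates.append("name")                              # 4. local language
    for key in candidates:
        value = tags.get(key)
        if value:
            return value
    return None  # nothing found: display no label


tokyo = {"name": "東京", "name:en": "Tokyo", "name:ja_rm": "Tōkyō"}
print(pick_label(tokyo, "en"))
print(pick_label(tokyo, "fr", transliterated_keys=["name:ja_rm"]))
```

Keeping the stages as one ordered candidate list makes the "stop at the first value" rule explicit and makes it easy to see which stage produced a given label.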
MediaWiki itself already has a language fallback system for its interface translations; several languages declare fallback languages to use when a specific translation is not found. We have collected this fallback structure into a JSON file and use it as the first fallback stage after the requested language.
Specifications for the stages
Each stage follows the principle that the desired map is intended for an informational article (rather than for travel needs, as with Google Maps and similar services) and relies on the language information and fallbacks that are already used in MediaWiki in general.
1. Value in the requested language
By default, this language is the language of the wiki. It is possible to override this setting in <mapframe> with the lang= parameter.
For example, the map below:
<mapframe text="Downtown [[wikipedia:San Francisco|San Francisco]]" width=250 height=250 zoom=13 latitude=37.8013 longitude=-122.3988 />
If this map is posted in English Wikipedia, the default requested language (for any region specified) would be English (en). If the map is posted on Hebrew Wikipedia, the default language (for any region specified) would be Hebrew (he).
However, adding lang="es" would override the requested language to request the labels in Spanish, no matter where the map is posted.
(This will be available soon) Adding lang="local" will force the system to ignore any requested language and display all labels in the local languages. This reproduces the behavior that existed before the i18n improvement.
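The rules for choosing the requested language can be summarized in a few lines. This is only a sketch under stated assumptions: the function name, its arguments, and the use of None to signal "skip straight to local labels" are hypothetical and do not describe how Kartographer or kartotherian/babel handle the parameter internally.

```python
from typing import Optional


def resolve_requested_language(
    wiki_lang: str, lang_param: Optional[str] = None
) -> Optional[str]:
    """Return the language code to request labels in.

    Hypothetical helper: returns None when lang="local" is given, meaning
    "ignore any requested language and show local labels".
    """
    if not lang_param:
        return wiki_lang      # default: the language of the wiki
    if lang_param == "local":
        return None           # force local-language labels
    return lang_param         # explicit override, e.g. lang="es"


print(resolve_requested_language("en"))
print(resolve_requested_language("he", "es"))
print(resolve_requested_language("en", "local"))
```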
2. Value in specified language fallback
MediaWiki already uses language fallbacks, especially for UI translations. Not all languages have declared fallbacks, but those that do have ones that were specified by the communities and the Language Team. We have collected those directly from MediaWiki into a fallback JSON file: https://github.com/kartotherian/babel/blob/master/lib/fallbacks.json
If no value is found in the requested language, the system will look in that file to see if there are official fallback languages, and will attempt to get values in those languages.
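As an illustration of this stage, the sketch below looks up the fallback chain for the requested language and returns the first value found. The shape of the table (a language code mapped to an ordered list of codes) is an assumption, the two entries are illustrative and not copied from fallbacks.json, and the function name and the name:<code> tag layout are hypothetical.

```python
from typing import Dict, List, Mapping, Optional

# Illustrative entries only; the real data lives in fallbacks.json and its
# exact structure is assumed here, not verified.
FALLBACKS: Dict[str, List[str]] = {
    "yi": ["he"],
    "gl": ["pt"],
}


def label_from_fallbacks(
    tags: Mapping[str, str],
    requested: str,
    fallbacks: Mapping[str, List[str]] = FALLBACKS,
) -> Optional[str]:
    """Try each declared fallback language in order; None if none has a value."""
    for code in fallbacks.get(requested, []):
        value = tags.get(f"name:{code}")
        if value:
            return value
    return None


jerusalem = {"name:he": "ירושלים", "name:en": "Jerusalem"}
print(label_from_fallbacks(jerusalem, "yi"))  # uses the Hebrew value
print(label_from_fallbacks(jerusalem, "fr"))  # no declared fallbacks in this table
```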
3. Transliterated values
In OSM, some values are specified with a script suffix, such as -Latn, -Cyrl and -Arab, and some hold romanized data, marked with the _rm suffix.
In this step, we look at the script of the originally requested language (note: not the fallback languages, though those usually share the same script) and then look for any value whose suffix matches that script. If the requested language uses the Latin script, we add a sub-step where we also explicitly ask for romanized versions, like ja_rm for romanized Japanese or ko_rm for romanized Korean (both of these are part of the top 25 translated labels list).
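A minimal sketch of this step follows. It assumes a small hypothetical table mapping language codes to script codes, tags keyed like name:<code>, and alphabetical key order as the tie-breaker when several values match; none of these details are taken from kartotherian/babel.

```python
from typing import Mapping, Optional

# Hypothetical script table; a real system would need a complete mapping.
SCRIPT_OF = {"en": "Latn", "es": "Latn", "ru": "Cyrl", "ar": "Arab"}


def transliterated_label(
    tags: Mapping[str, str],
    requested: str,
    script_of: Mapping[str, str] = SCRIPT_OF,
) -> Optional[str]:
    """Find a value whose key carries the requested language's script suffix."""
    script = script_of.get(requested)
    if script is None:
        return None
    suffix = f"-{script}"
    # Sorted only to make the choice deterministic in this sketch.
    for key in sorted(tags):
        if key.startswith("name:") and key.endswith(suffix) and tags[key]:
            return tags[key]
    if script == "Latn":
        # Extra sub-step for Latin-script requests: romanized values (_rm).
        for key in sorted(tags):
            if key.startswith("name:") and key.endswith("_rm") and tags[key]:
                return tags[key]
    return None


print(transliterated_label({"name": "東京", "name:ja_rm": "Tōkyō"}, "en"))
```

Note that the script is derived from the requested language only, so a Cyrillic-script request never picks up an _rm value.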
4. Local language
If no label was found so far, we fall back to what OSM defines as the local language.
In some cases, OSM data has a local-language value that is not also tagged with the language code it actually belongs to. For example, a label in the USA (where the local language is considered English) may have the local value "Portland" but no value that is specifically declared to be in English. (More specifically, the name field is filled in but name:en is not.)
See T192662: name:<local name code> is not always available in OSM
This is one of the reasons why we are trying not to be too forceful in creating more and more overrides before falling back to the language the system considers local.
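The "Portland" case can be made concrete with a short sketch. It assumes tags keyed as name and name:<code>, and the helper name is hypothetical; the point is only that a lookup restricted to language-coded keys would find nothing, while the local value is still available.

```python
from typing import Mapping, Optional


def local_label(tags: Mapping[str, str]) -> Optional[str]:
    """Final stage: use the untagged local value, if any (hypothetical helper)."""
    return tags.get("name") or None


# A feature with only the local value filled in, as described above.
portland = {"name": "Portland"}

print(portland.get("name:en"))  # None: no value explicitly declared as English
print(local_label(portland))    # the local value is still usable
```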
Code
- kartotherian/babel package does the language manipulations and fallbacks: https://github.com/kartotherian/babel
- The fallbacks.json file assembled from MediaWiki fallbacks: https://github.com/kartotherian/babel/blob/master/lib/fallbacks.json