
Better fallback handling in Kartotherian's language selection
Open, Needs Triage, Public


Some time ago, we asked in T112948 for map location names to be displayed in the user's language. This works, but only if the OpenStreetMap (OSM) database has a name for the location in exactly the user's language. There is a large gap between the roughly 250 Wikipedia languages and the few language variants available in OSM. Given the sheer number of locations, nobody is going to translate millions of street names into every possible language. Wikidata, on the other hand, usually knows nothing about these objects.

That is why I propose extending the fallback handling in Kartotherian's language selection, based on writing systems (scripts) and weights. For example, a French reader of an article about an Israeli location will find map labels in Latin script more convenient than labels in Hebrew. A French translation is usually not available, but an English one is. In that case an English label would be a better fallback than a Hebrew one.

We solved a similar problem on German Wikivoyage to present suitable street names in location listings. The basis is a table like this. Among other things, the table contains the Wikidata entity ID of each language's writing system and a weight derived from the number of Wikipedia edits, as a measure of the language's importance. We obtained the writing-system entity IDs from a SPARQL query plus some manual additions.
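As a rough illustration only, such a table could be modelled as below. This is not the actual Wikivoyage table: the class name `LanguageEntry`, the helper `languages_by_script`, and all weights are made up for the example, and the short script codes ("Latn", "Cyrl", ...) are placeholders standing in for the Wikidata entity IDs of the writing systems.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LanguageEntry:
    code: str    # language code, e.g. "fr"
    script: str  # placeholder for the Wikidata entity ID of the writing system
    weight: int  # hypothetical value; the real weight derives from Wikipedia edit counts


# Hypothetical excerpt of the table; every weight here is invented.
TABLE = [
    LanguageEntry("en", "Latn", 100),
    LanguageEntry("de", "Latn", 90),
    LanguageEntry("fr", "Latn", 80),
    LanguageEntry("ru", "Cyrl", 70),
    LanguageEntry("uk", "Cyrl", 40),
    LanguageEntry("bg", "Cyrl", 20),
    LanguageEntry("ar", "Arab", 50),
    LanguageEntry("fa", "Arab", 35),
    LanguageEntry("he", "Hebr", 30),
]


def languages_by_script(table: List[LanguageEntry]) -> Dict[str, List[LanguageEntry]]:
    """Group the entries by writing system, highest weight first."""
    groups: Dict[str, List[LanguageEntry]] = defaultdict(list)
    for entry in table:
        groups[entry.script].append(entry)
    for entries in groups.values():
        entries.sort(key=lambda e: e.weight, reverse=True)
    return dict(groups)
```

Grouping by writing system up front means that the languages sharing a script with the user's language can be looked up directly, already ordered by weight.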

We look for a suitable label in the OSM database in several steps:

  1. Search for a label in the user's language.
  2. Search for labels in languages that use the same writing system as the user's language. For instance, French uses the same writing system as English or German, Bulgarian the same as Russian or Ukrainian, and Farsi the same as Arabic.
  3. If there are multiple results, select the language with the highest weight.
  4. Additional fallbacks could be English or French labels, because many readers can read and speak these languages.
  5. The last fallback is the label in the official language, as it is now.
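The steps above could be sketched roughly as follows. This is a minimal sketch, not Kartotherian code: `pick_label`, `LANGUAGES`, and `COMMON_FALLBACKS` are hypothetical names, the script codes and weights are invented placeholders, and the sketch assumes that labels arrive as OSM-style tags (`name` for the local name, `name:<lang>` for translations).

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: language code -> (script placeholder, invented weight).
LANGUAGES: Dict[str, Tuple[str, int]] = {
    "en": ("Latn", 100),
    "de": ("Latn", 90),
    "fr": ("Latn", 80),
    "ru": ("Cyrl", 70),
    "uk": ("Cyrl", 40),
    "bg": ("Cyrl", 20),
    "ar": ("Arab", 50),
    "fa": ("Arab", 35),
    "he": ("Hebr", 30),
}

# Step 4: widely understood languages, tried in this order.
COMMON_FALLBACKS = ("en", "fr")


def pick_label(tags: Dict[str, str], user_lang: str) -> Optional[str]:
    """Return the best label for user_lang from OSM-style name tags."""
    # Step 1: a label in the user's own language.
    label = tags.get(f"name:{user_lang}")
    if label:
        return label

    # Steps 2 and 3: labels in languages sharing the user's writing system;
    # among several candidates, take the language with the highest weight.
    user_entry = LANGUAGES.get(user_lang)
    if user_entry is not None:
        user_script = user_entry[0]
        candidates = [
            (weight, lang)
            for lang, (script, weight) in LANGUAGES.items()
            if script == user_script and tags.get(f"name:{lang}")
        ]
        if candidates:
            _, best_lang = max(candidates)
            return tags[f"name:{best_lang}"]

    # Step 4: English or French as generally readable fallbacks.
    for lang in COMMON_FALLBACKS:
        label = tags.get(f"name:{lang}")
        if label:
            return label

    # Step 5: the local (official-language) name, as today.
    return tags.get("name")


tags = {"name": "תל אביב", "name:en": "Tel Aviv", "name:de": "Tel Aviv-Jaffa"}
print(pick_label(tags, "fr"))  # prints "Tel Aviv"
```

In this example a French reader gets the English label, because English shares the Latin script with French and carries the highest weight among the available candidates.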