
Numbers in Odia (and other non-Latin characters) are not auto-converted to Latin in Wikidata
Open, Needs Triage | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Go to any Wikidata item (such as a person's page).
  • Add a new statement requiring a date, such as "date of birth".
  • Type the date in Odia digits, e.g. ୨୭ ସେପ୍ଟେମ୍ବର ୧୯୬୨ (27 September 1962).

What happens?:
A warning reading "The time value is malformed." appears.

What should have happened instead?:

A date entered with Odia digits and month names should be parsed and converted to a Gregorian date, since the Odia digits and months are already mapped on Translatewiki, in on-wiki templates, and, most importantly, on Wikidata.

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia): Wikidata

Other information (browser name/version, screenshots, etc.):

Event Timeline

psubhashish1 renamed this task from "Number in Odia (and other non-Latin characters) are not auto-converted to Latin in Wikidata" to "Numbers in Odia (and other non-Latin characters) are not auto-converted to Latin in Wikidata". Oct 25 2024, 7:25 PM

Unicode's data includes the numeric value for decimal digits (see chapter 4 of the standard and annex 44). As far as I can tell, that data covers all digits in MediaWiki's $digitTransformTable variables, except for Classical Chinese (MessagesLzh.php), and also covers over 40 other scripts not supported by MediaWiki.

It should be possible to use that data to convert most digits to ASCII when entering dates or quantities, regardless of the current interface language, and even for scripts that MediaWiki doesn't support.
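As a rough illustration of the idea, here is a minimal Python sketch. It is not MediaWiki or Wikibase code: the function name to_ascii_digits is hypothetical, and it relies on Python's bundled unicodedata module, which may lag behind the latest Unicode release. It replaces every character that has a Unicode decimal digit value with the matching ASCII digit and leaves everything else (month names, spaces, punctuation) untouched.

```python
import unicodedata


def to_ascii_digits(text: str) -> str:
    """Replace any Unicode decimal digit with its ASCII equivalent.

    Hypothetical helper for illustration only. Characters without a
    decimal digit value (letters, combining marks, superscripts such
    as U+00B2) are passed through unchanged.
    """
    out = []
    for ch in text:
        # unicodedata.decimal returns the default when ch is not a decimal digit.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)


# The date from the report: only the digits change, the month name stays.
print(to_ascii_digits("୨୭ ସେପ୍ଟେମ୍ବର ୧୯୬୨"))  # 27 ସେପ୍ଟେମ୍ବର 1962
```

This only covers the digits. Mapping the month name to a month number would still need the per-language data mentioned in the description.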

P72210 is a list extracted from https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt. P72211 is a more compact list with only the first and last codepoints for each set of digits.
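For readers without access to those pastes, a comparable compact table can be approximated as follows. This sketch does not reproduce P72211. It scans Python's bundled Unicode database (which may be older than the "latest" UnicodeData.txt) for runs of ten consecutive code points whose decimal values are 0 through 9, and assumes digit sets are encoded that way. The names digit_ranges and digit_value are hypothetical.

```python
import sys
import unicodedata


def digit_ranges():
    """Return (first, last) code point pairs for each run of digits 0-9."""
    ranges = []
    cp = 0
    last_start = sys.maxunicode - 9  # keeps cp + 9 within the valid range
    while cp <= last_start:
        # A digit set is ten consecutive code points valued 0, 1, ..., 9.
        if all(unicodedata.decimal(chr(cp + i), None) == i for i in range(10)):
            ranges.append((cp, cp + 9))
            cp += 10
        else:
            cp += 1
    return ranges


def digit_value(ch, ranges):
    """Look up a digit's value from the compact table, or None if absent."""
    cp = ord(ch)
    for first, last in ranges:
        if first <= cp <= last:
            return cp - first
    return None


RANGES = digit_ranges()
print(len(RANGES), "digit sets found")
print(hex(RANGES[0][0]), hex(RANGES[0][1]))  # the ASCII digits come first
```

With only the first and last code point stored per set, a digit's value is simply its offset from the first code point, which keeps the table small.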

T338115 is a related ticket, which is about years not being output correctly.