Numerals for years are not converted in date statements
Open, Needs Triage · Public · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):
Visit:

What happens
Numerals in date statements are converted for the day but not for the year. For example, when viewing https://test.wikidata.org/wiki/Q167276 in different languages, "12" is converted whereas "2023" is not.

What should happen
All the numerals, including the year, should be converted into the language of the UI.

For example, if the UI is set to Arabic, "2023" should be shown as "٢٠٢٣", and in Hindi it should be shown as "२०२३".
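
To make the expected output concrete, here is a minimal Python sketch of the digit conversion. It is an illustration only, not Wikibase code (Wikibase is PHP): the helper name localise_digits and the language-to-digit table are assumptions made up for this example.

```python
# Illustration only: hypothetical helper, not part of Wikibase or MediaWiki.
# Digit tables are built from code points to avoid any ambiguity about
# character order: U+0660.. are Arabic-Indic digits, U+0966.. are Devanagari.
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))
DEVANAGARI_DIGITS = "".join(chr(0x0966 + i) for i in range(10))

# Assumed mapping from UI language code to digit set (only two languages here).
DIGITS_BY_LANGUAGE = {
    "ar": ARABIC_INDIC_DIGITS,
    "hi": DEVANAGARI_DIGITS,
}


def localise_digits(text: str, language: str) -> str:
    """Replace ASCII digits in text with the digits of the given language.

    Languages without an entry in DIGITS_BY_LANGUAGE are returned unchanged.
    """
    digits = DIGITS_BY_LANGUAGE.get(language)
    if digits is None:
        return text
    return text.translate(str.maketrans("0123456789", digits))


if __name__ == "__main__":
    print(localise_digits("2023", "ar"))
    print(localise_digits("2023", "hi"))
```

With this sketch, the day and the year would be converted consistently, which is the behaviour requested above.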

Event Timeline

This is "expected" behaviour in the sense that we document this failure mode in the code:

https://gerrit.wikimedia.org/g/mediawiki/extensions/Wikibase/+/7d20458b7ab7974f50b05c04ddb9cfbb69aff7a1/lib/includes/Formatters/MwTimeIsoFormatter.php#287

// TODO: The year should be localized via Language::formatNum() at this point, but currently
// can't because not all relevant time parsers unlocalize numbers.

The list of parsers can be found in TimeParserFactory:
https://gerrit.wikimedia.org/g/mediawiki/extensions/Wikibase/+/7d20458b7ab7974f50b05c04ddb9cfbb69aff7a1/repo/includes/Parsers/TimeParserFactory.php#68

and seems to include at least:

  • CalendarModelParser
  • DateFormatParser
  • IsoTimestampParser
  • MwEraParser
  • MwTimeIsoParser
  • PhpDateTimeParser
  • YearMonthDayTimeParser
  • YearMonthTimeParser
  • YearTimeParser

To be relatively sure that values that we correctly format with localised years can be parsed again, we would need to ensure that at least all of these parsers can handle localised year numbers.
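
The round-trip concern can be sketched in Python as follows. This is not the actual Wikibase implementation: unlocalise_digits and parse_year_ascii_only are hypothetical names, and the ASCII-only parser merely stands in for a time parser that does not understand localised digits.

```python
import re
import unicodedata


def unlocalise_digits(text: str) -> str:
    """Replace every Unicode decimal digit with its ASCII equivalent.

    Characters that are not decimal digits are left untouched.
    """
    return "".join(
        str(unicodedata.decimal(ch)) if ch.isdecimal() else ch
        for ch in text
    )


def parse_year_ascii_only(text: str) -> int:
    """Hypothetical stand-in for a parser that only accepts ASCII digits."""
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        raise ValueError(f"cannot parse year: {text!r}")
    return int(text)


if __name__ == "__main__":
    localised_year = "\u0662\u0660\u0662\u0663"  # "2023" in Arabic-Indic digits

    # Without unlocalising first, the ASCII-only parser rejects the input.
    try:
        parse_year_ascii_only(localised_year)
    except ValueError as error:
        print("rejected:", error)

    # Unlocalising before parsing makes the round trip succeed.
    print(parse_year_ascii_only(unlocalise_digits(localised_year)))
```

Normalising digits once, before any parser sees the input, avoids having to teach each parser in the list about localised numerals individually.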

Change #1055414 had a related patch set uploaded (by Arthur taylor; author: Arthur taylor):

[mediawiki/extensions/Wikibase@master] Unlocalise digits before handing them to the TimeParser

https://gerrit.wikimedia.org/r/1055414

Thanks for this Arthur, let's deprioritise this one for now then

Question mainly for @Arian_Bozorg: I wonder if we should announce this in some way? Is it a significant change? (Note: adding a feature flag for this would be a bit annoying, so I’d rather not do that. But we could e.g. merge and announce it on a Tuesday and then tell people that it’s available for testing on the Beta cluster until the next week, when it rolls out with the train.)

Yes, as it's a new format, I would say this would be a significant change here. I'll get in touch with @Mohammed_Sadat_WMDE to coordinate an announcement

There is a common misconception that the numerals currently used in Arabic writing (١, ٢, ٣, ٤, ٥, ٦, ٧, ٨, ٩, ٠) are called "Arabic numerals." However, these numerals are historically known as "Indian numerals" because they are derived from the Indian numbering system.

The original "Arabic numerals," also known as Ghubār numerals, are: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. These are the numerals that were transmitted from the Islamic world to Europe during the Middle Ages and became the basis for the numerals used globally today.

Therefore, it is more accurate to refer to the numerals used in everyday Arabic texts as Indian numerals, while the numerals used globally today are considered the original Arabic or Ghubār numerals.

Therefore, I request that ١٢ يناير 2023 be changed to 12 يناير 2023.

Linking T368193 here since this change would bring us into line with the behaviour in MediaWiki, against which this bug / feature request ticket already exists.

@Amire80 just wanted to follow up on this and see if you have some advice on how we should approach this?

+1, the patch is good. I tried it with Arabic, and reparsing works well at different precisions.

@Amire80 can you advise on this?

It’s been almost a year with no response… I’m inclined to say we should just merge the change. If some Arabic speakers want to use different numerals throughout MediaWiki, that’s a wider issue (already tracked at T368193); I don’t think it’s a valid reason to block this change, which after all is just a tiny fragment of the numerals seen on Wikimedia wikis (even the day part of date values still uses MediaWiki’s preferred numerals!), and which should also benefit other languages.

+1