Be more lenient with whitespace when parsing date/time values
Open, Needs Triage, Public

Description

As a Wikidata editor speaking German, I want to be able to enter dates BCE in normal German.

Problem:
In German, the interface message wikibase-time-precision-BCE, to format year-precision BCE dates, is defined as

$1&nbsp;v.&nbsp;u.&nbsp;Z.

When Wikibase uses this message to parse dates (tech note: see MwTimeIsoParser::getRegexpsFromMessageText()), it requires the &nbsp; to be literally present in the input. It should, at the very least, also accept actual NBSP characters; and really, it should accept normal spaces too (probably \s+ in regex terms).

Example:
1&nbsp;v.&nbsp;u.&nbsp;Z. parses, 1 v. u. Z. doesn’t.

Hungarian is also affected (e.g. the input i. e. 1, where “i. e.” is the Hungarian BCE abbreviation, not “that is”).
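
To make the problem concrete, here is a minimal Python sketch of the difference between escaping the message literally and turning each &nbsp; into \s+. It is not the actual MwTimeIsoParser code (which is PHP). The function names are hypothetical, and the message text is assumed to be the German one quoted above.

```python
import re

# Assumed message text, as quoted in the description (entities kept as-is).
MESSAGE_DE = "$1&nbsp;v.&nbsp;u.&nbsp;Z."


def strict_regex(message: str) -> re.Pattern:
    """Escape the message literally; only the $1 placeholder becomes a group.

    This mirrors the reported behaviour: the input must contain the
    literal text "&nbsp;" to match.
    """
    before, after = message.split("$1", 1)
    return re.compile(re.escape(before) + r"(\d+)" + re.escape(after))


def lenient_regex(message: str) -> re.Pattern:
    """Same idea, but every &nbsp; in the message becomes \\s+."""
    parts = [re.escape(part) for part in message.split("&nbsp;")]
    pattern = r"\s+".join(parts)
    # re.escape turned "$1" into "\$1"; swap that for a digit group.
    return re.compile(pattern.replace(re.escape("$1"), r"(\d+)"))


strict = strict_regex(MESSAGE_DE)
lenient = lenient_regex(MESSAGE_DE)

print(bool(strict.fullmatch("1&nbsp;v.&nbsp;u.&nbsp;Z.")))   # literal entity
print(bool(strict.fullmatch("1 v. u. Z.")))                  # regular spaces
print(bool(lenient.fullmatch("1 v. u. Z.")))                 # regular spaces
print(bool(lenient.fullmatch("1\u00a0v.\u00a0u.\u00a0Z.")))  # actual NBSP
```

In Python, \s in a str pattern also matches U+00A0. Other regex engines may need Unicode mode or an explicit NBSP alternative for that, so the exact pattern in the parser may have to differ.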

Screenshots/mockups:

Screenshot from 2021-09-16 12-21-26.png (139×921 px, 23 KB)

Screenshot from 2021-09-16 12-22-03.png (139×920 px, 24 KB)

BDD
GIVEN I am looking at an editable item
AND the user interface language is German
WHEN I add a time value
AND use the input “1 v. u. Z.”
THEN the input is accepted
AND I can save the statement

Acceptance criteria:

  • &nbsp; in messages matches actual NBSP in input
  • &nbsp; in messages matches regular space(s) in input
  • &nbsp; in messages matches literal &nbsp; in input? in case users are already used to it? (compare this fix for French, where we continue to support 9<sup>e</sup> siècle)
  • &nbsp; in messages matches empty string (1 v.u.Z. without spaces)?
  • (regular space) in messages matches any whitespace in input? (i.e. turn not just &nbsp; but also regular space into \s+, so that e.g. 1. century with two spaces can be parsed)
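
The criteria above, including the ones still marked with question marks, could be prototyped roughly as follows. This is only a sketch in Python, not a proposal for the actual PHP implementation: message_to_regex and its two flags are hypothetical names, "$1. century" is a made-up message text for illustration, and for simplicity the optional behaviours apply to &nbsp; and regular spaces alike.

```python
import re


def message_to_regex(
    message: str, *, allow_entity: bool = True, allow_empty: bool = True
) -> re.Pattern:
    """Build a whitespace-lenient regex from a message such as "$1&nbsp;v.&nbsp;u.&nbsp;Z.".

    allow_entity: separators also match the literal text "&nbsp;" in the input.
    allow_empty:  separators may be missing from the input entirely.
    """
    alternatives = [r"\s"]  # in Python str patterns, \s includes U+00A0
    if allow_entity:
        alternatives.append("&nbsp;")
    separator = "(?:" + "|".join(alternatives) + ")" + ("*" if allow_empty else "+")

    pieces = []
    # Split into placeholder, separators and literal text, keeping the delimiters.
    for token in re.split(r"(&nbsp;| +|\$1)", message):
        if token == "":
            continue
        if token == "$1":
            pieces.append(r"(\d+)")
        elif token == "&nbsp;" or token.strip(" ") == "":
            pieces.append(separator)
        else:
            pieces.append(re.escape(token))
    return re.compile("".join(pieces))


german = message_to_regex("$1&nbsp;v.&nbsp;u.&nbsp;Z.")
for candidate in (
    "1\u00a0v.\u00a0u.\u00a0Z.",   # actual NBSP
    "1 v. u. Z.",                  # regular spaces
    "1&nbsp;v.&nbsp;u.&nbsp;Z.",   # literal entity
    "1 v.u.Z.",                    # separators omitted
):
    print(repr(candidate), bool(german.fullmatch(candidate)))

# Hypothetical message with a regular space, input with two spaces.
print(bool(message_to_regex("$1. century").fullmatch("1.  century")))
```

Switching the two flags off leaves only the first, second and fifth criteria in effect, which makes it easy to compare behaviour while the open questions are undecided.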

Open questions:

  • (the ACs with question marks)