[Bug] wbparsevalue parses time to the wrong value
Open, Medium, Public

Description

https://www.wikidata.org/w/api.php?action=wbparsevalue&datatype=time&values=1994-02Z

gives

{
    "results": [
        {
            "raw": "1994-02Z",
            "value": {
                "time": "+1994-02-01T00:00:00Z",
                "timezone": 0,
                "before": 0,
                "after": 0,
                "precision": 11,
                "calendarmodel": "http://www.wikidata.org/entity/Q1985727"
            },
            "type": "time"
        }
    ]
}

Per the definition in mw:Wikibase/DataModel, the default values for month and day should be 00, not 01. So instead of "time": "+1994-02-01T00:00:00Z", the result should be "time": "+1994-02-00T00:00:00Z". The trailing Z appears to trigger this, because https://www.wikidata.org/w/api.php?action=wbparsevalue&datatype=time&values=1994-02 does give the right result.
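To compare the two inputs side by side, a small helper along these lines might be useful. It is only a sketch: the function names `wbparsevalue_url` and `day_is_unspecified` are made up for illustration, and it builds the request URL without sending anything.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def wbparsevalue_url(value):
    """Build the wbparsevalue request URL used in this report (hypothetical helper)."""
    query = {"action": "wbparsevalue", "datatype": "time", "values": value}
    return API + "?" + urlencode(query)


def day_is_unspecified(time_string):
    """Return True if the day field of a Wikibase time string is "00"."""
    date_part = time_string.split("T", 1)[0]      # e.g. "+1994-02-01"
    return date_part.rsplit("-", 1)[1] == "00"    # last field is the day


print(wbparsevalue_url("1994-02Z"))
print(day_is_unspecified("+1994-02-01T00:00:00Z"))  # value returned for "1994-02Z"
print(day_is_unspecified("+1994-02-00T00:00:00Z"))  # value this report expects
```

The first check is False for the value the API currently returns and the second is True for the value expected here.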

Also see T107870 and T123888

Event Timeline

Lydia_Pintscher renamed this task from wbparsevalue parses time to the wrong value to [Bug] wbparsevalue parses time to the wrong value. Apr 3 2016, 11:08 AM
Lydia_Pintscher triaged this task as High priority.
Lydia_Pintscher moved this task from incoming to ready to go on the Wikidata board.
Lydia_Pintscher added a subscriber: thiemowmde.
thiemowmde lowered the priority of this task from High to Medium. May 24 2017, 1:15 PM

The answer is simply that Wikibase does not know anything about this short "YYYY-MMZ" ISO format, and does not support it.

I have to ask: where does an edit containing such a format come from? Which software is used to make the edit? Where does the provided value come from? I assume it is not entered by a human.

This is what our chain of date parsers currently does:

  • We do have a YearMonthTimeParser. It succeeds in splitting the given string into "1994" and "02Z". It understands that the first number is the year, but it cannot make sense of the string "02Z". This is neither a number nor a known month name.
  • None of the other parsers we built can make sense of the string.
  • We fall back to PHP's built-in parser, which is where the result you see comes from.
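
The chain described above can be modelled roughly as follows. This is a toy sketch in Python, not the actual Wikibase PHP code: `year_month_parser` and `lenient_fallback_parser` are hypothetical stand-ins for the YearMonthTimeParser and the PHP-based fallback, and the regular expression is only an assumption chosen to reproduce the reported behaviour.

```python
import re


def year_month_parser(text):
    """Toy stand-in: accept "YYYY-MM" only when both parts are plain numbers."""
    parts = text.split("-")
    if len(parts) != 2:
        return None
    year, month = parts
    if not (year.isdigit() and month.isdigit()):
        return None  # "02Z" is not a number, so this parser gives up
    if not 1 <= int(month) <= 12:
        return None
    # Month precision (10): the day stays 00
    return f"+{int(year):04d}-{int(month):02d}-00T00:00:00Z", 10


def lenient_fallback_parser(text):
    """Toy stand-in for a permissive fallback that fills in a default day."""
    match = re.fullmatch(r"(\d{4})-(\d{2})Z?", text)
    if match is None:
        return None
    year, month = match.groups()
    # The missing day defaults to 01 and the result is reported with day precision (11)
    return f"+{year}-{month}-01T00:00:00Z", 11


def parse(text):
    """Try each parser in order and return the first result."""
    for parser in (year_month_parser, lenient_fallback_parser):
        result = parser(text)
        if result is not None:
            return result
    raise ValueError(f"no parser accepted {text!r}")


print(parse("1994-02"))   # handled by the year-month parser
print(parse("1994-02Z"))  # falls through to the lenient fallback
```

In this model the input without the "Z" never reaches the fallback, which is why only the "Z" variant ends up with day 01 and precision 11.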

Possible solutions:

  • We can teach the IsoTimestampParser we already have to understand the short "YYYY-MMZ" ISO format with a "Z" at the end.
  • We can add special case handling for the "Z" to the existing YearMonthTimeParser.
  • We can add a separate parser for this format.
  • We can make sure our fallback PhpDateTimeParser does not accept this format.
  • You can convert the format to YYYY-MM-00 before submitting.
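
For the last option, a client-side conversion could look something like this. It is a minimal sketch under the assumption that the input is exactly a year, a two-digit month, and a trailing "Z"; the function name `to_year_month_00` is made up for illustration, and anything that does not match is passed through unchanged.

```python
import re

# Year, two-digit month, trailing "Z" (assumed shape of the short ISO input)
_SHORT_ISO_MONTH = re.compile(r"([+-]?\d+)-(\d{2})Z")


def to_year_month_00(value):
    """Rewrite "YYYY-MMZ" as "YYYY-MM-00"; return other inputs unchanged."""
    match = _SHORT_ISO_MONTH.fullmatch(value.strip())
    if match is None:
        return value
    year, month = match.groups()
    if not 1 <= int(month) <= 12:
        return value  # not a valid month, leave it for the server to reject
    return f"{year}-{month}-00"


print(to_year_month_00("1994-02Z"))  # 1994-02-00
print(to_year_month_00("1994-02"))   # 1994-02
```

The converted string can then be passed as the values parameter in place of the original one.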