
WDQS should reject invalid datetime literals
Open, Low · Public · BUG REPORT

Description

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples/advanced includes 2 queries with invalid datetime literals, e.g. "2000-00-00T00:00:00Z"^^xsd:dateTime (month and day are both 00).

https://www.wikidata.org/wiki/Wikidata:Request_a_query/Archive/2020/02 includes a query with date literals, e.g. "2000-00-00"^^xsd:dateTime, that are doubly invalid (month and day are both 00, and the time part is missing).

GraphDB and other repositories reject such queries with an error such as "Query evaluation error: Invalid value 0 for Month field".
WDQS should do the same, both to discourage bad practices and to keep queries compatible with other repositories.
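
To make "invalid" concrete, here is a minimal sketch of the kind of check a store could apply to an xsd:dateTime lexical form before accepting it. It is an illustration in Python, not WDQS or Blazegraph code. The function name is_valid_xsd_datetime is hypothetical, and the check deliberately covers only a simplified subset of the XSD rules: four-digit positive years, no 24:00:00, no negative or longer years.

```python
import re
from datetime import datetime

# Simplified xsd:dateTime shape (assumption: four-digit years only).
# Optional fractional seconds; optional "Z" or an offset up to +/-14:00.
_XSD_DATETIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"(Z|[+-](?:0\d|1[0-3]):[0-5]\d|[+-]14:00)?"
)


def is_valid_xsd_datetime(lexical: str) -> bool:
    """Return True if `lexical` passes this simplified xsd:dateTime check."""
    match = _XSD_DATETIME.fullmatch(lexical)
    if match is None:
        # Wrong shape: missing time part, short year, unpadded fields, ...
        return False
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    try:
        # datetime() enforces calendar rules: month 1-12, real day of month.
        # It also rejects year 0000 and hour 24, which this sketch accepts
        # as a limitation.
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    return True


for literal in (
    "2000-01-01T00:00:00Z",
    "2000-00-00T00:00:00Z",
    "2000-00-00",
    "20-00-00T00:00:00Z",
    "2000-02-30T00:00:00Z",
):
    print(literal, is_valid_xsd_datetime(literal))
```

Under these assumptions only the first literal is accepted. The others fail on month and day 00, a missing time part, a two-digit year, and a nonexistent 30 February respectively.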

A few more tests:

select * {
  bind("2000-00-00T00:00:00Z"^^xsd:dateTime as ?x)
  bind(day("2000-00-00T00:00:00Z"^^xsd:dateTime) as ?x1)  # returns 1 ???
  bind("20-00-00T00:00:00Z"^^xsd:dateTime as ?y)  # accepted, but year must have at least 4 digits
  bind(year("20-00-00T00:00:00Z"^^xsd:dateTime) as ?y1) # returns 20
  bind(?x-?y as ?z) # 723180.0: divided by 365.24 gives 1980 years and a bit
  bind(datatype(?x-?y) as ?z1) # xsd:double
  #bind("20-00-00T00:00:00Z"^^xsd:dateTime - "20"^^xsd:gYear as ?t) # fails with Unknown error: 20-00-00T00:00:00Z
}

Event Timeline


This is probably a Blazegraph thing, not sure how much influence we have over it. But I disagree that it should be fixed – FILTER(?dateOfDeath >= "2020-01-01"^^xsd:dateTime) is much more convenient to write, and in my opinion also more legible, than FILTER(?dateOfDeath >= "2020-01-01T00:00:00"^^xsd:dateTime).

Gehel moved this task from Incoming to Blazegraph on the Wikidata-Query-Service board.

Use FILTER(?dateOfDeath >= "2020-01-01"^^xsd:date), which is just as convenient but is also a valid literal (note the datatype).

The trouble with accepting invalid input is that further computations on it then become invalid too. For example, how do you explain these results:

select * {
  bind("2000-02-30T00:00:00Z"^^xsd:dateTime as ?x) # returns 2000 ???
  bind(day("2000-02-30T00:00:00Z"^^xsd:dateTime) as ?x1)  # returns 2 ???
}

https://news.ycombinator.com/item?id=28283350 is a discussion between @Denny and someone else where he says "There is no difference between 7-7-2000 and 07-07-2000 in xsd".

I disagree because:

  • repositories should reject invalid literals (and "2000-7-7"^^xsd:date is invalid)
  • repositories don't normalize values when storing them, so if both forms are accepted, the two literals may compare as equal values but will not be identical terms
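
The "equal but not identical" point can be sketched in Python. This is an illustration only: lenient_date is a hypothetical helper that mimics a parser tolerating unpadded fields, and it is not how any particular repository parses xsd:date.

```python
from datetime import date


def lenient_date(lexical: str) -> date:
    """Parse year-month-day without requiring zero padding.

    Hypothetical lenient parser for illustration; a strict xsd:date
    parser would reject "2000-7-7" outright.
    """
    year, month, day = (int(part) for part in lexical.split("-"))
    return date(year, month, day)


a, b = "2000-7-7", "2000-07-07"

same_value = lenient_date(a) == lenient_date(b)  # compares parsed values
same_term = a == b                               # compares stored lexical forms

print(same_value, same_term)
```

If the store keeps the lexical form as written, a value comparison treats the two as the same date while a term-identity comparison does not, so results depend on which kind of comparison a query happens to use.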