Accept timezone as +00:00 for xsd:dateTime values
Closed, Duplicate · Public

Description

The Wikidata date parsing code assumes that an xsd:dateTime value must end in "Z", but the standard does not require this. Moreover, a timezone-aware UTC datetime formatted with Python's datetime.isoformat() always comes out in the 2017-08-23T18:34:16+00:00 form, without the "Z".

Note that this is only a warning and the date is still stored, but it produces a lot of noise in the logs. The following query shows no warnings if +00:00 is replaced with Z:

prefix osmroot: <https://www.openstreetmap.org>
INSERT {
  osmroot: schema:dateModified2 "2017-08-23T18:03:04+00:00"^^xsd:dateTime .
} WHERE {}
20:24:47.328 [com.bigdata.journal.Journal.executorService58564] ERROR c.b.r.internal.LexiconConfiguration IP:127.0.0.1 UA:Jetty/9.2.z-SNAPSHOT - Invalid date format:  2017-08-23T18:22:08+00:00: value=2017-08-23T18:22:08+00:00
20:24:47.328 [com.bigdata.journal.Journal.executorService58564] WARN  o.w.q.r.b.i.l.WikibaseDateExtension IP:127.0.0.1 UA:Jetty/9.2.z-SNAPSHOT - Couldn't create IV
java.lang.IllegalArgumentException: Invalid date format:  2017-08-23T18:22:41+00:00
        at org.wikidata.query.rdf.common.WikibaseDate.fromString(WikibaseDate.java:45) ~[common-0.2.5-SNAPSHOT.jar:na]
        at org.wikidata.query.rdf.blazegraph.inline.literal.WikibaseDateExtension.createDelegateIV(WikibaseDateExtension.java:76) ~[blazegraph-0.2.5-SNAPSHOT.jar:na]
        at org.wikidata.query.rdf.blazegraph.inline.literal.AbstractMultiTypeExtension.createIV(AbstractMultiTypeExtension.java:73) ~[blazegraph-0.2.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.internal.LexiconConfiguration.createExtensionIV(LexiconConfiguration.java:711) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.internal.LexiconConfiguration.createInlineLiteralIV(LexiconConfiguration.java:660) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.internal.LexiconConfiguration.createInlineIV(LexiconConfiguration.java:521) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.lexicon.LexiconRelation.getInlineIV(LexiconRelation.java:3355) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.lexicon.LexiconRelation.addTerms(LexiconRelation.java:1808) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.rio.StatementBuffer$Batch.addTerms(StatementBuffer.java:1951) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.rio.StatementBuffer$Batch.writeNow(StatementBuffer.java:1881) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.rio.StatementBuffer$Batch.access$1000(StatementBuffer.java:1645) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.rio.StatementBuffer$DrainQueueCallable.drainQueueAndMergeBatches(StatementBuffer.java:888) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.rio.StatementBuffer$DrainQueueCallable.call(StatementBuffer.java:827) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at com.bigdata.rdf.rio.StatementBuffer$DrainQueueCallable.call(StatementBuffer.java:795) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at java.util.concurrent.FutureTask.run(java.base@9-internal/FutureTask.java:266) [na:na]
        at com.bigdata.util.concurrent.LatchedExecutor$1.run(LatchedExecutor.java:121) [bigdata-core-2.1.5-SNAPSHOT.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@9-internal/ThreadPoolExecutor.java:1158) [na:na]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@9-internal/ThreadPoolExecutor.java:632) [na:na]
        at java.lang.Thread.run(java.base@9-internal/Thread.java:804) [na:na]
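
Until the parser accepts +00:00, a client-side workaround is to emit the "Z" suffix directly. The following is a minimal sketch, not part of WDQS: the helper name to_wdqs_datetime is hypothetical, and it assumes the input is a timezone-aware datetime and that sub-second precision can be dropped.

```python
from datetime import datetime, timezone


def to_wdqs_datetime(dt: datetime) -> str:
    """Format an aware datetime as UTC with a trailing 'Z'.

    Hypothetical helper for illustration; not part of WDQS or Wikidata.
    """
    if dt.tzinfo is None:
        # A naive datetime carries no offset, so we cannot know it is UTC.
        raise ValueError("expected a timezone-aware datetime")
    # Convert to UTC first, then write the literal 'Z' instead of '+00:00'.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


stamp = datetime(2017, 8, 23, 18, 34, 16, tzinfo=timezone.utc)
print(stamp.isoformat())        # 2017-08-23T18:34:16+00:00
print(to_wdqs_datetime(stamp))  # 2017-08-23T18:34:16Z
```

The second form can be used in the INSERT above in place of the +00:00 literal.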

Event Timeline

Restricted Application added a subscriber: Aklapper.

Could you please provide the query that produced that backtrace?

@Smalyshev I added the query to the description above.

Yes, currently WDQS only parses dates with the Z timezone. We could support other timezones, but it is not a big priority since Wikidata does not use timezones.

@Smalyshev we don't need to support timezones. The problem here is that some languages, such as Python, write UTC (Zulu) time as +00:00 instead of Z. Both forms are valid according to ISO 8601, so I think Wikidata should simply accept both as meaning the same thing. Per this post and the Wikipedia article.
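
To illustrate the equivalence claimed above, here is a small sketch using Python's own parser rather than the WDQS code. It assumes Python 3.7 or later, where the %z directive of strptime accepts both "Z" and a colon-separated offset; the name parse_utc is made up for this example.

```python
from datetime import datetime, timedelta

# %z accepts both 'Z' and '+00:00' on Python 3.7+ (assumption stated above).
FMT = "%Y-%m-%dT%H:%M:%S%z"


def parse_utc(text: str) -> datetime:
    """Parse a timestamp with a 'Z' or numeric offset suffix (illustrative only)."""
    return datetime.strptime(text, FMT)


with_z = parse_utc("2017-08-23T18:34:16Z")
with_offset = parse_utc("2017-08-23T18:34:16+00:00")

# Both spellings carry a zero offset and denote the same instant.
print(with_z.utcoffset() == timedelta(0))       # True
print(with_offset.utcoffset() == timedelta(0))  # True
print(with_z == with_offset)                    # True
```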

Smalyshev renamed this task from Possible incorrect datetime parsing in wdqs to Accept timezone as +00:00 for xsd:dateTime values. Aug 28 2017, 11:59 PM
Smalyshev triaged this task as Medium priority.
Smalyshev moved this task from Incoming to Blazegraph on the Wikidata-Query-Service board.