
Deal with precision in “range”, “difference within range” and "contemporary" constraints
Open, Medium, Public

Description

Problem:
The range, difference-within-range and contemporary constraints use RangeCheckerHelper to operate on time values and make their decisions. They produce false positives rooted in Wikibase's overly simplistic, incomplete or ambiguous handling of time values. Wikibase stores a value more precise than the precision set by the user: for example, the user types "1555" and sets a precision of decades, but Wikibase stores and returns the original value, "1555", whose last digit is arbitrary. As a result, identically rendered values behave differently. The arbitrariness of the stored value, the naivety of WikibaseQualityConstraints, or the inability of both to manage and communicate time ranges rather than exact time values affects the decisions of all three constraint types, whose behaviour is sometimes perceived as random or inconsistent.

Example:

BDD
GIVEN
AND
WHEN
AND
THEN
AND

Acceptance criteria:

  • The digits of a Wikibase time value that exceed the precision are irrelevant and do not affect any behaviour. Ideally, these digits are not stored.
  • A time value (range) violates a constraint if (and only if) each and every one of the more precise time values (ranges) contained in it does. If a time value (range) does not violate a constraint, no less precise value (range) containing it can do so.
  • Two consecutive time values (ranges) with the same precision intersect, as their minimum difference is 0: 1880-01-01T00:00:00.0000… - 1879-12-31T23:59:59.9999… = 0.
  • Two non-consecutive time values with the same precision never intersect, as their minimum difference is > 0.
  • In case of doubt or ambiguity, the most permissive decision should be made.
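The criteria above can be read as interval arithmetic. The following is a minimal sketch, not the RangeCheckerHelper implementation: it assumes years are plain integers, that a value with a given precision stands for a floor-aligned bucket of years (a simplification that ignores Wikibase's own century and BCE conventions), and that touching bounds count as equal, as in the 1880/1879 example. All names (`YearRange`, `to_range`, `min_difference`, `max_difference`, `intersects`, `violates_difference`) are hypothetical.

```python
from dataclasses import dataclass

# Wikibase precision codes: 7 = century, 8 = decade, 9 = year.
YEARS_PER_UNIT = {7: 100, 8: 10, 9: 1}


@dataclass(frozen=True)
class YearRange:
    start: int  # first year covered by the value
    end: int    # start of the next bucket; treated as a touching bound


def to_range(year: int, precision: int) -> YearRange:
    """Drop the digits beyond the precision and return the covered range.

    Assumption: buckets are floor-aligned (1555 at decade precision
    becomes 1550..1560). Floor division also handles negative years.
    """
    size = YEARS_PER_UNIT[precision]
    start = (year // size) * size
    return YearRange(start, start + size)


def min_difference(later: YearRange, earlier: YearRange) -> int:
    """Smallest value of (later - earlier) over all exact values."""
    return later.start - earlier.end


def max_difference(later: YearRange, earlier: YearRange) -> int:
    """Largest value of (later - earlier) over all exact values."""
    return later.end - earlier.start


def intersects(a: YearRange, b: YearRange) -> bool:
    """Touching ranges count as intersecting (minimum difference 0)."""
    return a.start <= b.end and b.start <= a.end


def violates_difference(later: YearRange, earlier: YearRange,
                        lower: int, upper: int) -> bool:
    """Violation only if every exact pair falls outside [lower, upper]."""
    return (max_difference(later, earlier) < lower
            or min_difference(later, earlier) > upper)
```

Under these assumptions, `to_range(1555, 8)` and `to_range(1550, 8)` are the same range, so the arbitrary last digit no longer influences any decision, and a less precise value can only widen the range and therefore never introduces a violation that a more precise value inside it does not have.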

Notes:

Open questions:

Event Timeline

thiemowmde added a project: patch-welcome.
Lydia_Pintscher raised the priority of this task from Low to Medium. Mar 1 2018, 11:36 AM
abian renamed this task from Deal with precision in “range” and “difference within range” constraints to Deal with precision in “range”, “difference within range” and "contemporary" constraints. Sep 21 2018, 2:41 PM

Example of a false positive (source): item A has end time 1999, item B has start time 20th century (timestamp 2000); ContemporaryChecker reports a false violation, claiming the two do not overlap.

abian updated the task description.
abian removed a subscriber: Jonas.

Another example: Marcus Fabius Dorsuo has date of birth -350 and date of death -400 (permalink), both with century precision.

$ curl -s https://www.wikidata.org/wiki/Special:EntityData/Q1225697.json | jq -r '.entities.Q1225697.claims | .P569, .P570 | .[0].mainsnak.datavalue.value.time'
-0350-00-00T00:00:00Z
-0400-01-01T00:00:00Z
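To illustrate why this should not be reported, here is a rough sketch that reduces both timestamps to century precision before comparing them. It assumes floor-aligned century buckets on the raw year number, which is a simplification and not necessarily how Wikibase defines centuries or BCE years; `parse_year`, `century_bucket` and `certainly_before` are hypothetical helpers.

```python
import re


def parse_year(timestamp: str) -> int:
    """Extract the signed year from a Wikibase-style timestamp string."""
    match = re.match(r"([+-]?\d+)-", timestamp)
    if match is None:
        raise ValueError(f"unrecognised timestamp: {timestamp!r}")
    return int(match.group(1))


def century_bucket(year: int) -> tuple[int, int]:
    """Return (start, end) of the floor-aligned century containing year."""
    start = (year // 100) * 100
    return (start, start + 100)


def certainly_before(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True only if every exact year in a precedes every exact year in b."""
    return a[1] < b[0]


birth = century_bucket(parse_year("-0350-00-00T00:00:00Z"))
death = century_bucket(parse_year("-0400-01-01T00:00:00Z"))

# Both values fall into the same bucket, so "death before birth"
# cannot be asserted and the permissive decision is "no violation".
death_precedes_birth = certainly_before(death, birth)
```

With these assumptions both dates collapse to the same century, so the stored years -350 and -400 carry no usable ordering information.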

If this requires too many resources, maybe a manual "exception" on these statements could work instead.