It is clear that something is currently severely wrong with the storage and display of time values. Let's collect what is actually stored where and when, as well as what is shown where and when. Then we can figure out whether this is correct and whether it is consistent.
|Open||None||T87764 Bugs related to time datatype (tracking)|
|Resolved||thiemowmde||T88437 figure out and fix current state of time storage and display|
Mentioned In
- T87312: [Bug] User interface displays false birth and death dates
Mentioned Here
- T93772: Rename TimeParser
- T92265: Remove wrong "Gregorian" display from diff views
- T87574: Wrong time parsing of ISO dates with day 00
- T66084: Time data-type inconsistently zero-pads year value (dates earlier than year 1000?)
- T87312: [Bug] User interface displays false birth and death dates
I see two problems, at the moment:
- The documentation in the TimeValue class is wrong. This is confusing but doesn't cause direct harm because this class doesn't do anything with this information. In fact, the interpretation of the timestamp and the calendar model should probably not be documented in this class at all.
- The diff view currently assumes the timestamp is always Gregorian (fixed in https://github.com/DataValues/Time/pull/32).
@thiemowmde, let's take this example: Galileo Galilei (Q307) was born on 15 February 1564 (Julian).
As we all know, the Gregorian calendar was introduced only in 1582.
Therefore, the date stated in virtually all sources will be 15 February 1564. This is the date that most people, including historians, will expect to see, since it wouldn't make much sense to use a calendar that was not yet in use anywhere at the time of the event.
Anyway, just for the sake of discussion, let's calculate what the date would have been if the Gregorian calendar had been in use since the beginning of time: it would be 25 February 1564 (Gregorian), because the two calendars were 10 days apart in the 16th century.
Now my question is: what should we users insert in Wikidata as the date when Galileo was born?
- 15 February 1564 (Julian)
- 25 February 1564 (Gregorian)
- none of the above?
What is currently displayed is: 25 February 1564 Julian... which doesn't really make sense to me :( Looks like the user wrote 25 February expecting that it would have been automatically converted to 15 February, but as you said this doesn't happen (and it's not intended to happen).
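For reference, the conversion discussed in this comment can be reproduced with the usual Julian Day Number arithmetic. This is only a minimal sketch and not code from DataValues/Time or Wikibase; the function names `julian_to_jdn` and `jdn_to_gregorian` are made up for illustration, and the year is an astronomical year number (so the formulas assume integer input and no precision handling).

```python
def julian_to_jdn(year, month, day):
    """Julian Day Number of a date given in the proleptic Julian calendar."""
    a = (14 - month) // 12          # 1 for January/February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March = 0 ... February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_gregorian(jdn):
    """Proleptic Gregorian (year, month, day) for a Julian Day Number."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097       # completed 400-year cycles
    c = a - 146097 * b // 4
    d = (4 * c + 3) // 1461         # completed years within the cycle
    e = c - 1461 * d // 4
    m = (5 * e + 2) // 153          # month index, March = 0
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


# The example from this comment: 15 February 1564 (Julian).
print(jdn_to_gregorian(julian_to_jdn(1564, 2, 15)))  # (1564, 2, 25)
```

The point of the sketch is only that the day shift is a pure calculation: the same instant has two labels, and the stored calendar model has to say which label the stored timestamp uses.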
I think all this confusion started because when you insert any date, the default calendar is always Gregorian. So if users don't know, don't care, or simply forget, they will probably take a Julian date from the source but insert it as Gregorian, because this is what the UI proposes as the default. And this is where the error starts... :(
So, supposing that we just have to insert dates as they are, without any conversion done by the software, I think that in order to avoid these user errors, the UI should automatically switch to Julian when the user enters a date before 15 October 1582. The date itself should not be altered or converted; only the calendar URI should change to Julian. It should always remain possible to switch between the calendars, but it seems reasonable to me that in 99% of pre-1582 cases what users try to insert will be a Julian date, so it would be much better to treat it as Julian by default.
Probably most dates up to 1582 will need to be reviewed :(
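The proposed default could look roughly like the sketch below. It is an illustration of the suggestion only, not existing UI code: `default_calendar` is a hypothetical helper, the two URIs are assumed to be the Wikidata items for the proleptic Gregorian and proleptic Julian calendar models, and treating a missing month or day as 1 is an arbitrary choice made here for the comparison.

```python
# Assumed calendar model URIs (Wikidata items for the two calendar models).
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"
JULIAN = "http://www.wikidata.org/entity/Q1985786"

# First day of the Gregorian calendar where it was adopted earliest.
GREGORIAN_START = (1582, 10, 15)


def default_calendar(year, month=0, day=0):
    """Suggest a default calendar model for the date the user typed.

    The entered date is never converted; only the suggested calendar URI
    changes. Month or day 0 (unknown) is treated as 1 for the comparison,
    which is an assumption of this sketch.
    """
    entered = (year, month or 1, day or 1)
    return JULIAN if entered < GREGORIAN_START else GREGORIAN


print(default_calendar(1564, 2, 15) == JULIAN)      # True
print(default_calendar(1582, 10, 15) == GREGORIAN)  # True
```

The user would still be able to override the suggestion, which matters for places that adopted the Gregorian calendar later than 1582.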
Documentation says dates are represented in conformance with ISO 8601, with a fixed 11 digits for the year. ISO 8601 requires that all dates be expressed in the Gregorian calendar. We have no format defined to store a Julian calendar date. Considering that the authors of ISO 8601 did a crappy job on some important points, I suspect the Wikidata community would do an even worse job if they tried to define a date format standard. (The current mess certainly supports my pessimism.)
In response to Candalua's comment, consider that the Julian calendar was used in Europe. Outside Europe, some countries converted to Gregorian directly from a calendar that was neither Gregorian nor Julian. The Julian-to-Gregorian switch occurred at many different times in the various European countries, and sources will usually use the calendar that was in force at the time and place an event occurred. Also consider that the number of events that can be dated with a precision of 1 day has been increasing rapidly since 1582. Thus, a great number of dates to be added to Wikidata appear as Julian dates in sources. I do not favor having the software behave differently before and after 15 October 1582, because the need for conversion must be assessed on a case-by-case basis.
I wish to point out this edit: https://www.wikidata.org/w/index.php?title=Q692&diff=193453963&oldid=193042089
It creates a timestamp +00000001564-00-00T00:00:00Z
But of course month 0 and day 0 are not valid ISO 8601 values. The proper ISO 8601 representation of a date with a precision of 1 year would, in this case, be +00000001564. The Z would be appropriate in the case of Shakespeare. Note, however, that the Z, which indicates that the time is UTC (Coordinated Universal Time), should in general be omitted, because normally the date is in the local time of the place where the birth or death occurred. UTC is usually not appropriate because it was not created until the early 1960s, and editors (or especially bots) will usually not be equipped or willing to figure out the time offset between Greenwich and the place of birth or death.
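To make the point about the zero fields concrete, here is a small sketch that takes the timestamp above apart. The regular expression assumes the layout quoted in this thread (a sign, exactly 11 year digits, then month, day, time and a trailing Z); the helper names are hypothetical and this is not the parser used by Wikibase.

```python
import re

# Layout assumed from the example in this thread: +YYYYYYYYYYY-MM-DDThh:mm:ssZ
TIMESTAMP_RE = re.compile(
    r"^(?P<sign>[+-])(?P<year>\d{11})-(?P<month>\d{2})-(?P<day>\d{2})"
    r"T(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})Z$"
)


def parse_timestamp(timestamp):
    """Split a stored timestamp into its sign and integer fields."""
    match = TIMESTAMP_RE.match(timestamp)
    if match is None:
        raise ValueError(f"unexpected timestamp layout: {timestamp!r}")
    fields = {name: int(value) for name, value in match.groupdict().items()
              if name != "sign"}
    fields["sign"] = match.group("sign")
    return fields


def has_zero_month_or_day(fields):
    """True if the timestamp uses month 0 or day 0, which ISO 8601 forbids."""
    return fields["month"] == 0 or fields["day"] == 0


def year_only_form(fields):
    """Year-precision representation: the sign followed by 11 year digits."""
    return f"{fields['sign']}{fields['year']:011d}"


fields = parse_timestamp("+00000001564-00-00T00:00:00Z")
print(has_zero_month_or_day(fields))  # True
print(year_only_form(fields))         # +00000001564
```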
For dates expressed in calendars other than Julian and Gregorian, I think we probably need to add those calendars to the list of options.
For the countries which kept using the Julian calendar for some time after 1582, I think we should insert both the Julian date and the Gregorian one, so that there can be no confusion.
Of course the correct calendar must be assessed on a case-by-case basis: my proposal for the default calendar simply tries to minimize the number of cases when the correct calendar is different from the default. It will not always provide the correct default, as this is impossible to know, but it would be an improvement from the current situation.
All of the formatters and parsers need to be moved to the DV repo anyway (see the todos at the top of the classes).
As so much change is happening, I expect the best path would be to first decide how to use DV/Time, i.e.:
- Are we going to start using before and after?
- Are we going to start allowing extra calendar models?
- How are we going to display years?
- Are we going to stop showing 0s for the time as well as for days and months?
And then write the new code into DV/Time before deploying it and removing the old code from WB/Lib.
I would add to Addshore's comment that ordinarily dates of death and birth should be considered local time; usually there will not be sufficient information to convert to a date that begins and ends at midnight universal time.
There are quite a few (understatement!) patches up for review that fix different issues with the current time implementation. They are (mostly) not tracked by separate tickets, which is why I put the list here as a todo list for reviewers. Kudos go to @thiemowmde!
All remaining changes here except https://github.com/DataValues/Time/pull/39 are tracked in separate tickets (see below), as far as I can see. So I propose to close this task as soon as https://github.com/DataValues/Time/pull/39 is merged.
https://github.com/DataValues/Time/pull/33 is tracked as T66084.
https://github.com/DataValues/Time/pull/32 was resubmitted as https://github.com/DataValues/Time/pull/49 and is tracked in T92265.
https://github.com/DataValues/Time/pull/27 is tracked in T87574.
https://gerrit.wikimedia.org/r/#/c/186791/ is tracked in T87574.