
BC (Before Christ) times are not correct in wikidata
Closed, Invalid (Public)

Description

I find that BC dates in Wikidata are often off by one year from the real year.

For example:
Rome was founded in 753 BC, but Wikidata says 752 BC.
Cleopatra died in 30 BC, but Wikidata says 29 BC.

There are correct ones, though. For example,
the birth date and death date of Virgil (Q1398) are correct.

I guess this has something to do with the Gregorian calendar used internally by Wikidata, in which year 0 should be 1 BC and year -1 should be 2 BC. Perhaps programmers didn't handle this very well when importing data.

Event Timeline

There is some discussion about this at: https://www.wikidata.org/w/index.php?title=Wikidata:Project_chat&oldid=361900747#458_BC

The current situation for the user seems to be: https://www.wikidata.org/w/index.php?title=Help:Dates&oldid=361752708#Years_BC with the oddity that entries with year 0 and year 1 BCE both refer to year 1 BC.

Some technical questions are discussed at T99674 and T94064. Not sure if these impact users any further.

There may be a few dates that need correction (as with any dates), but I don't think it's a general problem. Currently there are only about 8600 dates for years BC with a precision of 9 (year) or higher. Help checking these for errors is welcome.

Consider the item for Horace, Q6197. We know from Encyclopedia Britannica, among other sources, that his actual date of death was 27 November 8 BCE in the Julian calendar. The user interface displays the same, 27 November 8 BCE. The documentation for the RDF and JSON seems to indicate that they follow astronomical year numbering and the 2004 edition of ISO 8601, and that they acknowledge the existence of a year 0, so those formats should give the year as -7. If we use this URL:

https://www.wikidata.org/wiki/Special:EntityData/Q6197.json

to access the JSON (which I understand is regarded by developers as canonical) the result is "-0008-11-27T00:00:00Z". If the JSON really is canonical, then the user interface is storing the input incorrectly and displaying the stored value incorrectly.

If we look at the RDF using this URL:

https://www.wikidata.org/wiki/Special:EntityData/Q6197.rdf

the result is "-0007-11-25T00:00:00Z".

The RDF software has correctly converted 27 November of the Julian calendar to 25 November of the Gregorian calendar, just as its documentation says it will, but its year inexcusably contradicts the year in the JSON.

The situation is so bad that no documentation, no code, and no dates in the database for years < 1 can be believed. We need a definitive decision that is strongly pushed in the face of everyone who deals with dates.
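The day shift described above can be checked independently of Wikibase. The following is only a minimal sketch, assuming astronomical year numbering (1 BCE = year 0) and the widely used integer formulas for Julian Day Numbers; the function names are made up for illustration and are not part of any Wikibase code.

```python
def bce_to_astronomical(year_bce: int) -> int:
    """Convert a historical 'N BCE' year to astronomical numbering (1 BCE -> 0)."""
    if year_bce < 1:
        raise ValueError("year_bce must be 1 or greater")
    return 1 - year_bce


def julian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a Julian-calendar date (astronomical year)."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def jdn_to_gregorian(jdn: int) -> tuple[int, int, int]:
    """Proleptic Gregorian (year, month, day) for a Julian Day Number."""
    a = jdn + 32044
    b = (4 * a + 3) // 146097
    c = a - (146097 * b) // 4
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = 100 * b + d - 4800 + m // 10
    return year, month, day


def julian_to_gregorian(year: int, month: int, day: int) -> tuple[int, int, int]:
    """Convert a Julian-calendar date to the proleptic Gregorian calendar."""
    return jdn_to_gregorian(julian_to_jdn(year, month, day))


# Horace: 27 November 8 BCE (Julian) -> astronomical year -7
print(julian_to_gregorian(bce_to_astronomical(8), 11, 27))  # (-7, 11, 25)
```

Under these assumptions the result agrees with the RDF value "-0007-11-25", so the RDF output is self-consistent in astronomical numbering; the disagreement is only about how the JSON year should be read.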

thiemowmde triaged this task as Lowest priority.

I'm afraid this report is not actionable at the moment. What do "in Wikidata" and "Wikidata says" refer to? Where does it say what? I guess you are talking about the values you see in the QueryService, but I'm not sure. Please provide more information, especially exact steps to reproduce and where you see the values you mentioned, preferably with some weblinks we can try ourselves.

@thiemowmde "in Wikidata" and "Wikidata says" refer to the Wikidata webpage. For example, the Wikidata page for Cleopatra (https://www.wikidata.org/wiki/Q635) says she died in 29 BC while Wikipedia says she died in 30 BC.

The problem here is that the internal representation of BC years in Wikidata is inconsistent (also mentioned by @Jc3s5h). I think it is necessary to clarify which representation is right and which is wrong. Then we can focus on fixing the incorrect ones.

Ok, this is a user error then and nothing we can fix. Users of the API, bots, tools and so on must enter dates according to XSD 1.0, not XSD 1.1.

The internal representation is consistent and always was: -0001 in the database means 1 BCE, -0002 means 2 BCE and so on.
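To illustrate why the same stored year can be read two ways, here is a small sketch. It assumes only what is stated in this task: the database/JSON convention has no year zero (-0001 is 1 BCE, as in XSD 1.0), while astronomical numbering (as in XSD 1.1 and ISO 8601) treats 0000 as 1 BCE. The helper names are hypothetical and do not come from the Wikibase code base.

```python
import re

# Matches the start of timestamps such as "-0008-11-27T00:00:00Z".
_TIMESTAMP = re.compile(r"^([+-]?)(\d{4,})-(\d{2})-(\d{2})T")


def signed_year(timestamp: str) -> int:
    """Extract the signed year from a Wikibase-style timestamp string."""
    match = _TIMESTAMP.match(timestamp)
    if match is None:
        raise ValueError(f"unrecognised timestamp: {timestamp!r}")
    year = int(match.group(2))
    return -year if match.group(1) == "-" else year


def label_no_year_zero(year: int) -> str:
    """Read the year with no year zero: -1 is 1 BCE, -2 is 2 BCE."""
    if year == 0:
        raise ValueError("year 0 does not exist in this convention")
    return f"{-year} BCE" if year < 0 else f"{year} CE"


def label_astronomical(year: int) -> str:
    """Read the year astronomically: 0 is 1 BCE, -1 is 2 BCE."""
    return f"{1 - year} BCE" if year <= 0 else f"{year} CE"


json_value = "-0008-11-27T00:00:00Z"  # JSON for Q6197, quoted in this task
rdf_value = "-0007-11-25T00:00:00Z"   # RDF for Q6197, quoted in this task

print(label_no_year_zero(signed_year(json_value)))  # 8 BCE
print(label_astronomical(signed_year(rdf_value)))   # 8 BCE
print(label_astronomical(signed_year(json_value)))  # 9 BCE (wrong convention)
```

Read with its own convention, each format gives 8 BCE for Horace; the one-year error appears only when a JSON value is interpreted, or written by a tool, under the astronomical convention.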

The most recent documentation is https://www.mediawiki.org/wiki/Wikibase/DataModel/JSON#time. We updated this just recently to reflect the status quo of our code base (the code and internal representation did not change; the documentation was just not specific enough and left room for interpretation, which led to mistakes like the ones you reported here).

Could these be flagged as well if no other tag from T105100 applies? @Addshore is currently running a bot.

@thiemowmde Thanks for clarifying the representation of BC years. That is very helpful!