Wikidata wdtn:P214 values (VIAF) seem to be corrupt
Closed, Resolved. Public. Bug report.


Wikidata wdtn:P214 values for VIAF, as reported through WDQS, appear corrupt. See, for instance,

Issue surfaced at

To take one example: the wdt:P214 value for is 12148449524915690527, but WDQS reports the wdtn:P214 value as

Steps to Reproduce:

  • Query wdtn:P214 values compared to wdt:P214 values

Expected Results:

Actual Results:

Event Timeline

Assuming this is about Wikidata, hence adding project tag so someone can find this task.

Seems more likely it's a data input issue than a query issue @matej_suchanek, given that WDQS doesn't exhibit the same sort of error for analogous wdtn:, such as wdtn:P244 (Library of Congress authority ID) - see for instance

@Tagishsimon the WDQS tag is appropriate though, as it includes the pipeline for getting statements into the WDQS triplestore, and also any corruption issues happening there. Thanks for creating the ticket!

The LoC data does indeed seem to be clean - compare versus, checking 100,000 cases.

Smalyshev added a subscriber: Smalyshev.

I think I know what is the problem there... VIAF is stored as prefix+number, so there might be an overflow there. There's code that is supposed to deal with it, but maybe there's a bug in that code.


The current (above) format of the triples (when it works) seems fine from a Wikidata perspective, but from a LOD perspective, shouldn't the triple just be something like:


Smalyshev triaged this task as Medium priority. May 26 2019, 12:25 AM

@Esc3300 adding more triples is possible, but should be discussed in a separate task.

I don't think the wdtn triples are needed. I'd just add the "wikibase:identifier" ones.

Smalyshev added a subscriber: Igorkim78.

This looks like a Blazegraph URI handler bug: when the number fits in an unsigned long but not a signed long, InlineUnsignedIntegerURIHandler erroneously stores it as a small byte value, due to this:

		if (value < 256L) {
			return new XSDUnsignedByteIV((byte) (value + Byte.MIN_VALUE));

For 12148449524915690527 the signed long representation is less than zero (-6298294548793861089), thus the bug happens.
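To make the failure mode concrete, here is a minimal standalone sketch of the comparison problem. It is not the Blazegraph source and not the patch from Change 513244: the class and method names are hypothetical, and only the `value < 256L` condition is taken from the snippet quoted above. It shows why a signed comparison accepts the VIAF number, and two possible guards that still accept 0.

```java
public class UnsignedByteCheckSketch {

    // Mirrors the condition quoted above: a plain signed comparison.
    // Any value whose signed long form is negative passes this test.
    static boolean signedCheck(long value) {
        return value < 256L;
    }

    // One possible guard (an assumption, not the merged patch):
    // require a non-negative signed value, so exactly 0..255 qualify.
    static boolean guardedCheck(long value) {
        return value >= 0L && value < 256L;
    }

    // Alternative: compare the bit pattern as an unsigned quantity.
    static boolean unsignedCheck(long value) {
        return Long.compareUnsigned(value, 256L) < 0;
    }

    public static void main(String[] args) {
        // The VIAF ID from this task, parsed as an unsigned 64-bit number.
        long viaf = Long.parseUnsignedLong("12148449524915690527");

        System.out.println(viaf);                 // -6298294548793861089
        System.out.println(signedCheck(viaf));    // true  (wrongly treated as a byte)
        System.out.println(guardedCheck(viaf));   // false
        System.out.println(unsignedCheck(viaf));  // false

        // Zero is a legitimate small value and must still qualify.
        System.out.println(guardedCheck(0L));     // true
        System.out.println(unsignedCheck(0L));    // true
    }
}
```

Either guard keeps the byte-sized path for 0 through 255 while rejecting values that only look small because they wrapped around to a negative long.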

Change 513244 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[wikidata/query/blazegraph@master] Fix handling of numbers that convert to negative longs

Change 513244 merged by Smalyshev:
[wikidata/query/blazegraph@master] Fix handling of numbers that convert to negative longs

Smalyshev moved this task from Next to Done on the User-Smalyshev board.

Should be fixed now.

Hi @Smalyshev. You've closed this as resolved, but a query like is still returning corrupt data.
How long should it take for the old corrupt values to disappear?

Doh, my fault, I forgot 0 is also a number. Will fix.