
"Failed to dump Q12129 (Value must be at most 127 characters long.)" when dumping Wikidata as TTL
Closed, Resolved, Public

Description

Command line:

/usr/bin/php5 /srv/mediawiki/multiversion/MWScript.php extensions/Wikibase/repo/maintenance/dumpRdf.php --wiki wikidatawiki --shard 4 --sharding-factor 6 --batch-size 1500 --format ttl --flavor full-dump --no-cache

yields:

[failed-to-dump]: Failed to dump Q12129 (Value must be at most 127 characters long.)

This didn't show up in Monday's JSON dump, so this is probably related to one of the recent edits:
https://www.wikidata.org/w/index.php?title=Q12129&type=revision&diff=635245443&oldid=631529032

Event Timeline

thiemowmde triaged this task as Medium priority. Feb 21 2018, 6:32 PM
thiemowmde moved this task from incoming to needs discussion or investigation on the Wikidata board.
thiemowmde subscribed.

The message comes from the DecimalValue constructor, which is used in the QuantityValue constructor. This situation can happen when an edit is made via the API and a quantity is submitted as a floating point number instead of a string. DecimalValue contains code that converts floats to strings, but it does so in a way that can violate its own length limit. Basically: the float is converted to a string with 100 decimal places. If the part before the decimal point is longer than 27 characters, the result exceeds the 127-character limit and the conversion fails with said error message.
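To illustrate the failure mode, here is a minimal Python sketch of the behaviour described above. It is not the actual DataValues/Number code (which is PHP). The function name, the constants, and the choice to count a leading sign and the decimal point toward the limit are assumptions made for this example, so the exact cut-off may differ from the real library.

```python
# Hypothetical re-creation of the described float-to-string conversion.
# Names and details are assumptions, not the DataValues/Number API.

MAX_LENGTH = 127      # limit quoted in the error message
DECIMAL_PLACES = 100  # number of decimal places mentioned in the comment


def float_to_decimal_string(value: float) -> str:
    """Render a float with a sign and a fixed 100 decimal places."""
    text = f"{value:+.{DECIMAL_PLACES}f}"
    if len(text) > MAX_LENGTH:
        # Same message as in the dump log; raised here by the sketch itself.
        raise ValueError("Value must be at most 127 characters long.")
    return text


if __name__ == "__main__":
    # A small number fits easily: sign + 1 digit + point + 100 decimals.
    print(len(float_to_decimal_string(1.5)))

    # A large float has a long integer part, so the fixed 100 decimals
    # push the total length past the limit.
    try:
        float_to_decimal_string(1e30)
    except ValueError as error:
        print(error)
```

The point of the sketch is that the 100 decimal places are always emitted, so only a small remainder of the 127 characters is left for the integer part, and any sufficiently large float overflows it.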

This is very closely related to T155910: Erroneous digits in QuantityValue. I already prepared a fix in https://github.com/DataValues/Number/pull/115, which might solve both issues.

Change 425777 had a related patch set uploaded (by Hoo man; owner: Hoo man):
[mediawiki/extensions/Wikibase@master] Update data-values/number to 0.10.0

https://gerrit.wikimedia.org/r/425777

Change 425777 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Update data-values/number to 0.10.0

https://gerrit.wikimedia.org/r/425777

hoo claimed this task.
hoo removed a project: Patch-For-Review.

From next week on it should not be possible to add such invalid data anymore.