[Bug] Empty JSON maps are serialized as empty lists in XML dumps
Closed, ResolvedPublic

Description

The XML dumps of Wikidata contain many JSON serialization errors where empty maps {} are wrongly serialized as empty lists []. This affects many different fields, such as "claims", "aliases", and "descriptions", and it seems likely that all empty maps are serialized wrongly.

This is a major inconvenience to all users who for some reason or other prefer XML over the JSON dumps (which do not have the problem). In particular, the XML dumps are needed to get access to the full history of the pages. Using the same JSON as used in the JSON dumps would fix the problem.
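
Until the dumps are fixed, consumers of the XML dumps could work around the problem on their side. The following is only a minimal sketch, not the fix applied to the dump code: it assumes the entity JSON has already been extracted from the revision text, and the function name `normalize_empty_maps` is hypothetical. Of the fields listed, "claims", "aliases", and "descriptions" come from this report; "labels" and "sitelinks" are assumed to be maps as well.

```python
import json

# Top-level keys expected to hold JSON maps. "claims", "aliases" and
# "descriptions" are named in this report; "labels" and "sitelinks"
# are assumptions added for illustration.
MAP_FIELDS = ("claims", "aliases", "descriptions", "labels", "sitelinks")


def normalize_empty_maps(entity, map_fields=MAP_FIELDS):
    """Return a shallow copy of `entity` with [] replaced by {} in map fields."""
    fixed = dict(entity)
    for key in map_fields:
        # Only an empty list is rewritten; non-empty lists and real maps
        # are left untouched, and missing keys are not added.
        if fixed.get(key) == []:
            fixed[key] = {}
    return fixed


# Example input shaped like the broken serialization described above.
raw = (
    '{"id": "Q1", "claims": [], "aliases": [], '
    '"descriptions": {"en": {"language": "en", "value": "x"}}}'
)
entity = normalize_empty_maps(json.loads(raw))
print(json.dumps(entity, sort_keys=True))
```

This only handles top-level fields. If empty maps are also serialized wrongly in nested positions, as the report suspects, a consumer would need schema knowledge to tell a genuinely empty list from a mis-serialized empty map.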

Event Timeline

mkroetzsch raised the priority of this task from to Normal.
mkroetzsch updated the task description.
Restricted Application added a subscriber: Aklapper. · Feb 27 2015, 8:04 PM

Related to T73349 which was opened for the JSON dump.

Quoting @daniel from wikidata-l@lists

Actually... the XML dump should already be using the new code. The same code in
fact that generates the JSON files.
The problem is that old revisions get taken directly from old dumps, and do not
get re-serialized. I thought we had worked around this, but replicating the
elaborate setup WMF uses for generating dumps is hard, so testing is a pain...

Related problem in the mediawiki API output: T12887

Jonas renamed this task from Empty JSON maps serialized as empty lists in XML dumps to [Bug] Empty JSON maps sare erialized as empty lists in XML dumps. · Sep 10 2015, 1:20 PM
Jonas set Security to None.
JanZerebecki renamed this task from [Bug] Empty JSON maps sare erialized as empty lists in XML dumps to [Bug] Empty JSON maps are serialized as empty lists in XML dumps. · Sep 10 2015, 1:37 PM
hoo closed this task as Resolved. · Oct 5 2017, 3:08 PM
hoo assigned this task to JanZerebecki.
hoo added a subscriber: hoo.

Probably also fixed, along with T12887.