Remove type: statement from JSON serialization of statements
Open, Needs Triage, Public

Description

As a developer working with Wikidata, I want the entity data to not consume too much network traffic.
As someone hosting Wikibase instances, I want the entity data to not consume too much storage space.

Problem:
The JSON serialization of each statement contains a "type" field, which always has the value "statement", wasting 19 bytes per Statement (before compression). The field used to be necessary back when Wikibase distinguished between Claims and Statements, but it has been hard-coded to always contain "statement" since 2015 (Wikibase DataModel Serialization 1.4.0).
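The 19-byte figure can be checked with a small sketch. It assumes compact JSON with no whitespace between tokens; pretty-printed output would differ. The statement below is a trimmed-down placeholder, not real entity data.

```python
import json


def compact_size(statement: dict) -> int:
    """Size in bytes of the statement as compact (whitespace-free) UTF-8 JSON."""
    return len(json.dumps(statement, separators=(",", ":")).encode("utf-8"))


# Placeholder statement; mainsnak is left empty for brevity.
with_type = {
    "mainsnak": {},
    "type": "statement",
    "id": "Q12345$1d677e6e-4bad-e489-bda8-7a0bd8ec38c0",
    "rank": "normal",
}
without_type = {key: value for key, value in with_type.items() if key != "type"}

# The difference is exactly the length of `"type":"statement",`.
bytes_saved = compact_size(with_type) - compact_size(without_type)
print(bytes_saved)  # 19
```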

Example:

$ curl -s https://www.wikidata.org/w/api.php -d action=wbgetentities -d ids=Q12345 -d props=claims -d format=json -d formatversion=2 | jq '.entities.Q12345.claims.P1050[0]'
{
  "mainsnak": {
    // ...
  },
  "type": "statement",
  "id": "Q12345$1d677e6e-4bad-e489-bda8-7a0bd8ec38c0",
  "rank": "normal",
  "references": [
    // ...
  ]
}

Acceptance criteria:

  • API responses do not contain a "type" field for Statements (note: they must still contain a "type" field for data values!)
  • Stored contents of new revisions do not contain a "type" field for Statements (how to check)
    • Non-goal: we do not update stored content of existing revisions.
  • API requests that omit a type in statements succeed (but so do API requests that include it: it should be ignored completely)
NOTE: Like T217431, this is technically a breaking change that should be properly announced.
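To illustrate the first criterion, here is a minimal sketch of what the output side might look like. It is not Wikibase code: `strip_statement_type` is a hypothetical helper, and the entity below is made-up placeholder data shaped like the API response. Only the top-level "type" key of each statement is dropped; the "type" inside a data value is left alone.

```python
import copy


def strip_statement_type(entity: dict) -> dict:
    """Return a copy of `entity` whose statements lack the top-level "type" key.

    Hypothetical helper for illustration only.
    """
    result = copy.deepcopy(entity)
    # "claims" maps each property ID to a list of statements.
    for statements in result.get("claims", {}).values():
        for statement in statements:
            # Remove only the statement-level key; nested keys are untouched.
            statement.pop("type", None)
    return result


# Made-up placeholder entity, not real Wikidata content.
entity = {
    "claims": {
        "P1050": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P1050",
                    "datavalue": {"value": "placeholder", "type": "string"},
                },
                "type": "statement",
                "id": "Q12345$1d677e6e-4bad-e489-bda8-7a0bd8ec38c0",
                "rank": "normal",
            }
        ]
    }
}

stripped = strip_statement_type(entity)
statement = stripped["claims"]["P1050"][0]
print("type" in statement)                       # False
print(statement["mainsnak"]["datavalue"]["type"])  # string
```

Working on a deep copy keeps the original structure intact, which matches the non-goal above: existing data is not rewritten in place.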

Event Timeline

Even Pywikibot, which still calls Statements “Claims”, doesn’t read the type when deserializing a Statement (Claim.fromJSON), although it does emit a type when serializing it (Claim.toJSON).

I just discovered that in API requests (wbsetclaim, wbeditentity, etc.), the type is not, as I had assumed, optional – if you omit it, you’ll get some sort of error (though it seems to vary depending on situation – I’ve seen invalid-claim error responses as well as Deserializers\Exceptions\MissingTypeException uncaught exceptions). So an additional acceptance criterion should be that the type is no longer required there. (This part is not a breaking change, so it could be done sooner.)
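The difference between the current and the desired input handling can be sketched roughly as follows. This is an illustration under assumptions, not the actual deserializer: both functions and the `MissingTypeError` class are hypothetical names, and the real deserializer does considerably more than copy keys.

```python
# Hypothetical stand-in for Deserializers\Exceptions\MissingTypeException.
class MissingTypeError(ValueError):
    pass


def deserialize_lenient(serialization: dict) -> dict:
    """Desired behavior: ignore "type" completely, whether present or not."""
    return {key: value for key, value in serialization.items() if key != "type"}


def deserialize_strict(serialization: dict) -> dict:
    """Roughly the current behavior: a missing "type" is an error."""
    if "type" not in serialization:
        raise MissingTypeError('statement serialization has no "type" key')
    return deserialize_lenient(serialization)


# Placeholder request payloads, identical except for the "type" key.
with_type = {"mainsnak": {}, "type": "statement", "rank": "normal"}
without_type = {"mainsnak": {}, "rank": "normal"}

# The lenient variant treats both payloads the same.
print(deserialize_lenient(with_type) == deserialize_lenient(without_type))  # True
```

Because the lenient variant accepts every payload the strict one accepts, switching to it does not break existing clients that still send the type.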