
🌯️ Adjust statement data structure in Wikibase REST API responses and requests
Open, Needs Triage, Public, 13 Estimated Story Points

Description

As a Wikidata data reuser I want to get statement data in a simple format so that I can access the necessary data easily.
As a Wikidata data reuser I want to get statement data without redundant data so that I do not need to process data I do not need.

Responses containing statement data should use the intended structure.
Requests containing statement data (POST, PUT, also PATCH) should use the intended structure.

Changes:

  • mainsnak object is removed, its fields are raised one level in the hierarchy
  • value top-level field is added, consisting of two fields:
    • content (string or JSON object) - capturing the value of the statement (previously mainsnak.datavalue.value)
    • type (one of novalue, somevalue, value) - capturing the kind of value (previously mainsnak.snaktype)
    • content field is omitted if the statement is known to have no value (type: novalue) or if the value exists but is unknown (type: somevalue)
    • values of reference and qualifier objects ("snaks") are also represented using a value object
  • property top-level object is introduced, holding both the property id (as id) and the previous mainsnak.datatype (as data-type; note: NOT datatype)
  • mainsnak.hash field is removed
  • datavalue.type is removed
  • qualifiers is turned from a map of lists of snak objects into a list of snak objects
  • qualifiers-order is removed
  • references.snaks is renamed to references.parts and turned from a map of lists of snak objects into a list of snak objects
  • references.snaks-order is removed
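
The changes listed above can be sketched as a small conversion from the current "Action API" style statement JSON to the intended structure. This is only an illustration of the mapping, not the planned implementation: the function names (convert_snak, flatten_snaks, convert_statement) are hypothetical, the sketch assumes that every legacy snak carries a datatype field, and keeping the order given by qualifiers-order / snaks-order while flattening is an assumption (the task only says those fields are removed).

```python
from typing import Any, Optional


def convert_snak(snak: dict[str, Any]) -> dict[str, Any]:
    """Turn a legacy snak into a property/value pair (hypothetical helper)."""
    value: dict[str, Any] = {"type": snak["snaktype"]}
    if snak["snaktype"] == "value":
        # datavalue.type is dropped; only datavalue.value survives as content.
        value["content"] = snak["datavalue"]["value"]
    # The snak hash is intentionally not copied over.
    return {
        "property": {"id": snak["property"], "data-type": snak["datatype"]},
        "value": value,
    }


def flatten_snaks(
    snaks_by_property: dict[str, list[dict[str, Any]]],
    order: Optional[list[str]] = None,
) -> list[dict[str, Any]]:
    """Flatten a map of snak lists into one list of property/value pairs."""
    # Assumption: honor the legacy *-order field when it is present.
    property_ids = order if order is not None else list(snaks_by_property)
    return [
        convert_snak(snak)
        for property_id in property_ids
        for snak in snaks_by_property[property_id]
    ]


def convert_statement(legacy: dict[str, Any]) -> dict[str, Any]:
    """Convert one legacy statement into the intended structure."""
    # The mainsnak fields are raised one level: start from the converted pair.
    statement = convert_snak(legacy["mainsnak"])
    statement["id"] = legacy["id"]
    statement["rank"] = legacy["rank"]
    statement["qualifiers"] = flatten_snaks(
        legacy.get("qualifiers", {}), legacy.get("qualifiers-order")
    )
    statement["references"] = [
        {
            "hash": reference["hash"],
            # references.snaks becomes references.parts
            "parts": flatten_snaks(
                reference.get("snaks", {}), reference.get("snaks-order")
            ),
        }
        for reference in legacy.get("references", [])
    ]
    return statement
```

Calling convert_statement on a legacy statement whose mainsnak has snaktype novalue yields a value object with only the type field, matching the rule that content is omitted in that case.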

Not included in this story (will be addressed separately)

  • remove altitude from globe-coordinate values
  • remove timezone, before, after from time values

Intended structure/representation (pseudo JSON - might contain mistakes/typos; ask if you spot something unexpected)

{
  "P92": [
    {
      "property": {
        "id": "P92",
        "data-type": "string"
      },
      "value": {
        "content": "TEXT",
        "type": "value"
      },
      "id": "Q11$6403c562-401a-2b26-85cc-8327801145e1",
      "rank": "normal",
      "references": [
        {
          "hash": "b25ff4bd5398ca646c621e114e4498e2bd608fd4",
          "parts": [
            {
              "property": {
                "id": "P711",
                "data-type": "string"
              },
              "value": {
                "content": "My message to video game databases: We(kidata) come in peace",
                "type": "value"
              }
            }
          ]
        }
      ],
      "qualifiers": [
        {
          "property": {
            "id": "P92",
            "data-type": "string"
          },
          "value": {
            "content": "TEXT",
            "type": "value"
          }
        }
      ]
    }
  ],
  "P694": [
    {
      "property": {
        "id": "P694",
        "data-type": "wikibase-item"
      },
      "value": {
        "content": {
          "id": "Q123",
          "entity-type": "item",
          "numeric-id": 123
        },
        "type": "value"
      },
      "id": "Q11$6403c562-401a-2b26-85cc-8327801145e1",
      "rank": "normal",
      "references": [],
      "qualifiers": []
    }
  ],
  "P476": [
    {
      "property": {
        "id": "P476",
        "data-type": "time"
      },
      "value": {
        "content": {
          "time": "+2021-09-17T00:00:00Z",
          "precision": "11",
          "calendar-model": "http://www.wikidata.org/entity/Q1985727"
        },
        "type": "value"
      },
      "id": "Q11$350e511c-48f9-caaa-72db-2ec8822f4432",
      "rank": "normal",
      "references": [],
      "qualifiers": []
    }
  ],
  "P937": [
    {
      "property": {
        "id": "P937",
        "data-type": "quantity"
      },
      "value": {
        "content": {
          "amount": "+14.23",
          "unit": "1",
          "lowerBound": "+4.23",
          "upperBound": "+24.23"
        },
        "type": "value"
      },
      "id": "Q11$350e511c-48f9-caaa-72db-2ec8822f4432",
      "rank": "normal",
      "references": [],
      "qualifiers": []
    }
  ],
  "P5": [
    {
      "property": {
        "id": "P5",
        "data-type": "globe-coordinate"
      },
      "value": {
        "content": {
          "latitude": "52.52",
          "longitude": "13.405",
          "precision": "0.001",
          "globe": "http://www.wikidata.org/entity/Q2"
        },
        "type": "value"
      },
      "id": "Q11$350e511c-48f9-caaa-72db-2ec8822f4432",
      "rank": "normal",
      "references": [],
      "qualifiers": []
    }
  ],
  "P123": [
    {
      "property": {
        "id": "P123",
        "data-type": "string"
      },
      "value": {
        "type": "novalue"
      },
      "id": "Q11$6403c562-401a-2b26-85cc-8327801145e1",
      "rank": "normal",
      "references": [],
      "qualifiers": []
    }
  ],
  "P124": [
    {
      "property": {
        "id": "P124",
        "data-type": "string"
      },
      "value": {
        "type": "somevalue"
      },
      "id": "Q11$6403c562-401a-2b26-85cc-8327801145e7",
      "rank": "normal",
      "references": [],
      "qualifiers": []
    }
  ]
}

Further differences between the intended structure and the current structure inherited from the "Action API" etc. are documented for the WMDE team in the internal document: https://docs.google.com/spreadsheets/d/1yAxIaUodJNvsY_eWRvBTySMF4ytkfbu5hGEkNrBnvto/edit#gid=1820649401

Acceptance criteria:

  • Responses to GET, POST, PUT and PATCH requests that contain statement data use the intended structure
  • PUT and POST requests use the intended structure when validating/processing the input
  • PATCH requests assume the intended structure when applying patch operations
  • Differences between the intended structure and the statement data representation in "Action API" responses/requests have been documented in the internal documentation

Event Timeline

Silvan_WMDE set the point value for this task to 13.

Notes from task breakdown:
Schema changes: - @Jakob_WMDE

  • create entries for the new format in repo/rest-api/specs/global/schemas.json

Create Serializers: - @Jakob_WMDE

  • top level statement serialization
    • creates id and rank field
    • the result of PropertyValuePairSerializer needs to be merged into the top-level structure
    • for qualifiers: calls PropertyValuePairSerializer and turns the result into a list
    • for references: calls ReferenceSerializer and turns the result into a list
  • PropertyValuePairSerializer
    • creates "value" and "property" fields
    • needs to look up the property's data type to fill the property.data-type field
  • ReferenceSerializer uses PropertyValuePairSerializer
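
The serializer breakdown above could be composed roughly as follows. This is a sketch only: the real serializers would live in the Wikibase PHP code base and work on its data model objects, whereas the dataclasses here (Snak, Reference, Statement), the constructor signatures, and the lookup_data_type callable are hypothetical stand-ins chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Snak:  # hypothetical stand-in for a data model snak
    property_id: str
    snak_type: str  # "value", "somevalue" or "novalue"
    value: Optional[Any] = None


@dataclass
class Reference:  # hypothetical stand-in
    hash: str
    snaks: list[Snak]


@dataclass
class Statement:  # hypothetical stand-in
    id: str
    rank: str
    main_snak: Snak
    qualifiers: list[Snak] = field(default_factory=list)
    references: list[Reference] = field(default_factory=list)


class PropertyValuePairSerializer:
    def __init__(self, lookup_data_type: Callable[[str], str]) -> None:
        # The data type is not stored on the snak, so it has to be looked up.
        self._lookup_data_type = lookup_data_type

    def serialize(self, snak: Snak) -> dict[str, Any]:
        value: dict[str, Any] = {"type": snak.snak_type}
        if snak.snak_type == "value":
            value["content"] = snak.value
        return {
            "property": {
                "id": snak.property_id,
                "data-type": self._lookup_data_type(snak.property_id),
            },
            "value": value,
        }


class ReferenceSerializer:
    def __init__(self, pairs: PropertyValuePairSerializer) -> None:
        self._pairs = pairs

    def serialize(self, reference: Reference) -> dict[str, Any]:
        return {
            "hash": reference.hash,
            "parts": [self._pairs.serialize(snak) for snak in reference.snaks],
        }


class StatementSerializer:
    def __init__(
        self, pairs: PropertyValuePairSerializer, references: ReferenceSerializer
    ) -> None:
        self._pairs = pairs
        self._references = references

    def serialize(self, statement: Statement) -> dict[str, Any]:
        return {
            # Merge the main property/value pair into the top-level structure.
            **self._pairs.serialize(statement.main_snak),
            "id": statement.id,
            "rank": statement.rank,
            "qualifiers": [self._pairs.serialize(q) for q in statement.qualifiers],
            "references": [
                self._references.serialize(r) for r in statement.references
            ],
        }
```

Passing the data type lookup in as a callable keeps the pair serializer free of any storage dependency, which also makes it easy to substitute a plain dictionary lookup in tests.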

Create Deserializers: - @Silvan_WMDE

  • PropertyValuePairDeserializer
    • delegates to value deserializer if value.type is value
    • creates the snak object
  • value content deserializer (to be used for defined values, i.e. not novalue or somevalue)
    • gets the serialization including the property id
    • looks up the property's data type based on the id
    • maps the data type to the value type to figure out how to deserialize the value (mapping via WikibaseRepo.DataTypeDefinitions or something similar)
  • top level statement deserializer
    • creates statement from id and rank
    • delegates to inner deserializers for the other fields
  • ReferenceListDeserializer
    • has a ReferenceDeserializer
  • qualifiers deserializer (probably just iterates over a list and calls PropertyValuePairDeserializer)
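
A rough sketch of how the deserializers described above might fit together. It produces plain dictionaries shaped like legacy snaks rather than Wikibase data model objects, and the two lookup tables are hypothetical stand-ins: PROPERTY_DATA_TYPES replaces the real property data type lookup, and VALUE_TYPE_BY_DATA_TYPE replaces the mapping that would come from WikibaseRepo.DataTypeDefinitions or something similar. The sketch never reads data-type from the input, on the assumption that it is derived information.

```python
from typing import Any

# Hypothetical stand-in for looking up a property's data type by id.
PROPERTY_DATA_TYPES = {
    "P92": "string",
    "P694": "wikibase-item",
    "P476": "time",
}

# Illustrative data type -> value type mapping (assumed, not exhaustive).
VALUE_TYPE_BY_DATA_TYPE = {
    "string": "string",
    "wikibase-item": "wikibase-entityid",
    "time": "time",
    "quantity": "quantity",
    "globe-coordinate": "globecoordinate",
}


def deserialize_value_content(property_id: str, content: Any) -> dict[str, Any]:
    """Build a data value for a defined value (type "value" only)."""
    data_type = PROPERTY_DATA_TYPES[property_id]
    value_type = VALUE_TYPE_BY_DATA_TYPE[data_type]
    return {"value": content, "type": value_type}


def deserialize_property_value_pair(pair: dict[str, Any]) -> dict[str, Any]:
    """Create a snak-like dict from a property/value pair."""
    property_id = pair["property"]["id"]
    value_type = pair["value"]["type"]
    snak: dict[str, Any] = {"snaktype": value_type, "property": property_id}
    if value_type == "value":
        # Only defined values are delegated to the value content deserializer.
        snak["datavalue"] = deserialize_value_content(
            property_id, pair["value"]["content"]
        )
    return snak


def deserialize_statement(serialization: dict[str, Any]) -> dict[str, Any]:
    """Create the statement from id and rank, delegating the other fields."""
    return {
        "id": serialization["id"],
        "rank": serialization["rank"],
        # The top-level property/value fields form the main snak.
        "mainsnak": deserialize_property_value_pair(serialization),
        "qualifiers": [
            deserialize_property_value_pair(qualifier)
            for qualifier in serialization.get("qualifiers", [])
        ],
        "references": [
            {
                "snaks": [
                    deserialize_property_value_pair(part)
                    for part in reference["parts"]
                ]
            }
            for reference in serialization.get("references", [])
        ],
    }
```

A novalue or somevalue pair never reaches deserialize_value_content, so no data type lookup is needed for it in this sketch.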

Use Serializers: - @Silvan_WMDE

  • plug them in
  • fix tests
  • use new schema

Use Deserializers: - @Ollie.Shotton_WMDE

  • plug them in
  • fix tests
  • use new schema

Cleanup: - @Ollie.Shotton_WMDE

  • remove the now unused schema parts
  • remove unused classes if any

Questions:

  • Is it correct to assume that the data-type field is not required for requests as it is derived information?

Jakob_WMDE renamed this task from Adjust statement data structure in Wikibase REST API responses and requests to 🌯️ Adjust statement data structure in Wikibase REST API responses and requests. Tue, Nov 8, 3:45 PM

@WMDE-leszek We spotted a difference in the description compared to what the existing item id value serialization looks like.

The example says:

{
  "property": {
    "id": "P694",
    "data-type": "wikibase-item"
  },
  "value": {
    "content": "Q123",
    "type": "value"
  }
}

Note that "content" is mapped to "Q123" directly as a string, whereas the Action API (e.g. wbgetentities) serializes the value as an object:

"datavalue": {
    "value": {
        "entity-type": "item",
        "numeric-id": 5,
        "id": "Q5"
    },
    "type": "wikibase-entityid"
},

Agreed, this is better handled separately. It is now included in T322734.

I think all looks good, thank you!