🔮Labels of linked entities on items in Wikibase GraphQL API
Closed, Resolved | Public | 21 Estimated Story Points

Description

We want to build a GraphQL endpoint that allows users to query for labels of linked entities of items. This offers an alternative to WDQS or to making multiple calls through the Action API or REST API.

Acceptance criteria:

  • Available when reading an item
  • Response includes all the elements of the requested entity (labels, descriptions, aliases, sitelinks, etc.)
  • Ensure that only the labels of linked entities (statement properties, statement value items, statement qualifiers, and statement references) can be requested, and nothing else (for example, their descriptions or statements)
  • Pagination is not needed

Schema in SDL:

type Query {
  item(id: String!): Item
}

type Item {
  id: String!
  label(languageCode: String!): String
  description(languageCode: String!): String
  aliases(languageCode: String!): [String]!
  sitelink(siteId: String!): Sitelink
  statements(propertyId: String!): [Statement]!
}

type Sitelink {
  title: String!
  url: String!
}

type Statement {
  id: String!
  property: PredicateProperty!
  value: Value!
  rank: Rank!
  qualifiers(propertyId: String!): [PropertyValuePair]!
  references: [Reference]!
}

type PropertyValuePair {
  property: PredicateProperty!
  value: Value!
}

type Reference {
  parts: [PropertyValuePair]!
}

type PredicateProperty {
  id: String!
  data_type: String # ideally enum but TBD
}

enum Rank {
  PREFERRED
  NORMAL
  DEPRECATED
}

type Value {
  # ...
}

  • Multiple instances of one field can be added to the same request; we want to figure out whether there is an upper bound here

Task breakdown:

General considerations:

  • create a reuse/ domain
  • create a BatchGetItems use case and use it in the item resolver instead of the GetItem use case
    • we will need a batch use case anyway for the upcoming batch item story and because GraphQL already allows requesting multiple items at once by using aliases
    • creating this as a reuse use case means that we stick to the control flow rule of REST ADR 1 - "Inputs and outputs of the system must flow from the user side through the business logic to the data side and back. The code on the outermost level must not skip the business logic."
  • create the following two reuse use cases:
    • BatchGetItemLabels
      • takes a list of item ids and a list of label language codes to fetch
    • BatchGetPropertyLabels
      • takes a list of property ids and a list of label language codes to fetch
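A minimal sketch of the shape such a batch use case could take, written in Python for brevity (the actual use cases would live in the Wikibase code base). Only the inputs, a list of item ids and a list of label language codes, come from the description above. The `LabelsRetriever` interface, the `InMemoryLabelsRetriever` stand-in, and the return structure are hypothetical.

```python
from typing import Optional, Protocol


class LabelsRetriever(Protocol):
    """Hypothetical data-access interface; not an existing Wikibase service."""

    def get_labels(
        self, item_ids: list[str], language_codes: list[str]
    ) -> dict[str, dict[str, str]]: ...


class InMemoryLabelsRetriever:
    """Stand-in data source used only for this illustration."""

    def __init__(self, labels: dict[str, dict[str, str]]) -> None:
        self._labels = labels

    def get_labels(self, item_ids, language_codes):
        return {
            item_id: {
                lang: text
                for lang, text in self._labels.get(item_id, {}).items()
                if lang in language_codes
            }
            for item_id in item_ids
        }


class BatchGetItemLabels:
    """Fetches labels for many items in one call instead of one call per item."""

    def __init__(self, retriever: LabelsRetriever) -> None:
        self._retriever = retriever

    def execute(
        self, item_ids: list[str], language_codes: list[str]
    ) -> dict[str, dict[str, Optional[str]]]:
        unique_ids = list(dict.fromkeys(item_ids))  # de-duplicate, keep order
        found = self._retriever.get_labels(unique_ids, language_codes)
        # Every requested (item, language) pair gets an entry; missing labels are None.
        return {
            item_id: {
                lang: found.get(item_id, {}).get(lang) for lang in language_codes
            }
            for item_id in unique_ids
        }
```

A `BatchGetPropertyLabels` use case could follow the same pattern with property ids. Keeping the data access behind an interface is one way to respect the control flow rule quoted above, since the resolver would only ever talk to the use case.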

Sub-tasks:

  • create the initial GraphQL server
    • delete GraphQL prototype code
    • on a special page
    • fetches all item data excluding statements (labels, descriptions, aliases, sitelinks)
    • handles invalid item id
  • add statements field to item data
    • include qualifiers and references
    • excluding value types other than wikibase-entityid and string
    • excluding labels of linked entities
  • create BatchGetItemLabels use case
  • create BatchGetPropertyLabels use case
  • allow fetching labels of linked properties and items
  • support additional wikibase core data types (those defined within wikibase repo itself): globecoordinate, monolingualtext, ...
  • handle unknown value types
    • ensure that the code doesn't error when encountering data types or value types it doesn't know by either filtering those statements out or displaying them as unknown values
  • handle value types added by extensions(?)
  • limit the number of item fields used at the root level to 5
    • error message TBD
  • introduce a feature flag for the special page (opt-in)
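Two of the sub-tasks above, the root-level limit and the handling of unknown value types, can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the planned implementation: the function names, the representation of a parsed query as a list of root field names, the `{"type": "unknown"}` placeholder, and the error wording (which the task leaves TBD) are all hypothetical.

```python
MAX_ROOT_ITEM_FIELDS = 5  # limit named in the sub-task list

# The two value types the sub-tasks plan to support first.
KNOWN_VALUE_TYPES = {"wikibase-entityid", "string"}


def check_root_item_fields(root_field_names: list[str]) -> None:
    """Reject a query that uses the `item` field too often at the root level.

    `root_field_names` holds the underlying field name of each root selection,
    so an aliased selection such as `a: item(...)` counts as `item`.
    """
    count = sum(1 for name in root_field_names if name == "item")
    if count > MAX_ROOT_ITEM_FIELDS:
        # Placeholder wording: the real error message is still TBD.
        raise ValueError(
            f"Too many item fields: {count} (maximum {MAX_ROOT_ITEM_FIELDS})"
        )


def present_value(value: dict) -> dict:
    """Pass known value types through; show anything else as an unknown value."""
    if value.get("type") in KNOWN_VALUE_TYPES:
        return value
    # Hypothetical placeholder shape for values the code does not understand.
    return {"type": "unknown", "content": None}
```

The second function shows the "display as unknown values" variant; the alternative mentioned above, filtering such statements out, would drop them from the list instead.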

Event Timeline

WMDE-leszek renamed this task from Labels of linked entities in Wikibase GraphQL to Labels of linked items in Wikibase GraphQL.
WMDE-leszek renamed this task from Labels of linked items in Wikibase GraphQL to Labels of linked entities on items in Wikibase GraphQL API.
Dima_Koushha_WMDE renamed this task from Labels of linked entities on items in Wikibase GraphQL API to 🔮Labels of linked entities on items in Wikibase GraphQL API. (Sep 17 2025, 10:12 AM)

While working on the prototype we realized that it isn't obvious what the best interface for some of the item fields would be. Labels are particularly tricky in that regard, so I'll list some options below.

Language codes as keys

This is what we did in the prototype we created a few weeks ago.

Schema:

type Item {
  labels: Labels!
  # ...
}

type Labels {
  de: String
  en: String
  # ... hundreds of other language codes
}

Example request:

{
  item(id: "Q71") {
    labels { en }
  }
}

Example response:

{
	"data": {
		"item": {
			"labels": {
				"en": "potato"
			}
		}
	}
}

Pro:

  • language codes being part of the schema means available languages get auto-completed
  • validation of language codes happens automatically on the schema level

Con:

  • not all language codes are valid GraphQL field names because some contain dashes, which means we'd have to change them, e.g. to underscores (en-gb -> en_gb)
  • bloats the schema with hundreds of language codes repeated for every similar field (descriptions, aliases)
  • schema depends on the wiki's language config
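To illustrate the first con, here is a small Python sketch of the renaming that would be needed. The function name is hypothetical; the pattern reflects the rule that GraphQL names may only contain letters, digits, and underscores and must not start with a digit.

```python
import re

# GraphQL names: a letter or underscore, followed by letters, digits, or underscores.
GRAPHQL_NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def language_code_to_field_name(language_code: str) -> str:
    """Turn a language code such as 'en-gb' into a usable field name."""
    field_name = language_code.replace("-", "_")
    if not GRAPHQL_NAME.fullmatch(field_name):
        raise ValueError(f"Cannot derive a field name from {language_code!r}")
    return field_name
```

Clients would then see `en_gb` in the schema while every other API uses `en-gb`, which is part of why this option involves "messing with language codes".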

Labels as a list

This is what we did in the 2020 prototype.

Schema:

type Item {
  labels(languages: [String]): [Label]
  # ...
}

type Label {
  language: String!
  value: String!
}

Example request:

{
  item(id: "Q71") {
    labels(languages: ["en", "fr"]) {
      language
      value
    }
  }
}

Example response:

{
	"data": {
		"item": {
			"labels": [
				{ "language": "en", "value": "potato" },
				{ "language": "fr", "value": "pomme de terre" }
			]
		}
	}
}

Pro:

  • simple schema
  • easy to request multiple languages

Con:

  • unclear how to best handle missing labels (null "value" vs not including them in the list)
  • clients have to do additional work to filter the label(s) they want out of the list instead of directly accessing it
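The second con can be made concrete with a sketch of the extra step a client would need. This is Python for illustration only; the helper name is hypothetical and the input mirrors the example response above.

```python
from typing import Optional


def labels_by_language(labels: list[dict]) -> dict[str, Optional[str]]:
    """Index a list of {language, value} entries by language code."""
    return {entry["language"]: entry.get("value") for entry in labels}


response = {
    "data": {
        "item": {
            "labels": [
                {"language": "en", "value": "potato"},
                {"language": "fr", "value": "pomme de terre"},
            ]
        }
    }
}

indexed = labels_by_language(response["data"]["item"]["labels"])
french_label = indexed.get("fr")  # None if the language is absent from the list
```

Note that `indexed.get("fr")` cannot distinguish an entry with a null value from an entry that was left out, which is the ambiguity named in the first con.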

A single label field

A compromise between the two above.

Schema:

type Item {
  label(language: String!): String
  # ...
}

Example request:

{
  item(id: "Q71") {
    enLabel: label(language: "en")
    frLabel: label(language: "fr")
  }
}

Example response:

{
	"data": {
		"item": {
			"enLabel": "potato",
			"frLabel": "pomme de terre"
		}
	}
}

Pro:

  • simple schema
  • easy to deal with responses for the client

Con:

  • the query requires aliases to request multiple labels
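The aliasing this option requires can be generated on the client side. The following Python sketch builds a query like the example request above; the function name and the `<code>Label` alias scheme are assumptions, and `json.dumps` is used for quoting on the assumption that ids and language codes are plain ASCII strings.

```python
import json


def build_label_query(item_id: str, language_codes: list[str]) -> str:
    """Build a query requesting one aliased `label` field per language."""
    selections = []
    for code in language_codes:
        # Aliases must be valid GraphQL names, so dashes become underscores.
        alias = code.replace("-", "_") + "Label"
        selections.append(f"    {alias}: label(language: {json.dumps(code)})")
    body = "\n".join(selections)
    return f"{{\n  item(id: {json.dumps(item_id)}) {{\n{body}\n  }}\n}}"
```

Calling `build_label_query("Q71", ["en", "fr"])` yields the same selection set as the example request, with `enLabel` and `frLabel` as aliases.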

I personally prefer the last option as it seems a cleaner option (simpler schema, no messing with language codes) than the first, and better UX for the client than the second. If we get feedback that it's too cumbersome to request multiple labels, we could additionally introduce a field exposing all available labels (filterable) as a list.

We decided to go with this option.

How should we handle cases where a linked entity can’t be resolved—such as when it’s been deleted or redirected?
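To make the two failure modes concrete, here is a Python sketch of a lookup that tells a deleted item apart from a redirected one. It is an illustration only and does not anticipate which behavior gets chosen: the `LinkedItem` shape, the `deleted` and `redirected_from` attributes, and the in-memory `labels_store` and `redirects` mappings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LinkedItem:
    id: str
    labels: dict[str, Optional[str]] = field(default_factory=dict)
    deleted: bool = False
    redirected_from: Optional[str] = None


def resolve_linked_item(
    item_id: str,
    language_codes: list[str],
    labels_store: dict[str, dict[str, str]],
    redirects: dict[str, str],
) -> LinkedItem:
    """Look up labels for a linked item, following at most one redirect."""
    target_id = redirects.get(item_id, item_id)
    if target_id not in labels_store:
        # Deleted (or otherwise missing): every requested label is None.
        return LinkedItem(
            id=item_id,
            labels={code: None for code in language_codes},
            deleted=True,
        )
    stored = labels_store[target_id]
    return LinkedItem(
        id=target_id,
        labels={code: stored.get(code) for code in language_codes},
        redirected_from=item_id if target_id != item_id else None,
    )
```

Whether the API exposes the extra information (an indicator field or an error entry) or hides it (null labels, transparent redirect resolution) is the open question here.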

For a linked entity of type Item:

Current schema:

type ItemValue {  
  type: String!  
  content: ValueItem!  
}  
  
type ValueItem {  
  id: String!  
  labels: Labels  
}

Current request:

query {
  item(id: "Q44") {
    id
    labels { en }
    statements {
      property {
        id
      }
      value {
        ... on ItemValue { type item: content { labels { en } } }
      }
    }
  }
}

Current response:

{
  "data": {
    "item": {
      "id": "Q44",
      "labels": {
        "en": "enLabel"
      },
      "statements": [
        {
          "property": {
            "id": "P67"
          },
          "value": {
            "type": "value",
            "item": {
              "labels": {
                "en": "enLabel"
              }
            }
          }
        }
      ]
    }
  }
}

Cases:
1- Deleted item

  • Return null/empty

schema and request stay the same

{
  "data": {
    "item": {
      "id": "Q44",
      "labels": {
        "en": "enLabel"
      },
      "statements": [
        {
          "property": {
            "id": "P67"
          },
          "value": {
            "type": "value",
            "item": {
              "labels": {
                "en": null
              }
            }
          }
        }
      ]
    }
  }
}

  • Return indicator with null value

Example schema:

type ItemValue {
  type: String!
  content: ValueItem!
}

type ValueItem {
  id: String!
  deleted: Boolean
  labels: Labels
}

Example request:

query {
  item(id: "Q44") {
    id
    labels { en }
    statements {
      property {
        id
      }
      value {
        ... on ItemValue { type item: content { deleted labels { en } } }
      }
    }
  }
}

Example response:

{
  "data": {
    "item": {
      "id": "Q44",
      "labels": {
        "en": "enLabel"
      },
      "statements": [
        {
          "property": {
            "id": "P67"
          },
          "value": {
            "type": "value",
            "item": {
              "deleted": true,
              "labels": {
                "en": null
              }
            }
          }
        }
      ]
    }
  }
}

  • Return an error with null value

schema and request stay the same

{
  "errors": [
    {
      "message": "Linked item Q12345 has been deleted.",
      "locations": [
        {
          "line": 10
        }
      ],
      "path": [
        "item"
      ]
    }
  ],
  "data": {
    "item": {
      "id": "Q44",
      "labels": {
        "en": "enLabel"
      },
      "statements": [
        {
          "property": {
            "id": "P67"
          },
          "value": {
            "type": "value",
            "item": {
              "labels": {
                "en": null
              }
            }
          }
        }
      ]
    }
  }
}

2- Redirected item

  • Resolve transparently (just return target label)

schema and request stay the same

{
  "data": {
    "item": {
      "id": "Q44",
      "labels": {
        "en": "enLabel"
      },
      "statements": [
        {
          "property": {
            "id": "P67"
          },
          "value": {
            "type": "value",
            "item": {
              "labels": {
                "en": "enLabel"
              }
            }
          }
        }
      ]
    }
  }
}

  • Return target label and indicate redirect

Example schema:

type ItemValue {
  type: String!
  content: ValueItem!
}

type ValueItem {
  id: String!
  redirectedFrom: String
  labels: Labels
}

Example request:

query {
  item(id: "Q44") {
    id
    labels { en }
    statements {
      property {
        id
      }
      value {
        ... on ItemValue { type item: content { redirectedFrom id labels { en } } }
      }
    }
  }
}

Example response:

{
  "data": {
    "item": {
      "id": "Q44",
      "labels": {
        "en": "enLabel"
      },
      "statements": [
        {
          "property": {
            "id": "P67"
          },
          "value": {
            "type": "value",
            "item": {
              "redirectedFrom": "Q12345",
              "id": "Q67890",
              "labels": {
                "en": "enLabel"
              }
            }
          }
        }
      ]
    }
  }
}

  • Return null/empty

schema and request stay the same

Example response:

{
  "data": {
    "item": {
      "id": "Q44",
      "labels": {
        "en": "enLabel"
      },
      "statements": [
        {
          "property": {
            "id": "P67"
          },
          "value": {
            "type": "value",
            "item": {
              "labels": {
                "en": null
              }
            }
          }
        }
      ]
    }
  }
}