
Consider adding optimization parameters to Z6821/Fetch Wikidata item
Open, Medium, Public

Description

Overview:
We can greatly reduce the bandwidth and processing needed for calls to Z6821/Fetch Wikidata item if we provide parameters that let each call specify:

  1. which content is needed (labels, aliases, descriptions and/or statements)
  2. which languages are of interest in labels, aliases, and descriptions
  3. which properties are of interest in statements.

(1) and (2) can be passed through to Wikidata's wbgetentities API, which already provides those filters (its props and languages parameters). Filtering for (3) can easily be implemented in the orchestrator.
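
As a rough illustration of how (1)-(3) might fit together, here is a minimal Python sketch. It is not the orchestrator's actual code: the function names build_fetch_url and filter_statements are hypothetical, and the shape of the new Z6821 parameters is an assumption. The only Wikidata specifics relied on are the wbgetentities props and languages parameters and the "claims" key under which statements are returned.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

# wbgetentities calls statements "claims"; the other names match as-is.
PROPS_NAME = {"statements": "claims"}


def build_fetch_url(item_id, content=None, languages=None):
    """Build a wbgetentities URL (hypothetical helper).

    content:   e.g. ["labels", "statements"]; None means no content filter.
    languages: e.g. ["en", "fr"]; None means all languages.
    """
    params = {"action": "wbgetentities", "ids": item_id, "format": "json"}
    if content:
        params["props"] = "|".join(PROPS_NAME.get(c, c) for c in content)
    if languages:
        params["languages"] = "|".join(languages)
    return WIKIDATA_API + "?" + urlencode(params)


def filter_statements(entity, properties):
    """Keep only statements whose property ID is in `properties`.

    `entity` is one entry of the "entities" map in the JSON response.
    This is the part assumed to live in the orchestrator.
    """
    claims = entity.get("claims", {})
    kept = {pid: stmts for pid, stmts in claims.items() if pid in properties}
    return {**entity, "claims": kept}


url = build_fetch_url("Q42", content=["labels", "statements"], languages=["en"])
```

Filtering (3) after the fetch keeps the request simple, at the cost of still transferring every statement over the network; only (1) and (2) reduce the size of the response itself.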

Benefits of using these parameters:

  • The JSON returned from Wikidata can be far smaller, greatly reducing the bandwidth used for fetches.
  • The resulting ZObject can also be much smaller, eliminating some orchestrator processing.
  • If the resulting ZObject is passed to another Wikifunction that needs to sift through labels, aliases, or descriptions, the much smaller lists will also speed up that sifting.
  • The smaller ZObjects will also mean less processing in WikiLambda for presenting them, and will be somewhat easier to browse in the UI.

Background:

  • Wikidata Items typically have labels in many languages, aliases in many languages, and descriptions in many languages.
  • Many calls to Z6821/Fetch Wikidata item will be made for a specific language-generation purpose, in which the target language is known in advance.
  • Similarly, many calls will be made in a context in which the value of a particular statement is needed, and the statement property is known in advance.
  • Some known performance problems related to Wikidata fetches (such as the problem underlying T378414) are caused by text snippets in many different languages (such as 90+ glosses in L1).
  • In addition to helping with current performance problems, these changes will also provide substantial network and processing savings over time.

We can also consider adding these parameters to the other Wikidata fetch functions; Z6821/Fetch Wikidata item is featured here because items tend to have the most labels/aliases/descriptions, and probably the most statements as well.

Completion checklist