
[Task] Investigate LabelDescriptionLookup usage and possible refactoring (virtual labels)
Open, Medium, Public


With the WikibaseMediaInfo and Wikidata Lexicographical data extensions we are going to introduce new term types that are not labels, descriptions, or aliases. We have already phased out all places in our code base that assumed every entity type has labels, descriptions, and aliases, or made them flexible by first type checking against the LabelsProvider, DescriptionsProvider, and AliasesProvider interfaces.

We figured that the LabelDescriptionLookup interface is one of the central remaining elements that still assumes all entity types (might) have these term types. We may want to rethink this interface and its usages.

Things we already know (from discussions within the team, mostly with @daniel):

  • We should not phase out the LabelDescriptionLookup interface, and not split it into two independent ones. These two elements belong together, e.g. when rendering search results or links where the description is used as a tooltip to disambiguate Items that have the same label.
  • It might help to introduce a LabelLookup interface for all use cases that do not need the description.
  • We might rethink what "label" and "description" mean. Currently, these are the names of two fields on the Item and Property entity types, and we decided not to have identical fields on the other entity types. But what if we auto-generate labels and descriptions for Lexeme and MediaInfo entities from their other terms? @daniel named this concept "display label" and "display description" for now. If we go this route and start to distinguish these two meanings, we might need to rename some of the existing providers and lookups to DisplayLabelLookup and so on.
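
To make the first two points concrete, here is a minimal sketch of how a narrow LabelLookup could sit underneath LabelDescriptionLookup without splitting the latter into two independent interfaces. It is written in Python for brevity and is not Wikibase code: the method names, the plain-string entity IDs, the missing language handling, and the InMemoryLookup class are all assumptions made for illustration.

```python
from typing import Optional, Protocol


class LabelLookup(Protocol):
    """Hypothetical narrow interface for callers that only need a label."""

    def get_label(self, entity_id: str) -> Optional[str]: ...


class LabelDescriptionLookup(LabelLookup, Protocol):
    """Label and description stay together, e.g. for search results and tooltips."""

    def get_description(self, entity_id: str) -> Optional[str]: ...


class InMemoryLookup:
    """Toy implementation backed by a dict of entity ID -> (label, description)."""

    def __init__(self, terms):
        self._terms = terms

    def get_label(self, entity_id):
        entry = self._terms.get(entity_id)
        return entry[0] if entry else None

    def get_description(self, entity_id):
        entry = self._terms.get(entity_id)
        return entry[1] if entry else None


def format_link_text(lookup: LabelLookup, entity_id: str) -> str:
    """Needs only the label; falls back to the plain ID when there is none."""
    return lookup.get_label(entity_id) or entity_id


def format_search_result(lookup: LabelDescriptionLookup, entity_id: str) -> str:
    """Uses the description to disambiguate entities sharing a label."""
    label = lookup.get_label(entity_id) or entity_id
    description = lookup.get_description(entity_id)
    return f"{label} ({description})" if description else label
```

Any implementation of the wider interface also satisfies the narrow one, so existing lookups could be passed to label-only callers unchanged.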

Some critical places where LabelDescriptionLookup is used, which should probably also work in the same way (or in a very similar way) for Lexemes, Forms, MediaInfo, etc.:

  • EntityIdValueFormatter as used when rendering statements (in HTML as well as Wikitext for ParserFunctions and Lua)
  • EntityIdFormatter as used when rendering edit summaries and when replacing page titles in listings. See e.g. HistoryEntityAction
  • TermsRdfBuilder providing rdfs:label, skos:prefLabel, schema:name, and schema:description.
  • TermIndexField (and subclasses) exposing the label and description to CirrusSearch, enabling prefix/completion matching for labels and display of descriptions in search results.
  • TermIndexSearchInteractor and EntitySearchTermIndex for formatting search results for the wbsearchentities API module
  • Entity.getLabel and Entity.getDescription in the Wikibase Lua module.
  • InfoActionHookHandler and EditActionHookHandler for displaying entity usage on action=info and action=edit, respectively

When designing a mapping between entity types and "virtual" labels and descriptions, the above use cases should be considered.
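
One possible shape for such a mapping is a registry keyed by entity type, sketched below in Python. Everything here is hypothetical: the registry, the dict-based entity representation, and in particular the choice of lemma as display label and lexical category as display description for Lexemes are assumptions for discussion, not decisions made in this ticket.

```python
from typing import Callable, Dict, Optional, Tuple

# (display label, display description); either part may be missing.
DisplayTerms = Tuple[Optional[str], Optional[str]]

# Hypothetical registry: entity type -> function deriving display terms.
_GENERATORS: Dict[str, Callable[[dict, str], DisplayTerms]] = {}


def register(entity_type: str):
    """Decorator that registers a display-term generator for an entity type."""
    def decorator(func):
        _GENERATORS[entity_type] = func
        return func
    return decorator


@register("item")
def item_terms(entity: dict, lang: str) -> DisplayTerms:
    # Items have real label and description fields, so they are used directly.
    return (
        entity.get("labels", {}).get(lang),
        entity.get("descriptions", {}).get(lang),
    )


@register("lexeme")
def lexeme_terms(entity: dict, lang: str) -> DisplayTerms:
    # Assumption: the lemma serves as the label, the lexical category as the description.
    return entity.get("lemmas", {}).get(lang), entity.get("lexicalCategory")


def display_terms(entity: dict, lang: str) -> DisplayTerms:
    """Return display terms, or (None, None) for unregistered entity types."""
    generator = _GENERATORS.get(entity["type"])
    if generator is None:
        return None, None
    return generator(entity, lang)
```

Each consumer listed above (formatters, RDF builder, search fields, Lua) would then call one function instead of checking for specific term fields itself.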

So this is a research and discussion ticket, as well as a place to bikeshed about how to name things. ;-)


Event Timeline

One crucial question in this context is when and where the "fake" labels and descriptions would be generated. Where is the logic that maps an Entity to its display label and display description? Should this be in the entity itself?

One may think that this could also be done during the lookup, in the LabelDescriptionLookup. But that will not work, at least not for labels, because we want to support lookup by label. In order to allow an efficient lookup (in a database table or in Elastic), the label has to be pre-generated during indexing.

So, when storing an entity, we should always derive its label and description (and, for the search index, also aliases). As far as I can see, the use cases we have for "display labels" are the same use cases we have for the wb_terms table (and EntityRetrievingTermLookup): lookup of labels (and descriptions) and lookup by label (and aliases).
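
The following Python sketch illustrates that point: display terms are derived once, when the entity is stored, and written to an index that supports both lookup of labels and lookup by label. The Lexeme class, the TermIndex class, and the lemma-to-label mapping are simplified, hypothetical stand-ins, not the wb_terms schema or any existing Wikibase class.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Lexeme:
    """Hypothetical, heavily simplified stand-in for a Lexeme entity."""
    id: str
    lemmas: dict  # language code -> lemma
    lexical_category: str


def derive_display_terms(lexeme):
    """Assumed mapping: lemma -> display label, lexical category -> display description."""
    return {lang: (lemma, lexeme.lexical_category) for lang, lemma in lexeme.lemmas.items()}


class TermIndex:
    """Toy index filled at storage time, so lookup by label needs no entity loading."""

    def __init__(self):
        self._by_id = {}                    # entity ID -> {lang: (label, description)}
        self._by_label = defaultdict(set)   # (lang, lowercased label) -> entity IDs

    def store(self, lexeme):
        # Remove reverse-index entries left over from a previous revision.
        for lang, (label, _) in self._by_id.get(lexeme.id, {}).items():
            self._by_label[(lang, label.lower())].discard(lexeme.id)
        # Derive the display terms now, not at lookup time.
        terms = derive_display_terms(lexeme)
        self._by_id[lexeme.id] = terms
        for lang, (label, _) in terms.items():
            self._by_label[(lang, label.lower())].add(lexeme.id)

    def get_label(self, entity_id, lang):
        terms = self._by_id.get(entity_id, {})
        label, _description = terms.get(lang, (None, None))
        return label

    def find_by_label(self, lang, label):
        return sorted(self._by_label.get((lang, label.lower()), set()))
```

Generating the label only inside the lookup would make get_label work, but find_by_label would have nothing to search, which is the reason for deriving at storage time.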

daniel renamed this task from [Task] Investigate LabelDescriptionLookup usage and possible refactoring to [Task] Investigate LabelDescriptionLookup usage and possible refactoring (virtual labels). Feb 8 2018, 1:12 PM

Note that we also need to consider all usages of LabelsProvider and DescriptionsProvider when designing some kind of mapping for "virtual" labels.