Page MenuHomePhabricator

Do not get entity data from wb_terms when "prefetching" data for displaying on pages
Open, Needs TriagePublic

Description

PrefetchingTermLookup interface, with its simplest implementation BufferingTermLookup combine TermLookup and TermBuffer interfaces.

Apparently, the "prefetching"/buffering is done for LabelDescriptionLookup services, so that when the label/description of particular item/property is to be shown multiple times (e.g. the item appears several times on a page), the label/description is only fetch ones from the storage, and then kept in a in-memory buffer until the end of request.
This makes sense, but the way it is done is a bit cumbersome. LabelDescriptionLookup itself does not hold any buffer. It gets label/description of the entity from TermLookup. As all LabelDescriptionLookup instances used get PrefetchingTermLookup (ie. lookup that is also a TermBuffer) in their constructor, label/description lookups "magically" use buffered term-lookups.

In the course of T196882 LabelDescriptionLookup implementation(s) are being added that not rely on TermIndex/wb_terms table. In the case of "prefetching" those could be also doing per-process buffering, removing the need of using wb_terms for fallback in "prefetching".

Idea to try out under umbrella of this task: make LabelDescriptionLookup buffer itself, and explicitly. If possible remove PrefetchingTermLookup/TermBuffer, unless they are used by other services too (not-that-extensive check by @WMDE-leszek suggests they are not).

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 5 2018, 2:53 PM

Apparently, the "prefetching"/buffering is done for LabelDescriptionLookup services, so that when the label/description of particular item/property is to be shown multiple times (e.g. the item appears several times on a page), the label/description is only fetch ones from the storage, and then kept in a in-memory buffer until the end of request.

AFAIK, the main reason it exists is to load all the labels for the properties to be displayed on a page in a single query. I've found code doing it for the client, but don't see this for the entity page (I remember it was there, but things have probably changed).

Change 444873 had a related patch set uploaded (by Aleksey Bekh-Ivanov (WMDE); owner: Aleksey Bekh-Ivanov (WMDE)):
[mediawiki/extensions/Wikibase@master] Count number of $entityIds being prefetched

https://gerrit.wikimedia.org/r/444873

Change 444873 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Count number of $entityIds being prefetched

https://gerrit.wikimedia.org/r/444873

AFAIK, the main reason it exists is to load all the labels for the properties to be displayed on a page in a single query. I've found code doing it for the client, but don't see this for the entity page (I remember it was there, but things have probably changed).

I guess the interface you mean is EntityInfoBuilder, and in particular its implementation SqlEntityInfoBuilder.
It indeed gets data from wb_terms table. If I am not mistaken, this class does not use any "prefetching" lookup service, but relies on making SQL selects directly.

This use case should also be explored here, definitely!

Change 446609 had a related patch set uploaded (by WMDE-leszek; owner: WMDE-leszek):
[mediawiki/extensions/Wikibase@master] DNM Attempt to make EntityInfoBuilder not using wb_terms

https://gerrit.wikimedia.org/r/446609