[Story] mw.wikibase: Use __index to lazy load entity contents
Open, Medium, Public

Description

We should return only a stub from mw.wikibase.getEntityObject and defer methods like getLabel to PHP. We would load the actual entity into Lua only when the raw data is accessed (that logic would live in an __index metamethod).
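A minimal sketch of the stubbing idea, in plain Lua. The `backend` table and its functions are hypothetical stand-ins for the PHP callbacks, not the real mw.wikibase internals, and the entity data is made up for illustration.

```lua
-- Hypothetical stand-in for the PHP side; counts calls so the laziness is visible.
local backend = {
    calls = { getLabel = 0, getEntity = 0 },
    data = {
        Q42 = {
            id = "Q42",
            labels = { en = { language = "en", value = "Douglas Adams" } },
            claims = {},
        },
    },
}

function backend.entityExists(id)
    return backend.data[id] ~= nil
end

function backend.getLabel(id, lang)
    backend.calls.getLabel = backend.calls.getLabel + 1
    local entity = backend.data[id]
    local term = entity and entity.labels[lang]
    return term and term.value or nil
end

function backend.getEntity(id)
    backend.calls.getEntity = backend.calls.getEntity + 1
    return backend.data[id]
end

local function getEntityObject(id)
    if not backend.entityExists(id) then
        return nil
    end

    local loaded = false
    local methods = {}

    -- Deferred to the backend: does not pull the full entity into Lua.
    function methods.getLabel(_, lang)
        return backend.getLabel(id, lang or "en")
    end

    return setmetatable({}, {
        -- __index only fires for keys that are absent from the stub itself.
        __index = function(stub, key)
            if methods[key] then
                return methods[key]
            end
            -- Raw data access: load the full entity once and copy it into the stub.
            if not loaded then
                loaded = true
                for k, v in pairs(backend.getEntity(id)) do
                    rawset(stub, k, v)
                end
            end
            return rawget(stub, key)
        end,
    })
end
```

With this sketch, `getEntityObject("Q42"):getLabel("en")` only increments `backend.calls.getLabel`, while the first read of a raw field such as `entity.labels` triggers a single `backend.getEntity` call. Later raw reads hit the copied fields directly and never reach `__index`.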

Things to consider/check:

  • Does this break backwards compatibility in some way?
  • Are these tables immutable then? (They probably should be, but that's another issue)
  • Are there any unwanted performance implications of __index we need to be aware of?

http://www.lua.org/pil/13.4.1.html
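To make the first two questions concrete, here is a small plain-Lua sketch with made-up data. It shows one way a stub could behave differently from a fully populated table: `pairs()` and `rawget()` do not consult `__index`. It also shows how `__newindex` could reject writes. Whether existing modules rely on iterating entity tables is not something this sketch can answer.

```lua
-- A stub whose data is only reachable through __index (hypothetical data).
local hidden = { id = "Q42", type = "item" }

local stub = setmetatable({}, {
    __index = hidden,
    -- Reject writes to keys that are absent from the stub itself.
    __newindex = function(_, key)
        error("entity tables are read-only (tried to set '" .. tostring(key) .. "')", 2)
    end,
})

-- Ordinary field reads go through __index and work as before.
local viaIndex = stub.id

-- pairs() and rawget() bypass __index, so they see an empty table.
local seen = 0
for _ in pairs(stub) do
    seen = seen + 1
end
local viaRawget = rawget(stub, "id")

-- Writes are caught by __newindex; pcall turns the error into a status flag.
local ok = pcall(function()
    stub.id = "Q1"
end)
```

Here `viaIndex` is `"Q42"`, `seen` is `0`, `viaRawget` is `nil` and `ok` is `false`. Note that `rawset` bypasses `__newindex`, so this only guards against ordinary assignments.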

Event Timeline

hoo raised the priority of this task from to Needs Triage.
hoo updated the task description.
hoo changed Security from none to None.
hoo updated the task description.
hoo added subscribers: hoo, aude, daniel, Lydia_Pintscher.
Lydia_Pintscher removed a subscriber: Unknown Object (MLST).

The only downside I see is the case that you first access the labels, which get fetched from the terms table, and later more of the data structure, which would cause the entire entity to be loaded. In that case, loading the full entity right away would have been quicker.

This issue would be mitigated by pre-loading not only labels, but also full entities, based on usage tracking: if the X or O aspect is used, pre-load the entire entity (and skip preloading of labels, maybe even push labels from the entity into the label cache).
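The pre-loading rule described above could be sketched as follows. This is plain Lua for illustration only; the real decision would presumably live on the PHP side. The shape of the `usage` table is an assumption, and so is the `L` code for the label aspect; only the X and O aspects are named in this discussion.

```lua
-- usage: map of entity id -> set of aspect codes recorded by usage tracking
-- (assumed shape, e.g. { Q42 = { L = true, X = true } }).
local function planPreload(usage)
    local plan = { fullEntities = {}, labelsOnly = {} }
    for id, aspects in pairs(usage) do
        if aspects.X or aspects.O then
            -- Full entity is needed anyway, so skip the separate label preload.
            table.insert(plan.fullEntities, id)
        elseif aspects.L then
            table.insert(plan.labelsOnly, id)
        end
    end
    -- Sort for a deterministic order; pairs() iterates in unspecified order.
    table.sort(plan.fullEntities)
    table.sort(plan.labelsOnly)
    return plan
end
```

Entities that land in `fullEntities` never appear in `labelsOnly`, which matches the "skip preloading of labels" part of the proposal.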

Yeah, that's something I would like to talk about in the meeting we have on Friday: A buffered term lookup which supports both loading from entities and the term table (or with whatever we replace the term table).
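A rough sketch of what such a buffered term lookup might look like, in plain Lua with made-up data. `loadedEntities`, `termTable` and `termTableLabel` are hypothetical names standing in for already-loaded entities and the term table query; nothing here reflects an existing Wikibase API.

```lua
local termTableQueries = 0

-- Hypothetical term table contents: id -> language -> label.
local termTable = { Q42 = { en = "Douglas Adams" } }

local function termTableLabel(id, lang)
    termTableQueries = termTableQueries + 1
    local terms = termTable[id]
    return terms and terms[lang] or nil
end

-- Entities that happen to be fully loaded already (hypothetical).
local loadedEntities = {
    Q64 = { labels = { en = { language = "en", value = "Berlin" } } },
}

-- Buffer keyed by "id|lang"; false records a known-missing label.
local labelBuffer = {}

local function getLabel(id, lang)
    local key = id .. "|" .. lang
    local cached = labelBuffer[key]
    if cached ~= nil then
        return cached or nil
    end

    local label
    local entity = loadedEntities[id]
    if entity then
        -- Prefer the loaded entity: no extra query needed.
        local term = entity.labels[lang]
        label = term and term.value
    else
        label = termTableLabel(id, lang)
    end

    labelBuffer[key] = label or false
    return label
end
```

Labels of loaded entities are served without touching the term table, and repeated lookups (including misses) are answered from the buffer.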

A problem I came across while poking at this is that we would need to call out to EntityLookup::hasEntity on every mw.wikibase.getEntityObject call to verify that the entity exists (because that function returns nil if the entity doesn't exist). If we need to unstub the entity later on, we would again need to get similar (but not the same) metadata from MariaDB to actually load the entity. That would turn what is now one database query into two to load the entity into Lua. If people load more than a few entities into Lua, that could slow down page rendering significantly.

If we had a batch lookup and could cache/buffer the entities we need for Lua in-process, this could be preloaded in one go, but we're not there yet.
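As an illustration of the batch idea, the sketch below uses plain Lua with a fake `store` and a hypothetical `fetchBatch` that stands for one database round trip. An in-process buffer answers both the existence check and the later load from the same batched fetch, so the two-query pattern described above collapses into one query per batch.

```lua
local queries = 0

-- Fake persistent store standing in for the database (hypothetical data).
local store = {
    Q1 = { id = "Q1" },
    Q2 = { id = "Q2" },
    Q3 = { id = "Q3" },
}

-- Hypothetical batch fetch: one round trip for any number of ids.
local function fetchBatch(ids)
    queries = queries + 1
    local result = {}
    for _, id in ipairs(ids) do
        -- false marks "known not to exist", so misses are buffered too.
        result[id] = store[id] or false
    end
    return result
end

local buffer = {}

local function preload(ids)
    local missing = {}
    for _, id in ipairs(ids) do
        if buffer[id] == nil then
            missing[#missing + 1] = id
        end
    end
    if #missing > 0 then
        for id, entity in pairs(fetchBatch(missing)) do
            buffer[id] = entity
        end
    end
end

local function hasEntity(id)
    if buffer[id] == nil then
        preload({ id })
    end
    return buffer[id] ~= false
end

local function getEntity(id)
    if buffer[id] == nil then
        preload({ id })
    end
    return buffer[id] or nil
end
```

After `preload({ "Q1", "Q2", "Q404" })`, calls to `hasEntity` and `getEntity` for those ids cause no further queries; only an id outside the preloaded set triggers another fetch.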

Batch lookup of entity metadata, based on usage tracking, sounds like a good idea.

hoo lowered the priority of this task from High to Medium. Mar 26 2015, 12:05 PM

We talked about this and decided to go with T93885 first. If that doesn't suffice or we hit further problems, we might still decide to do entity stubbing in some way. I no longer consider this a blocker for the initial deployment of arbitrary access (but it might turn out to be one after we hit an initial set of wikis).

Jonas renamed this task from "mw.wikibase: Use __index to lazy load entity contents" to "[Story] mw.wikibase: Use __index to lazy load entity contents". Nov 2 2015, 1:47 PM