
Reduce usages of wikibase.entityPage.entityLoaded hook in frontend code
Open, LowPublic

Description

The wikibase.entityPage.entityLoaded hook is pretty expensive because it calls Special:EntityData. Right now it's used on every page load, which was done to fix T85499: wbEntity shouldn't be served on every page load. It's better to keep the hook for gadget developers, but either stop using it in our own frontend code or lazy load it (for example, when someone wants to edit the entity).

Event Timeline

Krinkle moved this task from Limbo to Watching on the Performance-Team (Radar) board.

So I have been thinking about this: how to do it properly without breaking the cache or ParserCache while still improving our performance. So I extracted some numbers.
For the item of Germany we have:

amsa@amsa-Latitude-7480:~/workspace$ python entitydata-json.py Q183
Total Size: 1411.276KB
Germany
labels 14.31KB, 1.01%
claims 1336.33KB, 94.69%
sitelinks 46.99KB, 3.33%

amsa@amsa-Latitude-7480:~/workspace$ python entitydata-json.py Q76
Total Size: 239.766KB
Barack Obama
labels 12.74KB, 5.31%
descriptions 5.93KB, 2.47%
aliases 7.15KB, 2.98%
claims 169.11KB, 70.53%
sitelinks 44.63KB, 18.61%

amsa@amsa-Latitude-7480:~/workspace$ python entitydata-json.py Q7251
Total Size: 115.433KB
Alan Turing
labels 7.46KB, 6.47%
descriptions 3.34KB, 2.89%
aliases 1.87KB, 1.62%
claims 79.42KB, 68.80%
sitelinks 23.12KB, 20.03%


amsa@amsa-Latitude-7480:~/workspace$ python entitydata-json.py Q57387675
Total Size: 18.196KB
Lucas Werkmeister
labels 4.84KB, 26.60%
descriptions 0.45KB, 2.48%
claims 12.45KB, 68.43%

amsa@amsa-Latitude-7480:~/workspace$ python entitydata-json.py Q32530356
Total Size: 8.434KB
<Some random category>
descriptions 7.60KB, 90.12%
claims 0.30KB, 3.57%
sitelinks 0.23KB, 2.74%


amsa@amsa-Latitude-7480:~/workspace$ python entitydata-json.py Q54703127
Total Size: 30.211KB
Disseminated infection with Balamuthia mandrillaris in a dog.
labels 0.20KB, 0.65%
descriptions 3.34KB, 11.06%
claims 26.43KB, 87.48%
amsa@amsa-Latitude-7480:~/
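
The `entitydata-json.py` script itself isn't shown in this task. As a rough sketch of how such a breakdown could be produced, the snippet below fetches an entity from Special:EntityData and reports the serialized size of each top-level section. The function names, the User-Agent string and the choice to measure compact UTF-8 JSON are assumptions, so the figures it prints may not match the ones above exactly.

```python
import json
import sys
import urllib.request


def section_sizes(entity):
    """Return (total_bytes, {section: bytes}) for one entity dict.

    Assumption: sizes are measured on compact UTF-8 JSON; the original
    script may have measured them differently.
    """
    def size(value):
        text = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
        return len(text.encode("utf-8"))

    total = size(entity)
    # Only non-empty containers (labels, claims, ...) are reported;
    # scalar fields such as "id" or "type" are skipped.
    sizes = {
        key: size(value)
        for key, value in entity.items()
        if isinstance(value, (dict, list)) and value
    }
    return total, sizes


def format_report(entity, lang="en"):
    """Build a report in the same shape as the output pasted above."""
    total, sizes = section_sizes(entity)
    labels = entity.get("labels") or {}
    name = labels.get(lang, {}).get("value", entity.get("id", "?"))
    lines = ["Total Size: %.3fKB" % (total / 1024), name]
    for key, nbytes in sizes.items():
        lines.append("%s %.2fKB, %.2f%%" % (key, nbytes / 1024, 100 * nbytes / total))
    return "\n".join(lines)


def fetch_entity(entity_id):
    """Download one entity as JSON from Special:EntityData."""
    url = "https://www.wikidata.org/wiki/Special:EntityData/%s.json" % entity_id
    request = urllib.request.Request(
        url, headers={"User-Agent": "entitydata-size-sketch/0.1"}  # hypothetical UA
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    # The key can differ from the requested ID (e.g. redirects), so take the first.
    return next(iter(data["entities"].values()))


if __name__ == "__main__" and len(sys.argv) == 2:
    print(format_report(fetch_entity(sys.argv[1])))
```

Run as `python entitydata-json.py Q183`, this prints a total followed by one line per non-empty section.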

So, I don't think we should stop injecting terms for entities: they are usually very small and won't cause much network overhead, and not injecting them would cost us a lot (how to handle ParserCache for different users with different languages, etc.). OTOH, sitelinks and especially claims have a huge network overhead (for example, over 1.3 MB for the item of Germany) while not giving us much benefit (they help with editing, which can be lazy loaded; even the whole entity can be lazy loaded when someone edits the item). Also, we need to make sure none of the default gadgets use this hook.
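
To make the trade-off concrete, here is a small sketch that splits the measured payloads into terms (labels, descriptions, aliases) and everything that could be deferred (claims, sitelinks). The figures are copied from the measurements above; the names `MEASURED`, `TERM_SECTIONS` and `split_payload` are made up for this illustration, and only sections that appear in the pasted output are counted.

```python
# Sizes in KB, copied from the script output earlier in this task.
MEASURED = {
    "Germany (Q183)": {
        "total": 1411.276, "labels": 14.31, "claims": 1336.33, "sitelinks": 46.99,
    },
    "Barack Obama (Q76)": {
        "total": 239.766, "labels": 12.74, "descriptions": 5.93,
        "aliases": 7.15, "claims": 169.11, "sitelinks": 44.63,
    },
    "Alan Turing (Q7251)": {
        "total": 115.433, "labels": 7.46, "descriptions": 3.34,
        "aliases": 1.87, "claims": 79.42, "sitelinks": 23.12,
    },
}

# Sections that would still be injected on every page view.
TERM_SECTIONS = ("labels", "descriptions", "aliases")


def split_payload(sizes):
    """Return (terms_kb, deferred_kb) for one measured entity."""
    terms = sum(sizes.get(key, 0.0) for key in TERM_SECTIONS)
    deferred = sum(
        value for key, value in sizes.items()
        if key != "total" and key not in TERM_SECTIONS
    )
    return terms, deferred


for name, sizes in MEASURED.items():
    terms, deferred = split_payload(sizes)
    print("%s: terms %.2fKB, deferrable %.2fKB (%.1f%% of total)"
          % (name, terms, deferred, 100 * deferred / sizes["total"]))
```

For these three items, the deferrable part is by far the larger share of the payload, which is the point made above.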

So I think we should inject terms as a mw.config variable and use it in termbox v1 and v2, while lazy loading the hook when someone tries to edit and no longer using it on every page view. Does that make sense to you?

> So I think we should inject terms as a mw.config variable and use it in termbox v1 and v2, while lazy loading the hook when someone tries to edit and no longer using it on every page view. Does that make sense to you?

I’m not sure what you mean by lazy loading the hook… at that point, it’s no longer a hook, is it? It would be some sort of regular function that’s called when you try to edit the page. We could make that function available to gadgets as well, perhaps even declare it stable, and try to return the already-downloaded data – but I don’t think it makes sense to lazily fire a hook for all other components on the page, at whatever time the editing UI first happens to need that data.

Also, I think for terms it might still make sense to have a hook, even if it’s implemented via a config variable.