Page MenuHomePhabricator

Parser should cache the output of identical template calls as well as parameterless calls
Open, MediumPublic

Description

Author: happy.melon.wiki

Description:
At http://en.wikipedia.org/w/index.php?diff=prev&oldid=445197792 we see an edit which changed a huge number of calls to {{rating|1|3}} to calls to {{rating 1 3}}, which is a wrapper template which merely contains the former call. Doing so slashes all the parser statistics: preprocessor node count down by 96%, post-expand down by 89%, template argument size down by 99.9%. I haven't looked in detail about why this is, but some sort of beneficial caching is definitely not occuring in the more common case, and it should.


Version: 1.20.x
Severity: normal

Details

Reference
bz30425

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 11:56 PM
bzimport set Reference to bz30425.
bzimport added a subscriber: Unknown Object (MLST).

By caching templates, it shouldn't be forgotten that they could contain dynamic or randomly generated content, so it should be possible for parser functions to disable the cache, of course this must be done recursively then. I am not sure whether this is the case already, I just recall Extension:Variables behaving buggy on cached templates and when I last checked I think there already was some way for preventing caching but it didn't really work... Guess I will look into this again soon.

The reason the template expansion cache is only for invocations with empty arguments is just the difficulty of the implementation. I left it for later.

A naive implementation would involve traversing the whole tree under the <part> in order to generate a hash for fetching the expansion. That's O(N) in the node count, and if you had nested arguments, it becomes O(N^2) in the string length. It's a design goal to avoid such O(N^2) cases. So it's necessary to store subtree hashes in each of the node objects, which complicates construction slightly.