Right now, Parsoid stores the attributes of a token as an array of (k, v, srcOffsets) triples in the Token.attribs property.
As a result, every attribute lookup is an array scan (see Util.lookup), which is unnecessarily expensive.
A better attribs structure would be a map.
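
To make the cost difference concrete, here is a minimal sketch. The KV class, lookupByScan and buildIndex below are hypothetical stand-ins written for this illustration; they are not Parsoid's actual Util.lookup or KV code, and they assume string keys only.

```javascript
// Hypothetical stand-in for a (k, v, srcOffsets) triple; not Parsoid's real class.
class KV {
  constructor(k, v, srcOffsets) {
    this.k = k;
    this.v = v;
    this.srcOffsets = srcOffsets;
  }
}

// Array scan: every lookup walks the attribute list, O(n) per call.
function lookupByScan(attribs, key) {
  for (const kv of attribs) {
    if (kv.k === key) {
      return kv.v;
    }
  }
  return undefined;
}

// Map index: built once in O(n), after which each lookup is O(1) on average.
// The first occurrence of a key wins, which matches the scan above.
// Non-string keys are skipped here (assumption made for this sketch).
function buildIndex(attribs) {
  const index = new Map();
  for (const kv of attribs) {
    if (typeof kv.k === 'string' && !index.has(kv.k)) {
      index.set(kv.k, kv);
    }
  }
  return index;
}

const sampleAttribs = [
  new KV('href', 'Foo', null),
  new KV('class', 'new', null),
];
const sampleIndex = buildIndex(sampleAttribs);
```

Both lookupByScan(sampleAttribs, 'class') and sampleIndex.get('class').v return the same value; only the per-lookup cost differs.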
However, there seem to be three issues that get in the way of making this fix.
- Key lookup is whitespace-insensitive.
- Based on the code in setAttribute in parser.defines.js, keys need not be strings.
- Code in the transformers seems to assume an ordering of attributes (for example, that attribute 0 is the template name). Also, the order actually matters in some cases, like template args.
(1) might be easier to work around.
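
One possible workaround for (1), sketched under the assumption that "whitespace-insensitive" means leading and trailing whitespace around a key is ignored: normalize keys once on insertion and again on lookup. normalizeKey, setAttr and getAttr are hypothetical names used only for this example.

```javascript
// Assumption: only surrounding whitespace is insignificant in a key.
// Non-string keys are passed through unchanged.
function normalizeKey(k) {
  return typeof k === 'string' ? k.trim() : k;
}

// Store under the normalized key so later lookups need no scan.
function setAttr(map, k, v) {
  map.set(normalizeKey(k), v);
}

// Normalize the query the same way before hitting the map.
function getAttr(map, k) {
  return map.get(normalizeKey(k));
}

const wsAttribs = new Map();
setAttr(wsAttribs, ' name ', 'value');
```

With this, getAttr(wsAttribs, 'name') and getAttr(wsAttribs, 'name  ') both find the entry stored as ' name '.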
For (2) and (3), the solutions might be related. There are exactly two instances of new KV(tu.flattenIfArray(..), ...) in the PEG tokenizer, and in both cases the key is an array of tokens. Both occur where the template name or template arg (rare on a top-level page) is itself templated, so in these cases the attribute key is not a string. It looks like the right fix is to add a synthetic kv pair for the template name or template arg, instead of implicitly assuming it is the first attribute, i.e. new KV('templatename', array-or-string-here). The trouble that then needs fixing is a potential conflict with a named template arg called 'templatename'.
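
As a sketch of one way the 'templatename' conflict could be avoided (this is an assumption for illustration, not something the current code does): use a Symbol rather than a string as the synthetic key, since a Symbol cannot collide with any string-named template arg. TEMPLATE_TARGET and buildTemplateAttribs are hypothetical names.

```javascript
// Hypothetical reserved key. A Symbol is never equal to any string,
// so a template arg literally named 'templatename' cannot clash with it.
const TEMPLATE_TARGET = Symbol('templatename');

// target may be a string or an array of tokens (the templated-name case).
// args is an ordered list of [key, value] pairs with string keys.
function buildTemplateAttribs(target, args) {
  const attribs = new Map();
  attribs.set(TEMPLATE_TARGET, target);
  for (const [k, v] of args) {
    attribs.set(k, v);
  }
  return attribs;
}

const tplAttribs = buildTemplateAttribs(
  ['{{', 'echo', '}}'], // stand-in for an array of tokens
  [['templatename', 'user value'], ['1', 'positional']]
);
```

A Map also iterates in insertion order, including for integer-like keys such as '1', so the ordering that some transformers rely on is preserved; a plain object would not guarantee that. Duplicate arg names would still overwrite each other in this sketch and would need separate handling.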
In any case, this array-scan based attribute lookup is probably a perf. hole waiting to be fixed.