In I85bc47111e62f18de8b1621927bafdba07d3ef1b, a number of points are raised about the contents of data-mw for extensions: how it should be standardized, and how far extensions should be allowed to deviate from the default that the extension handler sets.
The spec currently says,
The data-mw attribute is a JSON object. It is meant as an extensible public interface, so more top-level members can be added. The top-level structure depends on the content type, with the main types being transclusions and extensions. See also the transclusion content section.
but the gallery extension removes some of those top-level members and ref overwrites some.
The questions we are trying to resolve are:
- What is the information that we capture in data-mw on the wrapper node for extensions? We need a descriptive name for it; "dataMw" is an operational, implementation-specific property name that is not helpful.
- Does core Parsoid always set some information in this dataMw object that extensions cannot override, so that extensions can only extend the object with extension-specific information? Or is dataMw a free-for-all that extensions can arbitrarily customize? To some extent, the answer depends on the answer to the previous question. Currently, extensions can make arbitrary changes via the 'modifyArgDict' handler.
- Based on the two answers above, we will need a suitable configuration property that can be used by Parsoid to implement the desired functionality.
- Relatedly, 'extsrc' might be redundant in some cases if the extension provides an HTML representation of the same content. Is consistency worth duplicating a large blob of wikitext? Perhaps this can also be handled by having the extension declare that it 'supportsHTMLEditing' or 'providesHTMLRepn', or something similar, which would let Parsoid remove the extsrc attribute.
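
To make the "extend but not override" option concrete, here is a minimal sketch in Python rather than Parsoid's own code. Everything in it is hypothetical: the function build_data_mw, the CORE_KEYS set, and the provides_html_repn flag are illustrative names only, and the name / attrs / body.extsrc layout is assumed as the default shape of data-mw for extensions. It shows one possible policy, not a decision.

```python
import copy

# Hypothetical: top-level members that core would own and extensions could not change.
CORE_KEYS = ("name", "attrs")


def build_data_mw(name, attrs, extsrc, ext_fields=None, provides_html_repn=False):
    """Combine core-owned members with extension-specific ones.

    Assumed default shape: {"name": ..., "attrs": {...}, "body": {"extsrc": ...}}.
    Extensions may add top-level members and add members to "body", but may not
    replace core-owned members or "extsrc" itself.
    """
    data_mw = {"name": name, "attrs": dict(attrs), "body": {"extsrc": extsrc}}

    for key, value in (ext_fields or {}).items():
        if key in CORE_KEYS:
            raise ValueError("extension may not override core member: " + key)
        if key == "body":
            if "extsrc" in value:
                raise ValueError("extension may not override body.extsrc")
            data_mw["body"].update(copy.deepcopy(value))
        else:
            data_mw[key] = copy.deepcopy(value)

    # Hypothetical opt-in: the extension declares that it provides an HTML
    # representation, so the duplicated wikitext source can be dropped.
    if provides_html_repn:
        data_mw["body"].pop("extsrc", None)

    return data_mw


# A ref-like extension that supplies its own HTML body and opts out of extsrc.
ref = build_data_mw(
    "ref",
    {"name": "a"},
    "Some ''text''",
    ext_fields={"body": {"html": "Some <i>text</i>"}},
    provides_html_repn=True,
)
```

Under this policy an extension that needs to remove or rewrite a core member, as gallery and ref do today, would be rejected, which is exactly the trade-off the second question asks about. The free-for-all alternative corresponds to dropping the two ValueError checks.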