Page MenuHomePhabricator

wikipage.content should fire after content is appended to the DOM
Closed, ResolvedPublic

Description

In the past we have done this the other way around, for maybe a slight performance increase, however I think this should be changed because:

  • The first time wikipage.content fires (on page load - which must account for 99% of cases) the DOM nodes it fires on are attached. So having it fire on unattached nodes is inconsistent.
  • As a result, users may expect to be able to write JS that can measure DOM elements, leading to hacks such as https://gerrit.wikimedia.org/r/c/mediawiki/extensions/DiscussionTools/+/816194/5/modules/dt.init.js#46
  • The performance difference is likely quite small, and infrequent. wikipage.content firing on updated content happens rarely and at non-critical times, e.g. after editing a page with VE.

Event Timeline

Change 816202 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/core@master] Warn when firing wikipage.content hook on content which isn't attached

https://gerrit.wikimedia.org/r/816202

Change 816200 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/core@master] Live preview: Run wikipage content hook after DOM nodes are attached

https://gerrit.wikimedia.org/r/816200

Discussion from https://gerrit.wikimedia.org/r/816200:

Esanders
This was set up this way in https://gerrit.wikimedia.org/r/c/mediawiki/core/+/101470, maybe as a performance optimisation to avoid a repaint, however we should not be running this hook on unattached DOM nodes as hook listeners may want to measure nodes etc.

Timo Tijhof
What's the use case for a forced sync paint during the hook firing by measuring a node? I'd like to help / offer help potentially solve that a different way than to encourage it by running then post-attach.

Esanders
I think there are a few OOUI widgets/mixins that do DOM node measurement. Maybe <gallery> slideshow mode. VE surfaces certainly require themselves to be attached before initialising. In general there are many more places that listen to this hook than fire it, so doing it this way around is going to lead to subtle bugs with anyone who doesn't test their hook listener with live preview.

Timo Tijhof
My instinct is that if an OOUI widget does that, and indeed I recall perf issues along those lines in the past, that that is essentially a bug in the widget and not something we should accommodate downstream. In general, either the widget is taking a short-cut by computing something in the constructor by re-using a method that is for reacting to changes and it could know the answer statically already, or it is meant to measure something after the initial render but doesn't need the information for the first render (e.g. measure current state and react to it if/when something changes). Either way, forcing the first render to happen mid-render, and then likely multiple such first renders for every widget that suffers from the bug, is going to be quite slow and problematic in the long run.

E.g. if a widget has to draw itself incorrectly once before it adapt, then that seems like something we should either try to avoid or embrace by doing that measurement and adaption from a rAF callback scheduled upon construction or attachment instead of in the ctor. After all, I assume OOUI itself may also be built up in a detached tree normally right? E.g. construct a tree of nodes and then append to the body when done.

OOUI widgets that assume attachment from the start seem like a bug.

If there are known bugs today that we can fix temporarily with this patch, then please reference and I'll disengage and focus on thinking along to see if we can find a long-term fix for those. But if not, then I don't think we should explicitly encourage it by making it possible this way.

Perhaps I'm missing how common this is, a concrete example of something that isn't a one-off like slideshow.js, but something we see more often in similar ways would help.

Alternatively, if I'm misunderstanding that this would likely lead to a pattern of multiple forced paints within a single hook firing, let me know as that's what's my current thinking is based on.

Esanders
Some OOUI widgets that take measurements on init:
TagMultiSelectWidget, which appends an input of a specific width at the end of a line of tags, or wraps to the next line if there isn't enough space.
Toolbars work out if there are too many tools to fit on one line and therefore trigger "narrow" mode
The only widget that handles this properly is MultilineTextInputWidget which uses #installParentChangeDetector to set up a MutationObserver to ensure the measurement happens after attaching.

I suspect many other bits of code are not going to go to this level of effort to handle an edge case they probably don't know about.

The performance gain here is pretty minimal - the vast majority of wikipage.content uses are in first page init which already happens after attach - this live preview is an edge case and the extra repaint it only going to make it as slow as first page init, but in doing so we fix a long tail of current a future hard-to-find bugs.

The main example we are fixing for is in DiscussionTools, which can highlight comments on init using Range.getBoundingClientRect (comments can span arbitrary unbalanced ranges). If we don't fix this we would have to add a setTimeout to that code when we re-initialise.

Timo Tijhof
RE: TagMultiSelectWidget/Toolbars: I suspect there is something I'm missing here. If these truly try to compute stuff in their constructor, then surely they'd be broken in almost all non-hydracting contexts (e.g. VE, modal, demos, etc.) as they'd start detached and thus would have to at the very least use requestAnimationFrame, setTimeout or something else to (re)do it once attached, right?

RE: page view, on initial page view JS runs async well after first paint, so the bulk of the content won't be forced repainted as it is already computed/rendered by then.

I want to be flexible here, but I'd like to understand what we're compromising, and pre my previous question: How is this not going to cause an anti-pattern-by-default of each thing that constructs itself doing a forced render of itself and thus require many paints for a single preview in the long run? And given presumably previews are working fine today, is it not fixible by a simple rAF() callback in the appropriate DT widget that we'd want or need for other reasons already?

I sense I'm likely missing something as I assume your instinct is correct, I'm just not understanding it yet.

Esanders
RE: TagMultiSelectWidget/Toolbars: Yes - they re-take measurements when they are populated/changed. For toolbars this is when #setup is called (loading in the tools), for TagMultiSelectWidget it is when an initial value is set. Widgets are usually designed such that if the initial value is empty then no measurement is required. In these widgets it can be individually documented that these methods are to be called after attaching, however users are unlikely to be checking if the parent they are attaching to is also attached (and if they found the parent wasn't attached, what are they to do?).

Timo Tijhof
Okay, so we've established that the constructors can't assume being attached, and that calculations happen in response to changes in environment (e.g. change events) and any initial calculation is the responsibility of the widget instance owner to call a method from rAF or at some other appropriate moment.

The question, which I indeed did not think of, is when should these owners call it given it's potentially not yet possible to do right away (or suboptimal for perf by causing multiple subsequent forced layouts for each such caller during a hook), and this local owner is not in charge of attaching the parent.

I believe the solution there is that in the case of asynchronous constructions, this parent owner would provide some kind of event, hook or callback as-needed. E.g. when asynchronous construction is truly needed, as may be the case I imagine for ve.ui or ve.ce scenarios where perhaps a particular subtree is not attached until much later. That's complex, but seems unavoidable.

In the case of simple synchronous constructs, such as mw.hook where it's always going to be attached at the end of the current cycle, I believe rAF() or setTimeout will always suffice both to wait until it is attached, but more importantly, to wait for all other to the DOM changes to have happened during that cycle and for them to have already been rendered in bulk naturally by the browser at the "right" time. By being detached at first, it forces one to use rAF, and after that all is well. If not already, we can documnet that it is expected that wikipage_content will be attached by the end of the current callstack/eventloop, and/or more abstractly that you can assume by the next rAF, rIC or setTimeout it will be attached.

I would agree it might be a better prototyping/demo-building experience to gracefully degrade to be attached first. After all, if everybody does the right thing, then making repeat DOM writes is as cheap attached as detached, so long as there are no reads or style calculations. The issue remains with how we detect incorrect usage beyond prototyping and demo-building. It seems preferred to have at least some common scenarios (basically everything except initial page view, so wikitext live preview, VE, etc) where this is naturally found instead of completely relying on e.g. me to regularly audit all callers and find violations which then may or may not require a major refactor to accommodate asynchronously due to it not being designed for.

Question 1:
Would you agree that such widgets must have a method like "setup" exposed to be called after they are attached? And that such method is needed to be called by the instance owner no matter what, whether directly after appending the widget or async via rAF, they have to call it either way. And thus that what we're discussing isn't whether such method needs to be written but when/how the instance owner calls setup()?

If all true - I believe the question that remains, what to do when setup() is called directly instead of via rAF. I believe calling it directly is basically only appropriate if applied to something selected from the document that was server-rendered. In pretty much any other context, it's something that was just created and should wait for the browser to render it first before reading the styles. I'd rather we encourage that with a documented hook like this, which I thought was fairly discoverable and also naturally run up against when forgotten as the calculations will fail. I'm open to other ways of making this discoverable to developers. Some kind of warning against calling setup() on DOM nodes that have only been attached in the same cycle instead of in later event loops and/or via rAF/timeout.

Question 2:
It is my understanding that this is not blocking development of anything. Please clarify if that is incorrect, I can prioritise reviewing this more regularly.

Question 1:

I think we are focusing in too closely on OOUI widgets and whether we can get people to write the most optimal code through better documentation. The real problem here is that 99% of the time wikipage.content fires (i.e. on page load) DOM nodes are already attached. Regardless of how well a downstream user has performance-optimised writing to the DOM, users of wikipage.content should have a consistent experience between first load and subsequent updates.

The performance difference is likely quite small, and infrequent.
[…]
The real problem here is that 99% of the time wikipage.content fires (i.e. on page load) DOM nodes are already attached.

The 99% is what matters indeed, but the impact there is not small or infrequent. The reason to not support style computations during wikipage.content, is precisely to improve performance of the 99% case of page views. The element will become attached eventually either way, so that isn't what we are optimising for. The objective is to encourage consumers of this hook to do fulfil their needs without relying on style or layout information. I have yet to come across situations where this wasn't feasible. So it is less about when, and more about if.

I agree that it may not be worth figuring out how to optimise code, if the code in question only runs in less critical areas (i.e. not on page views). In those cases a setTimeout workaround like we do for OOUI, seems like a reasonable way to quickly get the job done and not invest undue engineering time. This should be rare, with currently 1 or 2 known cases in production to my knowledge. If it is more common, perhaps we can help document and promote the common ways in which layout information can be avoided for initial rendering.

I agree that ideally hooks always enjoy the same context, but I don't think that's feasible beyond the documented scope of the hook itself. We should and ensure that the parameters passed to the hook, and their documented guruantees, are consistently fulfilled. The global state of the rest of the page, mw.config, and the overall DOM is inherently different between page views, wikitext source mode previews, ajax previews, VisualEditor surface, and VisualEditor post-edit. The attached-ness of the content node is only such aspect that we don't guruantee and that does differ. In practice, the single case that attaches it is also the most common one. There are many other aspects of the page DOM and JS global state that are similarly unique to page views and similarly not guruanteed elsewhere: sidebar contents, mw.config keys, categories, lang links, other parts of the mw-parser-output if $content is a partial VE update, etc.

I think maybe where we disagree is whether relying on layout computation should be easy. I think relying on layout computation should be possible and simple (when taken on with a workaround as technical debt), but not elegant or encouraged. That's why I prefer to discuss the specific case where this is needed, so that I can help solve the underlying problem in question.

However, if we believe it is a fair and common requirement, then indeed it doesn't make sense to go in the path I suggest. We could introduce something like wikipage.content.attached and place the burden on code firing the hook to notify consumers once it is attached. This burden may be non-trivial in cases where there is a separated concern between the creation of the content and the attachment. For example, in OOUI there is non-trivial code for detecting when a node is attached precisely because this is a non-trivial burden. If we go that way, we could then also log a warning if the hook is used during the initial page load.

Alternatively, if this is a common enough requirement, we could do as you suggest, and apply this principle to wikipage.content directly as-is (assuming we can fix all current callers in extensions and gadgets to comply with that requirement). Doing so would imho signal that we encourage layout calculations during page load, and/or that they do not play a significant role in page load performance. Otherwise, we'd want some way to disacourage that. And putting side how we discourage it, if the end result is that we generally won't and don't have to, we presumably don't need the requirement.

This feels like a paradox :)

I think maybe where we disagree is whether relying on layout computation should be easy.

It shouldn't be easy to create hard-to-find subtle bugs, that are only triggered by certain triggers of the wikipage.content hook.

While the issues around avoid slow style calculations are valid, I think they are separate to what I am proposing, which is just to ensure some consistency so fewer actual bugs arise.

I agree with this proposal, it just seems pragmatic.

Change 816202 merged by jenkins-bot:

[mediawiki/core@master] Warn when firing wikipage.content hook on content which isn't attached

https://gerrit.wikimedia.org/r/816202

Change 816200 merged by jenkins-bot:

[mediawiki/core@master] Live preview: Run wikipage content hook after DOM nodes are attached

https://gerrit.wikimedia.org/r/816200

matmarex assigned this task to Esanders.

Change 993212 had a related patch set uploaded (by Nardog; author: Nardog):

[mediawiki/core@master] upload.js: Fire wikipage.content with a single element

https://gerrit.wikimedia.org/r/993212

Change 993212 merged by jenkins-bot:

[mediawiki/core@master] upload.js: Fire wikipage.content with a single element

https://gerrit.wikimedia.org/r/993212

I see the warning when using the visual editor…

wikipage.content hook should not be fired on unattached content

By the way, as I stated in this comment https://phabricator.wikimedia.org/T349298#9599357

When using the visual editor, the hook is fired, providing a $content element that is completely wrong:

Capture: https://commons.wikimedia.org/wiki/File:Hook_param_$content_in_visual_editor.png

Code to reproduce:

mw.hook( 'wikipage.content' ).add( $content => console.log( $content[ 0 ] ) );

When using the visual editor, the hook is fired, providing a $content element that is completely wrong:

It’s not wrong: wikipage.content only promises that it’s fired when “wiki content” is added to the DOM; it doesn’t promise that it’ll be the “main” wiki content. Edit notices are also wiki content: for example, they could contain collapsible boxes and other things that need this hook to function properly.

On a related note, when looking at some JavaScript code that is applied to the galleries, I noticed this in mediawiki.page.gallery.js:

$galleries = $content.find( 'ul.mw-gallery-packed-overlay, ul.mw-gallery-packed-hover, ul.mw-gallery-packed' );
// Call the justification asynchronous because live preview fires the hook with detached $content.
setTimeout( () => {
    /* ... */
} );

Instead, we could either:

  • Run some loop looking at the $content[0].isConnected property. Something like "look every N milliseconds, stop after N attempts (if ever something unexpected happens and $content doesn't get inserted into the document)".
  • Remove the setTimeout() and add a code comment telling that $content is guaranteed to be inserted at the time the hook is fired.

If the latter is chosen, we should ensure there is a code comment (I would suggest naturally "T341349") that lets grepping the codebase to find all places where we rely on $content to be inserted into the document.