Page MenuHomePhabricator

"wikipage.content" JS hook: $content should be consistent on live previews
Open, Needs TriagePublic

Description

Boilerplate code:

mw.hook( 'wikipage.content' ).add( $content => console.log( $content ) );

What element $content contains:

  • When viewing a page: #mw-content-text
  • When editing a page - no preview: #mw-content-text
  • When editing a page - during a regular preview: #mw-content-text
  • When editing a page - during a live preview: .mw-content-ltr ⚠️

This should be made consistent. Case in point, a bug varying whether live preview is enabled or not, would be quite unexpected and hard to spot.

Note that on edit pages, #mw-content-text also contains the edit interface, whereas .mw-content-ltr contains only the wiki page content. Though, I think we should unify to #mw-content-text, if we consider that on an edit page, the "page content" is actually the edit interface, which may include a page preview. Also, that's what the regular previews (probably much more widespread) already do.

By the way, there is still no element designed for getting just the wiki page content… So far, I'm mainly using $content / #mw-content-text, but have already encountered several, quite unexpected issues (for example on diff pages), that I solved by adding a .find( '.mw-parser-output' ) (and $( '.mw-parser-output' ) can't be used directly, as .mw-indicators also contains an occurrence of it).

Some refs:

Event Timeline

I don't think Parsoid is related here -- the classes you name here are part of the core OutputPage/ParserOutput mechanism. Parsoid is just matching legacy output.

It seems that the wikipage.content hook is doing exactly the right thing, you're just complaining that you'd rather use document.getElementId('mw-content-text') and have that always work?

I have just encountered an issue in a gadget because of this discrepancy: https://fr.wikipedia.org/w/index.php?title=MediaWiki:Gadget-Accessibility.js&diff=prev&oldid=212995332

(edit: then followed by https://fr.wikipedia.org/w/index.php?title=MediaWiki:Gadget-Accessibility.js&type=revision&diff=212999820&oldid=212995448, as there was another confusion because the hook is run at least once on all pages, even when there is no page content)

By the way, I just noticed that when running mw.hook('wikipage.content').add($content => ...) in the visual editor, the $content element is completely unrelated to the page content, see File:Hook_param_$content_in_visual_editor.png.

The problem is that the topmost known element is config.$previewNode, which is the #wikiPreview element (unless configured somehow else, but I'm not aware of any occurrence of such tweak).

As a reminder, currently the hierarchy is #mw-content-text > .mw-content-ltr.mw-parser-output in regular pages, and #mw-content-text > #wikiPreview > .mw-content-ltr.mw-parser-output in previews.

To provide the #mw-content-text element to the hook's $content parameter, I would suggest to :

  • Do $previewNode.closest( '#mw-content-text' )
  • And if for any reason this element is not found (which shouldn't happen), use the $previewNode element, as a fallback. Using $previewNode (i.e. #wikiPreview) would be less worse than using the current .mw-content-ltr.mw-parser-output, because doing $content.find( '.mw-parser-output' ) with the former would give a result, whereas the latter doesn't give a result, as the queried element is actually the $content element, not a child of it.

@Od1n Is there a functional problem here in terms of what a gadget is able to do or observe?

Consistency is imho not a problem by itself here, since that is very much a matter of perspective. I do not believe that the $content parameter must always hold an element with the same HTML ID. If such outcome were feasible, the hook would not need to exist, or at least would not need to have such a parameter. The reason this parameter exists is precisely so that extensions and gadgets do not have to be aware of (intentional) differences with how, where and why content is presented and rendered in a certain context. Instead, you get the parameter and it will refer to wherever the content was rendered this time.

Other examples where "consistency" is incompatible with the hook being useful:

  • WikiEditor extension, real-time preview split view on the side, while the rest of the edit page is still there above it.
  • MediaWiki core, gallery re-rendering.
  • VisualEditor modal preview of part of a page, while the whole page is also there in the background.
  • VisualEditor, TOC widget.

I would also include here the item you mentioned, which is MediaWiki core's live preview. There #mw-content-text is used, as always, for "consistency", to refer to the OutputPage as generated by the current route handler (whether that be ViewAction, EditAction, HistoryAction, or a SpecialPage). In other words, all output that isn't controlled by the Skin. This is a contract and requirement for various other parts of the system. In case of an EditAction/EditPage, that means only a portion of #mw-content-text is the editable wikitext rendering. And, "for consistency", the wikipage.content thus here give you access to only that portion, just like it always does.

On a related note, for the extensions in charge of rendering this content and firing this hook for you, they also hace the responsiblity of making sure mw-content-ltr and mw-parser-output are always applied to this element (or a parent element) for correct skin styling, TemplateStyles, and gadget support.

However, afaik there is no documentation, or promise, or indeed known use case, for why a consumer of wikipage.content would care what ID or class is set on the $content itself. Can you give a bit more context on what how you are using this hook, and why this poses a problem? I suspect there may be a misunderstanding somewhere, and there's probably a very similar way that would work instead without any of these complexities involved.

There are two issues discussed here, I'll try to answer about them separately.

First, there is the issue that $content parameter in the wikipage.content hook doesn't contain the parsed content, but an upper element, containing the "main" page content (all pages have a main content, including special pages, history pages, and so on). It's fine by itself, but: 1) many codes have been written overlooking this fact, and 2) there is still no selector to just get the parsed content, which is most often what people want (thus, don't be surprised if people use .mw-content-ltr, that's because they think it's their best available option).

I have encountered many cases where #mw-content-text (i.e. $content) is too broad, and I had to restrict to #mw-content-text .mw-parser-output, for instance:

And overall, it just feels really wrong to process elements that are not the parsed content, when the intent is to specifically process the parsed content.

I had thought about it, and I think the best would be to add a new hook, like wikipage.parserContent, whose $parserContent parameter would be the parsed content. Of course, the hook wouldn't be triggered when there is no such content (history pages, etc.)


Now, quickly back to the initial issue of this ticket, the fact that the element is different on live previews. Currently I have to add code specifically because of this issue, and at parts that I would have preferred to not complexify.

function getPageContentElement( $content = null ) {
    if ( $content ) {
        if ( $content.is( '.mw-parser-output' ) ) {
            // $content is already the .mw-parser-output element, [[phab:T349298]]
            return $content;
        } else {
            // it may find no element, in particular if it's
            // an edit page that doesn't have a preview yet
            return $content.find( '.mw-parser-output' );
        }
    } else {
        // it may find no element, in particular if it's
        // an edit page that doesn't have a preview yet
        return $( '#mw-content-text' ).find( '.mw-parser-output' );
    }
}

to this:

function getPageContentElement( $content = null ) {
    if ( !$content ) {
        $content = $( '#mw-content-text' );
    }
    // it may find no element, in particular if it's
    // an edit page that doesn't have a preview yet
    return $content.find( '.mw-parser-output' );
}
let $pageContent;

const $parserOutput = $content.find( '.mw-parser-output' );
if ( $parserOutput.length ) { // the element that has the page content
    $pageContent = $parserOutput;
} else if ( $content.hasClass( 'mw-parser-output' ) ) { // [[phab:T349298]]
    $pageContent = $content;
} else { // no page content (history pages, special pages, etc.)
    return;
}

to this:

const $pageContent = $content.find( '.mw-parser-output' );

if ( !$pageContent.length ) {
    // no page content (history pages, special pages, etc.)
    return;
}

Of course, solving the first issue (for example by adding a hook as I suggested) would be even better, and the above codes could be further simplified, to one-liners, and more future-proof.