Page MenuHomePhabricator

Vue ResourceLoader support: JavaScript in script elements parsed as HTML leading to parsing error
Closed, ResolvedPublic

Description

As https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/597852 demonstrates, simply using something that looks like HTML in your JS causes a JS error.

This can even happen with an inline comment such as:
// '<a>!!!</a>';

This is unexpected.

[0217d5a6ea867720268d76a6] /w/load.php?lang=en-gb&modules=ext.cx.eventlogging.campaigns%7Cext.echo.api%2Cinit%7Cext.popups%2CshortUrl%7Cext.uls.common%2Cinterface%2Cpreferences%2Cwebfonts%7Cjquery%2Coojs%2Csite%7Cjquery.client%2Ccookie%2CtextSelection%7Cjquery.uls.data%7Cmediawiki.ForeignApi%2CString%2CTitle%2CUri%2Capi%2Cbase%2Ccldr%2Ccookie%2CjqueryMsg%2Clanguage%2Cstorage%2Cuser%2Cutil%7Cmediawiki.ForeignApi.core%7Cmediawiki.editfont.styles%7Cmediawiki.libs.pluralruleparser%7Cmediawiki.page.ready%2Cstartup%7Cmediawiki.page.watch.ajax%7Cskins.vector.js%7Cuser.defaults&skin=vector&version=1io71   RuntimeException from line 1337 of /Users/jrobson/git/core/includes/resourceloader/ResourceLoaderFileModule.php: Error parsing file 'X.vue' in module 'mediawiki.page.ready': HTML parse errors:
Unexpected end tag : a
 on line 2
Backtrace:
#0 /Users/jrobson/git/core/includes/resourceloader/ResourceLoaderFileModule.php(345): ResourceLoaderFileModule->getPackageFiles(ResourceLoaderContext)
#1 /Users/jrobson/git/core/includes/resourceloader/ResourceLoaderModule.php(693): ResourceLoaderFileModule->getScript(ResourceLoaderContext)
#2 /Users/jrobson/git/core/includes/resourceloader/ResourceLoaderModule.php(661): ResourceLoaderModule->buildContent(ResourceLoaderContext)
#3 /Users/jrobson/git/core/includes/resourceloader/ResourceLoader.php(1182): ResourceLoaderModule->getModuleContent(ResourceLoaderContext)
#4 /Users/jrobson/git/core/includes/resourceloader/ResourceLoader.php(890): ResourceLoader->makeModuleResponse(ResourceLoaderContext, array, array)
#5 /Users/jrobson/git/core/load.php(46): ResourceLoader->respond(ResourceLoaderContext)
#6 {main}

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Change 598133 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] VueComponentParser: Use RemexHtml instead of PHP's HTML parser

https://gerrit.wikimedia.org/r/598133

Krinkle triaged this task as High priority.Jun 22 2020, 7:55 PM
Krinkle moved this task from Inbox to Confirmed Problem on the MediaWiki-ResourceLoader board.

Change 598133 merged by jenkins-bot:
[mediawiki/core@master] VueComponentParser: Use RemexHtml instead of PHP's HTML parser

https://gerrit.wikimedia.org/r/598133

I don't think this problem was resolved correctly. What looks like HTML in templates is ALSO not really HTML. In particular, the current ResourceLoader does not handle <table>'s correctly when there is an internal component in the table, something like:

<table><tbody> <tr><th>header...</th></tr> <internal-tr-component ...></internal-tr-component> </tbody></table>

The current parsing is pulling out the "internal-tr-component" as a separate element outside of the table. This is wrong - templates should be left alone! I think an XML parser that doesn't understand HTML at all might be best for this?

I don't think this problem was resolved correctly. What looks like HTML in templates is ALSO not really HTML. In particular, the current ResourceLoader does not handle <table>'s correctly when there is an internal component in the table, something like:

<table><tbody> <tr><th>header...</th></tr> <internal-tr-component ...></internal-tr-component> </tbody></table>

The current parsing is pulling out the "internal-tr-component" as a separate element outside of the table. This is wrong - templates should be left alone! I think an XML parser that doesn't understand HTML at all might be best for this?

Unfortunately, PHP's XML parser is basically unmaintained. But you might be able to use the Remex tokenizer and just bypass (or override) the Remex TreeBuilder component, which is responsible for fixups like hoisting non-table content out of tables.