Upgrade ResourceLoader's JS Minifier to support ES6
Description
Details
| Status | Subtype | Assigned | Task | ||
|---|---|---|---|---|---|
| Stalled | None | T333394 Vector 2022 should use details element instead of checkbox hack (IE11) for better accessibility and to fix issues with hover states on language button | |||
| Resolved | Volker_E | T288287 Remove IE11 from Basic support ("Grade C") | |||
| Resolved | TheDJ | T334235 Update VideoJS player to 8+ version | |||
| Resolved | Jdforrester-WMF | T178356 Raise Grade A JavaScript requirement from ES5 (2009) to ES6 (2015) | |||
| Resolved | Catrope | T272882 Upgrade ResourceLoader JS minifier to support ES6 |
Event Timeline
Change 593934 had a related patch set uploaded (by Catrope; owner: Jforrester):
[mediawiki/core@master] [DNM] resourceloader: Replace JavaScriptMinifier with Minify
Change 411387 had a related patch set uploaded (by Catrope; owner: Jforrester):
[mediawiki/core@master] [DNM] Benchmark comparing JavaScriptMinifier and MinifyJS
Benchmark results:
For most inputs, the new minifier (MinifyJS) is faster than the old one (JavaScriptMinifier). For some inputs, it's much much slower (e.g. oojs-ui-core, 362ms vs 2471ms) while producing output of about the same size. For some inputs, it's dramatically faster while producing much larger output, typically close to the original input size. This happens when the input is already minified, and also in some cases where it's not, so maybe there's an "is this already minified" detection heuristic that's off.
I imagine upstream would take our patches if we could work out what was going wrong. :-)
For the OOUI input case, most of the slowness occurs on this line:
// multi-line comments $this->registerPattern('/\n?\/\*(!|.*?@license|.*?@preserve).*?\*\/\n?/s', $callback);
Commenting it out speeds up minification of oojs-ui-core.js from 2372ms to 475ms.
This appears to be a regex that preserves license comments in the output. The regex looks for multiline comments that start with /*! or that contain @license or @preserve anywhere in the comment, and $callback is a function that replaces it with a placeholder. The next line replaces all multiline comments with nothing. I suspect (but haven't had the time to verify yet) that what the slow inputs have in common is that they contain many multiline comments, and that one of the non-greedy match everything (.*?) parts of this regex causes catastrophic backtracking.
I will investigate more tomorrow.
https://github.com/matthiasmullie/minify/commit/8538190f4ab21f77c938e51109547f0e943f7d44 would probably fix the slowness in that regex. Thanks for tracking that down!
Since I'm here, I figured it'd be worth pointing out a few caveats about the minifier, most of which probably won't really be a surprise:
- This grew from a few very tiny regular expression. It is a lot more complex now, but it's essentially still just a bunch of regexes. They're usually faster than any PHP userland code could parse JS syntax, but they're limited in their ability to process code (e.g. finding matching closing brackets in nested structures is something regexes are not great at...)
- Since it's just a bunch of regexes, it does not parse or validate the original or produced code, so it may silently produce invalid results.
- One of the most common ways in which it could silently fail is when the PCRE limits (pcre.backtrack_limit & pcre.recursion_limit - see http://php.net/manual/en/pcre.configuration.php) are configured too low. I've optimized the regexes a bunch and have not recently seen bug reports that indicate it still happens frequently (if at all), but it is technically still possible, given low enough limits & code written in a specific way to trigger large pcre recursion/backtracking.
- A JS-based minifier should be much superior than this (or any other) PHP-based minifier. While I started to build it for exactly the same reasons that apply to MediaWiki (be able to run with minimal fuss & additional infrastructure), it might makes sense to attempt to shell out to a JS compiler for WMF sites (if at all possible), and use a PHP-based solution as fallback, when that infrastructure is not available (for many 3rd party installs)
(I had no idea this was going on - feel free to ping me if there's anything I can help with)
Yes, that's pretty much exactly what I did (with minor modifications, will comment on github).
I'm also about to submit a PR that dramatically speeds up the minifier for large inputs by reducing the amount of string copying. In the initial benchmark, you can see that extensions/CodeEditor/modules/ace/worker-xquery.js takes 66 seconds. Profiling it showed that the majority of the time was spent in substr(). I think this is because it's a large file (3 MB) with lots of things (mostly strings) that are placeholderized. With my patch, that file takes 0.7 seconds.
Full benchmarks with my modifications (which I'll submit to github after I finish eating breakfast):
Digging into this further, @Krinkle and I found that, because MinifyJS uses regexes, it can't successfully tell whether / is a division operator or the start of a regex, even with reasonable (non-contrived) inputs. For example,
let num = (foo + bar) / baz, rx = /.global(ly)? unique(ness?)/; // incorrectly minifies to let num=(foo+bar)/ baz, rx = /.global(ly)?unique(ness?)/
Also, template strings can be nested (admittedly this is a more niche use case), and that also results in incorrect minification:
console.log( `foo${` hello + there `}bar` );
// incorrectly minifies to
console.log(`foo${` hello+there `}bar`);I believe both of these issues can't be solved without moving away from regexes in favor of a state machine. Since JavaScriptMinifier uses a state machine, I decided to work on adding ES6 support to it.
Change 664700 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/libs/Minify@master] Add ES6 support
Change 664700 merged by jenkins-bot:
[mediawiki/libs/Minify@master] JavaScriptMinifier: Add ES6 support
Change 670306 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/vendor@master] Update wikimedia/minify to v2.2.0
Change 670307 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Update wikimedia/minify to v2.2.0
Change 670306 merged by jenkins-bot:
[mediawiki/vendor@master] Update wikimedia/minify to v2.2.0
Change 670307 merged by jenkins-bot:
[mediawiki/core@master] Update wikimedia/minify to v2.2.0
I think there’s some subtle bug in the new minifier that’s preventing WikibaseLexeme JavaScript from loading correctly – there’s no error in the console or anything, but parts of the UI don’t initialize with the new minifier, except when the page is loaded in debug mode (which I assume bypasses the minifier). See T277061: Many WikibaseLexeme tests suddenly failing in quibble-vendor-mysql-php72-{selenium,noselenium}-docker.
Change 593934 abandoned by Jforrester:
[mediawiki/core@master] [DNM] resourceloader: Replace JavaScriptMinifier with Minify
Reason:
We built our own, in the end.
Change 411387 abandoned by Jforrester:
[mediawiki/core@master] [DNM] Benchmark comparing JavaScriptMinifier and MinifyJS
Reason:
We built our own, in the end.
Change 672487 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Update wikimedia/minify to 2.2.1
Change 672487 merged by jenkins-bot:
[mediawiki/core@master] Update wikimedia/minify to 2.2.1