
defer javascript instead of async for better performance
Closed, Declined · Public · Feature

Description

MediaWiki is currently using

<script async="" src="/w/load.php?lang=nl&amp;modules=startup&amp;only=scripts&amp;raw=1"></script>

Looking at w3schools that seems to be invalid syntax. Everywhere where I've seen the async keyword so far it's just been a simple async keyword. No = and no ="". Could that be a bug? Dunno if a separate ticket for that is worth it. What I was actually wondering about...

Could you please consider using the defer keyword rather than the async if that is better for performance. Why? While async sounds nicer, it keeps the browser more busy with parsing. With defer the JavaScript will only be executed after the HTML parsing finished. The advantage of defer is to reduce blocking.

I haven't seen this discussed before here.

Some articles I was reading on the subject and found informative:

Specifically useful I found these images:

My suggestion is inspired by PageSpeed insights.

The Critical Request Chains below show you what resources are loaded with a high priority. Consider reducing the length of chains, reducing the download size of resources, or deferring the download of unnecessary resources to improve page load.

And /w/load.php?lang=nl&amp;modules=startup&amp;only=scripts&amp;raw=1 is the initiator of another load.php JavaScript request. So it's a chain of 3: HTML document -> load.php 1 -> load.php 2.

Similar to T107399.

Edit:

To add further, potentially more authoritative sources, see this video by Google from minute 12:38 on the topic of defer, quote:

Scripts that use async may interrupt the browser from rendering other parts of the DOM.

Defer is more polite. It will also tell the browser that it can download and process all the stuff on the page, but it won't interrupt the browser to be executed. It will run after the page has been fully parsed. Defer is a great idea for anything that isn't critical for our initial viewport, things like libraries, video ...

Event Timeline

<script async="" src="/w/load.php?lang=nl&amp;modules=startup&amp;only=scripts&amp;raw=1"></script>

Looking at w3schools that seems to be invalid syntax.

It is fully valid HTML syntax, both under HTML5 and under earlier versions. When browsers download and parse the stream of HTML text from a web server, the text is parsed into elements and attributes in accordance with the HTML specification. The order of attributes, the white space, the use of quotes (single quotes, double quotes, no quotes), optional empty values on booleans, escaping, etc. do not create a different outcome for the elements at runtime.

In the same way that e.g. x = [ 'foo', 'bar' ] is the same as x=["foo" ,"bar"], so too are there many ways to serialize a given element. For example:

// Parse one element whose attributes are serialized in several equivalent styles.
var div = document.createElement('div');
div.innerHTML = `<script one two="" a=fo"o b='fo"o' c="fo&quot;o">`;
var script = div.firstChild;
script.getAttribute('a'); // fo"o
script.getAttribute('b'); // fo"o
script.getAttribute('c'); // fo"o
script.getAttribute('one'); // ""
script.getAttribute('two'); // ""

MediaWiki core Html.php generator prefers to always include attribute values and standardises on double quoted attribute values. This offers a simpler and faster backend implementation, while also enjoying maximum compatibility and portability such as with old browsers and XML-based web scrapers that would fail to parse HTML5-specific void tags or boolean attributes.
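As a rough illustration of that serialization choice, here is a minimal sketch in JavaScript. It is not the actual Html.php logic; `escapeAttributeValue` and `serializeElement` are hypothetical names made up for this example. It only shows how always emitting a double-quoted value turns a boolean attribute into `async=""`.

```javascript
// Hypothetical helpers for illustration only; not MediaWiki's Html.php.
function escapeAttributeValue(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

function serializeElement(tagName, attributes) {
  const parts = Object.entries(attributes)
    // A boolean attribute set to false (or a missing value) is omitted entirely.
    .filter(([, value]) => value !== false && value !== null && value !== undefined)
    .map(([name, value]) => {
      // A boolean attribute set to true is written with an explicit empty value.
      const text = value === true ? '' : escapeAttributeValue(value);
      return `${name}="${text}"`;
    });
  return `<${tagName}${parts.length ? ' ' + parts.join(' ') : ''}>`;
}

const html = serializeElement('script', {
  async: true,
  defer: false,
  src: '/w/load.php?lang=nl&modules=startup&only=scripts&raw=1',
});
console.log(html);
```

With one code path for every attribute, the generator never needs to know which attributes are boolean in HTML5, and the output stays parseable by XML-based tools.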

See also https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes#boolean_attributes.

Could you please consider using the defer keyword rather than the async if that is better for performance. Why? While async sounds nicer, it keeps the browser more busy with parsing. With defer the JavaScript will only be executed after the HTML parsing finished. The advantage of defer is to reduce blocking.

In a static web page, it is indeed common and generally best practice (at least to start with) to start <script> tags as early as possible, and to use defer to finish them as late as possible. This optimises network and CPU resources accordingly. For the network, we want the browser to discover and know about all resources as early as possible so that it has full knowledge of what we need, and then it can decide in what order and with what priority to download each URL, never wasting a second with a network that is idle. Likewise, we also want to keep the CPU busy with what matters most, and rendering the styles and HTML is generally more important than executing JavaScript.

I say "generally more important than executing JavaScript", because there are exceptions. If your JavaScript is essential to the visual rendering, then you want it to run synchronously with as little delay as possible. I see you already found T107399, but I'll summarise here as well. It used to be that MediaWiki executed its JavaScript synchronously early in the <head>. This was good, because the JavaScript made visible changes to the page. If we delayed the JavaScript, it would make the page experience worse because the "above the fold" page would look unfinished and we would waste time invisibly rendering large articles "below the fold" before starting the JavaScript to finish the visualisation. Our exception here made sense, but it wasn't the best. It is even better if the JavaScript is not needed, and so as part of T107399, we did exactly that. We worked on our backend to render as much as possible server-side in PHP, and use CSS better, so that the JavaScript is no longer essential. After that, we could delay all JavaScript execution and this improved performance metrics such as Page Load Time by a lot. The late arrival of JavaScript was no longer a problem. However, we chose to move from sync to async and not defer. I'll explain why below.

I also said "generally best practice" to use defer, and likewise there are exceptions here too. If the browser has knowledge of all linked resources from the start, then defer is probably best. As said above, this way the browser can make optimal use of the network bandwidth, never wasting time, and also optimal use of CPU for rendering. For a simple static HTML site, where perhaps you have 10 <script> tags, it is probably best for fast rendering to set defer on all of them. But Wikipedia is quite a bit larger than that. We have a number of unique requirements we want to achieve, such as:

  • CDN effectiveness: The CDN should serve as many pageviews from cache as possible. We only invalidate a cached page if its text is modified or if it expires after 7 days.
  • Deployment speed and stability: When you browse the site, it should appear and work the same throughout. E.g. not some pages with yesterday's stylesheet. We also want deployments (and software rollback) to quickly go live within ~ 5min.

You can read more about this on RL/Architecture, but in a nutshell: we adopt a two-stage rocket for our JavaScript. There is the startup module that you see in <script async> on all pages. When this arrives, it has the versions/URLs of all other scripts. We then filter these through a localStorage-based optimisation, and only make requests for scripts that are needed on this page and that are not in our cache. We use <script async> because it means: do it in the background, but execute it as soon as it is ready. We don't want to wait several seconds until the full article below the fold is also ready. The browser progress bar, window.onload, PLT, etc. are based on the last subresource chain settling down. If our indirect script requests start later, the page will take longer to complete, because new requests are starting that were not originally known to the browser.
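To make the second stage more concrete, here is a minimal sketch of the filtering step, under simplifying assumptions. The function `planRequests`, the module names, the version strings, and the `name@version` cache key are all hypothetical, and a `Map` stands in for localStorage; the real startup module is more involved.

```javascript
// Hypothetical data shapes for illustration; not the real startup module.
function planRequests(registry, neededOnPage, cache) {
  const toRequest = [];
  const fromCache = [];
  for (const name of neededOnPage) {
    const version = registry[name];
    if (version === undefined) {
      continue; // Unknown module: skipped in this sketch.
    }
    // Cache entries are keyed by name and version, so a new deployment
    // changes the key and a stale copy is simply never matched.
    const key = `${name}@${version}`;
    if (cache.has(key)) {
      fromCache.push(name);
    } else {
      toRequest.push(name);
    }
  }
  return { toRequest, fromCache };
}

// The registry arrives with the startup script: current versions of all modules.
const registry = { 'site.base': 'a1', 'ui.gallery': 'b2', 'ui.editor': 'c3' };
// Stand-in for the localStorage cache: one current entry, one stale entry.
const cache = new Map([
  ['site.base@a1', '/* cached code */'],
  ['ui.gallery@old', '/* stale code */'],
]);

const plan = planRequests(registry, ['site.base', 'ui.gallery'], cache);
console.log(plan);
```

In this sketch only `ui.gallery` needs a network request, and that request cannot begin until the startup script has executed, which is why the timing of that first script matters.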

It has been a while since we last simulated the difference, but I believe this is generally more optimal as-is. An important distinction here is that this "startup" module is very small. It does not perform any slow computations, so its interruption during render is short and only has as its purpose to start other script requests.

How about a config option for third-party wikis that don't have extremely long articles and may therefore benefit from defer over async?

@Sophivorus I'm curious about what benefit you expect in this case. Can you describe (or e.g. draw on a devtools network screenshot) what activities you expect would move or shrink in the timeline?

My understanding is that using defer cannot benefit you in short articles. async is by definition equal or faster. I would agree that given a very short HTML document, or a fast download speed, the two can become indistinguishable, but I would need more details to understand how defer can actually be faster.

There exists, unfortunately, significant misinformation on the Internet in this area due to well-intended oversimplifications in the form of performance recommendations. They are so simplified that the people writing those articles forget when the advice does and doesn't apply, and why, as well as the space-time continuum: work cannot complete before it starts. Introducing an artificial delay where part of the browser is forced to stay idle (as defer does) generally makes for worse CPU efficiency and throughput, and thus a longer overall time to complete the page load.

If you defer the "startup" module, then the rest of the JavaScript chain would not even start until the page has finished loading, thus the browser would not discover or request the second JS bundle, and not receive the second JS bundle response until later. Thus the page will take longer overall to finish loading, because the network component of the browser was simply standing idle for no reason while the HTML parser is busy, instead of doing something useful at the same time in the background.
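The argument above can be put into a toy timeline model. This is a deliberately simplified sketch, not a measurement: the function `simulate` and all numbers (in milliseconds) are made up, script execution time is treated as zero, and bandwidth contention is ignored.

```javascript
// Toy model: the startup script is discovered at t = 0 in both modes and
// downloads in the background. Only the moment it may execute differs.
function simulate(mode, { parseEnd, startupDownload, bundleDownload }) {
  const startupArrives = startupDownload;
  // async: run as soon as the script arrives.
  // defer: run no earlier than the end of HTML parsing.
  const startupRuns = mode === 'defer'
    ? Math.max(startupArrives, parseEnd)
    : startupArrives;
  // The second bundle is only requested once the startup script has run.
  const bundleArrives = startupRuns + bundleDownload;
  // The page is complete when both parsing and the last request are done.
  const pageComplete = Math.max(parseEnd, bundleArrives);
  return { startupRuns, bundleArrives, pageComplete };
}

// A long article: parsing takes much longer than the startup download.
const longArticle = { parseEnd: 800, startupDownload: 100, bundleDownload: 300 };
const asyncResult = simulate('async', longArticle);
const deferResult = simulate('defer', longArticle);
console.log(asyncResult.pageComplete, deferResult.pageComplete); // 800 1100

// A very short article: parsing ends before the startup script arrives.
const shortArticle = { parseEnd: 50, startupDownload: 100, bundleDownload: 300 };
const asyncShort = simulate('async', shortArticle);
const deferShort = simulate('defer', shortArticle);
console.log(asyncShort.pageComplete, deferShort.pageComplete); // 400 400
```

Within this model, defer can only move the second request later or leave it where it was, so the async result is never slower, and for short documents the two converge.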

I can only evaluate what I know. The two main metrics we tend to measure are visual rendering speed (i.e. how soon content is visually complete on the screen, as approximated by the First Contentful Paint metric), and Page Load Time (i.e. window.onload, loadEventEnd, as indicated by the progress indicator in the browser). Perhaps you're thinking about a different kind of benefit, or a different metric?
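For anyone wanting to compare these two metrics on their own wiki, here is a minimal sketch of reading them from the browser's Performance API. The helper `summarizeTimings` and the stand-in object `fakePerformance` (with made-up numbers) are hypothetical; in a browser you would pass the global `performance` object after the load event has finished.

```javascript
// Reads First Contentful Paint and Page Load Time from a Performance-like
// object. Values are in milliseconds since navigation start.
function summarizeTimings(perf) {
  const [navigation] = perf.getEntriesByType('navigation');
  const fcp = perf
    .getEntriesByType('paint')
    .find((entry) => entry.name === 'first-contentful-paint');
  return {
    firstContentfulPaint: fcp ? fcp.startTime : null,
    // loadEventEnd stays 0 until the load event has completed.
    pageLoadTime: navigation && navigation.loadEventEnd > 0
      ? navigation.loadEventEnd
      : null,
  };
}

// Stand-in with made-up values so the sketch also runs outside a browser.
const fakePerformance = {
  getEntriesByType(type) {
    const entries = {
      navigation: [{ loadEventEnd: 1240 }],
      paint: [
        { name: 'first-paint', startTime: 310 },
        { name: 'first-contentful-paint', startTime: 350 },
      ],
    };
    return entries[type] || [];
  },
};

const summary = summarizeTimings(fakePerformance);
console.log(summary);
```

Running this once with async and once with defer on the same page, over several loads, would give a like-for-like comparison on both metrics.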

@Krinkle Hi! Honestly, I don't remember what reasoning or tests led me to write that comment. At the time I was trying to optimize the performance of appropedia.org and was deep into the docs about async, defer, etc. but now I can't recall the nuances, about which my beliefs might very well have been mistaken. In any case, I was able to improve the performance significantly without using defer, so that's that. Sorry if I can't help, but thanks for asking!