All code is built
Build what you can before you ship

HEADER CAPTION: The head of the Statue of Liberty on exhibit at the Paris World's Fair, 1878. The statue was built in France ahead of time, shipped overseas in crates, and then assembled in New York. Image by Albert Fernique / public domain.

The process of mapping human-readable source code inputs to optimized, machine-readable outputs is called compiling or, more generally, building. It's been a necessary part of software development since programmers moved beyond writing machine code by hand. Even to serve the most abstract, high-level languages such as HTML and CSS, this build process is essential.

Just-in-time build steps

We build code all the time at Wikimedia. Every page request benefits from Less compilation, CSS and JavaScript minification, internationalization, URL mapping, and bundling build steps. All of this occurs at runtime through the ResourceLoader pipeline.
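
To make the idea concrete, here is a minimal sketch of a just-in-time pipeline, written in Python for brevity. It is not ResourceLoader's implementation. The names `localize`, `minify`, `serve`, the `MESSAGES` table, and the `{{msg:key}}` placeholder syntax are all hypothetical and exist only for illustration. The point is that every step runs inside the request, and that some steps depend on parameters known only at request time.

```python
import re
from typing import Callable, Dict, List

# A build step maps source text plus per-request parameters to new text.
BuildStep = Callable[[str, Dict[str, str]], str]

# Hypothetical message table; a real system would load translations elsewhere.
MESSAGES = {
    "en": {"greeting": "Hello"},
    "fr": {"greeting": "Bonjour"},
}


def localize(source: str, params: Dict[str, str]) -> str:
    """Replace {{msg:key}} placeholders using the request's language."""
    table = MESSAGES[params["lang"]]
    return re.sub(r"\{\{msg:(\w+)\}\}", lambda m: table[m.group(1)], source)


def minify(source: str, params: Dict[str, str]) -> str:
    """Collapse whitespace runs; a toy stand-in for real minification."""
    return re.sub(r"\s+", " ", source).strip()


def serve(sources: List[str], steps: List[BuildStep], params: Dict[str, str]) -> str:
    """Run every step on every source during the request, then bundle."""
    built = []
    for source in sources:
        for step in steps:
            source = step(source, params)
        built.append(source)
    return "\n".join(built)
```

Calling `serve(["alert(\n    '{{msg:greeting}}'\n);"], [localize, minify], {"lang": "fr"})` yields `alert( 'Bonjour' );`. Each extra step in `steps` adds work to every uncached request, which is why the budget for each step is so tight.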

ResourceLoader's just-in-time build process is critical when key parameters vary on request. However, it has some notable limitations including:

  • Every just-in-time build step must be extremely performant, so fast that it can run on-the-fly, or our pages will load slowly. Additionally, sequential steps cannot be appended ad infinitum.
  • Effectively, ResourceLoader's just-in-time build steps can only use tools written in PHP. JavaScript execution is not possible.
  • Just-in-time build steps are less secure. They execute on production servers and serve content directly to the user. This eliminates the separation between development and runtime-only dependency trees, which can dramatically increase the attack surface, sometimes by orders of magnitude. Additionally, build outputs are shipped directly to the user without any opportunity for security review. When it comes to security, a just-in-time build step always strives to be as secure as an ahead-of-time build step that produces static outputs.
  • Just-in-time build steps are custom and complex. An ahead-of-time build step can easily be a one-liner that invokes standard tooling, but the equivalent just-in-time build step, if one exists, is just as likely to be hundreds of lines of custom code. Historically, these custom steps have suffered from a low bus factor and received little attention beyond basic life support. Few engineers possess the abilities to write code of the caliber needed to add new build steps or change existing ones, which means the rest of Wikipedia and WMDE is blocked on their evolution. For example, we have been unable to keep pace with fundamental features like source map support (a formal request since 2013) or ES6 transpilation. In fact, there are laundry lists of missing features now standard elsewhere. The lack of standard functionality means that developing any code at Wikimedia is a completely different and far slower experience than the rest of the industry.
  • Just-in-time build step outputs have worse caching. The most advanced build step executed at runtime endeavors to have the same caching that comes out-of-the-box with an ahead-of-time build step: a plain file on disk.
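
The caching point can be illustrated with a small sketch. It assumes a hypothetical `expensive_build` transform (here just uppercasing, standing in for something slow like transpilation) and is not how ResourceLoader caches. The just-in-time server has to key, store, and invalidate its own cache at runtime. The ahead-of-time path writes a plain file once, and serving is only a read.

```python
import hashlib
from pathlib import Path


def expensive_build(source: str) -> str:
    """Hypothetical stand-in for a slow transform such as transpilation."""
    return source.upper()


class JustInTimeServer:
    """Builds on demand and keeps its own cache keyed by a content hash."""

    def __init__(self) -> None:
        self._cache = {}
        self.builds = 0  # counts how often the slow path actually ran

    def serve(self, source: str) -> str:
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._cache:
            # Cache miss: the user waits for the build during the request.
            self.builds += 1
            self._cache[key] = expensive_build(source)
        return self._cache[key]


def build_ahead_of_time(source: str, out_path: Path) -> None:
    """Run once before deployment; the output is a plain file on disk."""
    out_path.write_text(expensive_build(source), encoding="utf-8")


def serve_static(out_path: Path) -> str:
    """Serving a prebuilt artifact is a file read with no build logic."""
    return out_path.read_text(encoding="utf-8")
```

In this sketch the first request to `JustInTimeServer.serve` pays for the build and later requests hit the in-memory cache, whereas `serve_static` never builds anything because `build_ahead_of_time` already ran.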

An old photograph of the Ford assembly line.
CAPTION: No step in the pipeline can be delayed, and the longer the pipeline, the longer it takes to go from a nut to a new car. Image by unknown author / public domain.

Solving problems too big for just-in-time

Some problems are only solvable by just-in-time build steps. However, many solutions cannot meet the constraints of just-in-time build steps, so only a subset of all problems can be solved. This is a more general limitation of just-in-time build steps, not the ResourceLoader implementation. In practice, this means that developers cannot add a build step to the pipeline and are left with their problem unsolved.

There must be an alternative. Our options include:

  1. Double down on building new features in ResourceLoader. This approach fails to address the fundamental limitations of all just-in-time build steps and may require reimplementing existing open-source solutions.
  2. Ship extra tooling to every user's browser and let them process it. Besides significantly increased bandwidth and computation costs that go against our mission to serve everyone, this isn't very eco-friendly, fails to solve many problems, leads to the laggy browsing experiences users so loathe on JavaScript-heavy pages, and doesn't scale far past polyfills.
  3. Replace ResourceLoader with industry standard tooling that has fewer constraints. This will require exploration, be expensive, and may have the same outcome as #1.
  4. Enhance ResourceLoader by building what we can ahead-of-time.

The first two options don't work. The third option doesn't sound like a good first choice. The fourth is the most conventional and proven solution.

Ahead-of-time build steps

Ahead-of-time build steps are usually what people think of when they refer to "building code." Most build problems that remain to be solved in Wikimedia only fit in the ahead-of-time space. As you might expect, we're using these enhancements all over the place already and can't live without them. Some examples include:

  • OOUI: Portions of this library are built with Grunt and a suite of packages from NPM for minification, uglification, and additional processing. The results are dozens of build products that are file-copied into Core manually.
  • Page Previews: This gem of a codebase is fully compiled from the latest JavaScript with Webpack. It serves about two billion virtual pageviews a month.
  • Wikibase: Wikibase uses ahead-of-time build tools, including Webpack, TypeScript, and a plethora of other standard tools, to serve the Wikidata communities.
  • MultimediaViewer: Commits to MultimediaViewer use ahead-of-time build steps to replace any human-readable source SVGs with optimized, machine-readable outputs.
  • MediaWiki: Core uses a build step on every deployment. The process is called "a full scap." When the process fails, it's called "a full scapadapadoo."
  • MobileFrontend: All JavaScript in MobileFrontend, the heart of the mobile site, is built by Webpack. That's over 50% of all pageviews benefiting from an ahead-of-time build step using industry standard tooling.
  • Wikipedia for KaiOS: This Webpack-powered project uses a build step to serve a highly performant web app.
  • ContentTranslation: The glittering new ContentTranslation app uses the Vue CLI and standard tooling to generate the next-generation interfaces essential to serving contributors around the world. Put plainly, this is the kind of modern experience that would be impossible to build without modern tooling that leverages ahead-of-time build steps.
  • Wikipedia.org: Portals uses a build step to synchronize sister project statistics. I know someone who has a recurring task each week reminding him "it's build time." Although triggering the build step is person-powered, the outputs are what you would expect of an ahead-of-time build step: practical and project specific.
  • VisualEditor: VE is a sophisticated application that requires a build step. I don't know what this does exactly but I would guess it's solving the same kinds of problems everyone else has ahead-of-time.
  • And many more.

These ahead-of-time build steps are everywhere in Gruntfiles, Gulpfiles, Webpack configs, NPM package.json files, and shell scripts. Even if the Foundation mandated it today, we could never get rid of them.

Evolving the ResourceLoader pipeline with a new stage

Ahead-of-time build steps are the only solution for many problems, so it's fortunate they have such a proven track record of success both within and beyond the MediaWiki ecosystem. As everyone who is already using ahead-of-time build steps has discovered, they're the perfect complement to ResourceLoader's just-in-time build steps.

However, this is a problem at scale and it needs to be solved at scale. Informal developer builds work surprisingly well but aren't as efficient for developers as they could be. We need to extend the pipeline to include a pre-ResourceLoader stage. This stage is an ahead-of-time build step.
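
One way to picture the extended pipeline is the sketch below. It is an assumption-laden illustration, not a proposed design: the function names, the per-project registry, and the toy steps `strip_debug_lines` and `wrap_for_language` are all hypothetical. The ahead-of-time stage runs once before deployment (for example, in CI) with whatever build each project chooses, and the runtime stage then applies only the per-request steps to the prebuilt artifacts.

```python
from typing import Callable, Dict, List

AheadOfTimeBuild = Callable[[str], str]
RuntimeStep = Callable[[str, Dict[str, str]], str]


def run_ahead_of_time_stage(
    sources: Dict[str, str],
    builds: Dict[str, AheadOfTimeBuild],
) -> Dict[str, str]:
    """Run once before deployment. Each project supplies its own build;
    projects without a registered build pass through unchanged."""
    return {
        name: builds.get(name, lambda s: s)(source)
        for name, source in sources.items()
    }


def run_runtime_stage(
    artifacts: Dict[str, str],
    module: str,
    steps: List[RuntimeStep],
    params: Dict[str, str],
) -> str:
    """Run per request. Only steps that depend on request parameters remain."""
    output = artifacts[module]
    for step in steps:
        output = step(output, params)
    return output


def strip_debug_lines(source: str) -> str:
    """Toy project-specific build: drop lines that start with debug(."""
    kept = [
        line for line in source.splitlines()
        if not line.strip().startswith("debug(")
    ]
    return "\n".join(kept)


def wrap_for_language(source: str, params: Dict[str, str]) -> str:
    """Toy runtime step that varies with the request's language."""
    return f"/* lang={params['lang']} */\n{source}"
```

Here a module named `popups` could register `strip_debug_lines` while a module with no build step is served from source as before, so adopting the new stage stays optional per project.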

Photograph of the International Space Station in Earth's orbit.
CAPTION: The International Space Station was built on Earth in modules that were optimized for assembly and constructed in orbit. Similarly, ResourceLoader modules can be built before deployment and finally assembled in the user's browser. Image by NASA/Crew of STS-132 / public domain.

In conclusion:

  • ResourceLoader provides useful just-in-time build steps.
  • Many projects have requirements that cannot be solved at runtime. These real problems are only solvable by traditional ahead-of-time build steps.
  • Just-in-time and ahead-of-time build steps are already used by everyone and benefit everyone, and we can't change that.
  • Ahead-of-time build steps often use standard tools but are highly project specific. These should not be centralized nor should they be constrained by artificial limitations. Per-project solution autonomy must be preserved.
  • Adding a pre-ResourceLoader stage can integrate neatly with the current ResourceLoader system by extending the pipeline to include these existing ahead-of-time workflows.

Above all, a build step means freedom. The freedom to succeed and the freedom to use the tool that's right for the job, not the rare tool that fits into a runtime-only pipeline.

Thanks to Jan Drewniak, Santhosh Thottingal, Daniel Cipoletti, Joe Walsh, Bernd Sitzmann, and Mónica Pinedo Bajo for reviewing and providing detailed feedback.

This post is also available on the Wikimedia Tech Blog.

Written by Niedzielski on Jul 28 2020, 1:27 PM.

Event Timeline

Some clarifications:

VE is a sophisticated application that requires a build step.

VE does not require a build step. The build step you link to is for standalone demos, but VE in MediaWiki only uses ResourceLoader.

OOUI: Portions of this library are built with Grunt and a suite of packages from NPM for minification, uglification, and additional processing. The results are dozens of build products that are file-copied into Core manually.

Again this is only for non-MW users. The uglified output is not used by MW and OOUI could also be served from source via ResourceLoader (but the build output already exists so it is used).

My thoughts:

Many projects have requirements that cannot be solved at runtime. These real problems are only solvable by traditional ahead-of-time build steps.

The number of extensions that genuinely can't work right now without an ahead-of-time build step is actually very small. Given our limited resources, the question is whether we should invest in a new build pipeline that is secure and compatible with the MW ecosystem, or in incremental improvements to RL to meet specific requirements. I think we need to be open to both of these options.

Thank you, @Esanders.

VE does not require a build step. The build step you link to is for standalone demos, but VE in MediaWiki only uses ResourceLoader.

Thanks for pointing this out. It sounds like there are more uses for a build step than just production outputs!

Again this is only for non-MW users. The uglified output is not used by MW and OOUI could also be served from source via ResourceLoader (but the build output already exists so it is used).

Ooh, I didn't realize that. It's very interesting that OOUI has opted into a build step.

The number of extensions that genuinely can't work right now without an ahead-of-time build step is actually very small.

Makes sense. The grade C browser experience, for example, doesn't need any of these steps.

Given our limited resources, the question is whether we should invest in a new build pipeline that is secure and compatible with the MW ecosystem, or in incremental improvements to RL to meet specific requirements. I think we need to be open to both of these options.

The runtime portion of the pipeline already exists and so do the ahead-of-time build steps. Upon reflection, I actually see moving the build step into CI, instead of running it locally, as the "incremental improvement to ResourceLoader" you mentioned.