
RFC: Add a frontend build step to skins/extensions to our deploy process
Open, Needs Triage, Public

Tokens
"Like" token, awarded by WMDE-leszek."100" token, awarded by nray."Love" token, awarded by Niedzielski."Like" token, awarded by Jhernandez."Yellow Medal" token, awarded by pmiazga.
Assigned To
None
Authored By
Jdlrobson, Jul 6 2018

Description

Related: T107561

Problem

Currently, when deploying MediaWiki code at WMF, Scap creates a single directory with the contents of MediaWiki core and all installed skins/extensions, stripped of all Git metadata, and blindly rsyncs it to the application servers.

There is currently no process in place between a commit being approved in Gerrit and it going live on WMF application servers that would allow an extension or skin to build derivative files or inject additional files from third-party sources (e.g. npm packages).

The absence of such a build step means skins and extensions must work around it by ensuring all code needed in production is committed to the Git repository.

Many tasks developers perform are monotonous and can be automated. However, these automation tasks currently cannot happen as part of the deployment process. Instead, they are run locally or by bots, with their output always being committed to the same repository. This can be confusing to contributors, prone to human error, and demoralising for developers, as it regularly leads to merge conflicts (sometimes on every single commit).

As long as we do not have a build step in our deploy process, we are:

  • discouraged from making use of existing tooling, without modification, that could lead to potential optimisations in our code but would change the resulting code on every commit
    • e.g. better minification, dead code elimination, Node.js templating libraries, transpilers, SVG minification (see workarounds)
  • exposed to avoidable errors at all times (see examples below)
  • unable to move fast:
    • forced to commit built assets to master and experience merge conflicts with every patchset
    • forced to run scripts manually that are slow and could be delayed until deploy time (e.g. minification of image assets)
  • wasting time reinventing the wheel in PHP/build processes (see ResourceLoaderImageModule, Discovery's Gruntfile tasks, and deploy-bot under workarounds for examples)

Background

Ten years ago, build steps were rare, particularly in frontend development. Frontend development involved managing a small number of assets that provided modest enhancements to a page. Over time, JavaScript has become a more essential part of websites, and MediaWiki itself makes heavy use of JavaScript.

The most common use case for a build step ten years ago was concatenating files, one of the requirements the original ResourceLoader was built for and handles very well. However, as time has passed, build steps have come to be used for module loading (see http://requirejs.org/) and, more recently, performance optimisations.

Build processes are now commonly used for all sorts of tasks, such as optimising/minifying SVG assets, handling support for older browsers, making use of foreign dependencies via a package management system, dead code elimination, and minification. [1]

Every frontend engineer is expected to understand and make use of build processes; in fact, it is considered a red flag if one doesn’t.

Frontend development inside MediaWiki proves far more difficult and can be a frustrating affair. It typically takes a well-experienced frontend developer around three months to get up and running in our ecosystem, given the unfamiliarity of our tooling and code choices.

Since the introduction of Composer in MediaWiki, our code has felt much more manageable and shareable with the outside open-source world. Many build steps have been rendered unnecessary (e.g. in Wikidata), but no frontend equivalent exists.

Since we generalised our CI to run npm test, we have empowered developers to have more flexibility in what and how we test.

Similarly, by adding a job to deploys and giving developers the power to control the way code gets deployed, I believe we will see similar innovation.

[1] Although ResourceLoader provides minification, JavaScript implementations of minification tend to result in greater byte savings than our PHP implementation.

Why not having this is a problem

The lack of support for a build step in MediaWiki has led to many imperfect workarounds and clearly broken experiences across our codebase that are in dire need of fixing.

Some examples:

  • The Popups extension makes use of webpack to build and test complex JavaScript code. It has a test script that forces the developer to run a build step locally and commit the output. This leads to merge conflicts on every single patch in the repository, causing engineers to make tedious and mechanical corrections.
    • Despite this pain point, the build process was considered one of the main reasons we successfully shipped this software, as it gave us space to move confidently and quickly.
    • The MobileFrontend extension and Minerva skin (which power our mobile site) will soon follow the example of Popups and hit the exact same problems.
    • Challenges with Popups are documented in this phabricator task: https://phabricator.wikimedia.org/T158980
  • The Kartographer extension maintains a local copy of Leaflet, which was incorrectly edited by engineers unfamiliar with the code layout and then overwritten when source files were rebuilt via the build script. This has recently been corrected via a Grunt test script run on every commit.
  • Flow makes use of a build script to build server-side templates. Developers modifying Handlebars templates must commit the compiled templates. It is common for these not to be committed and for invalid code to be deployed to our cluster.
  • The Wikimedia portal uses a build step. Currently, this is taken care of by a bot which deploys built assets prior to deployment. In between these runs, the deploy repo lives in an outdated state.
  • Various repos make use of SVGO to compress SVG assets. See https://phabricator.wikimedia.org/T185596. This is enforced via an npm job, but could easily be handled by a build step, avoiding the need for a human to run npm install and run it themselves.
  • In MediaWiki core external libraries are copied and pasted into the resources/lib folder. This is done manually (https://github.com/wikimedia/mediawiki/commit/065b21b4bdd49ffca494ff905de29bfc500ed535). These files do not actively get updated, but if they did, a build step could help them live elsewhere.
  • The Wikidata Query Service GUI has a build process that compiles dependencies needed to display complex graphs and maps.

From what I can see there are various other (undocumented?) build processes in the ecosystem. If you are familiar with these and how a build process might help you, feel free to update these and move them into the list above.

  • Wikibase has (or had) a complicated build process, since moved to Composer, that I am unable to understand.
  • VisualEditor uses a build step.
  • OOjs UI uses a build step.
  • LinkedWiki has a build step and doesn't check in its assets, meaning it is not compatible with CI (see T198919)
  • Wikistats 2 has a build process and must build and release static assets in additional commits to master (example 1, example 2).

Workarounds

Right now our workarounds appear to be “rebuild it in PHP” or “force developers to run build steps locally and enforce this with Jenkins”. For managing SVG assets we created the ResourceLoaderImage module, and for LESS compilation we introduced a PHP LESS compiler. For compiling templates we use LightnCandy (but on the client, due to the lack of a better solution and of a build step, we unnecessarily offload compiling to our users).

While this approach is fine when libraries already exist, it is not always practical. In the case of LESS, we use an outdated compiler which has already needed to be replaced, and there is a growing risk that such projects become unmaintained given that many of them are moving to the Node.js ecosystem.

Enforcing this with Jenkins appears, from experience, to be difficult to do at scale. In the case of SVG minification, various solutions exist across different repositories; standardisation and generalisation have been tricky and have become blocked due to the inexperience with our CI infrastructure of the many frontend engineers building these tools. The current approach to SVG minification involves checking whether SVGs are minified and committed, rather than simply running the optimisation on the existing code in the repo.

All these workarounds create unnecessary slowdown and take focus away from actually building.

Motivation

Having a build step for Node.js would allow us to:
  • Offload optimisations from the developer to automation
  • Simplify software development by empowering the use of JavaScript tooling such as rollup, webpack, optimising scripts (such as SVG minifiers, UglifyJS or babel-minify), and up-to-date tools like the canonical LESS compiler.
  • Remove the various “check” scripts in place in a variety of extensions, which would become unnecessary.
  • Avoid the need to commit assets into extensions which are built from other components within the extension, making our code more readable. It is often difficult to distinguish built assets from manually written assets, and README files are only somewhat effective.
  • Make it easier for newcomers to understand the code in our projects and participate in development
  • Where templates are used in our projects, compile templates as part of a build step, rather than offloading this task to the user
  • Empower developers with tools to create more efficient and better frontend interfaces by giving them the same kind of tooling and flexibility that the PHP backend leverages

Requirements

  • Build step should allow the running of an npm command prior to deploy.
  • Build step should be discoverable and consistent in every repo:

For instance, I should be able to expect a build process (if available) to live behind npm run build, npm run deploy, or npm run postinstall.

  • Build steps need to be run locally for every single extension and skin that has registered one.
  • The build step must run:
    • On any CI job - currently the Selenium and QUnit jobs will not run any build steps and instead use whatever code is committed in the repo. While the latter can be worked around using a post-update-cmd inside Composer, this is not run by the former.
    • Before a deployment, to ensure the correct code is deployed to production.
    • Before a SWAT deploy on repos that have a build process, as content may have changed and the build step would need to be re-run.
    • Before bundling a release, to ensure that third parties are not required to install dependencies and run build processes themselves.
  • Vagrant instances would need to be aware of any install steps as part of vagrant git-update.
  • Any errors within a build step should be made visible to whatever runs it. Build steps are expected to use exit codes to signal errors.

Note: for sysadmins supporting third-party wikis using Git, given that users can be expected to run php maintenance/update.php as part of many installs, it seems reasonable to expect they can also run npm install && npm run build.
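As a rough illustration of what "discoverable and consistent" could look like (the package name and script bodies below are hypothetical, not a proposed standard), a repository could advertise its build step through package.json:

```
{
	"name": "example-skin",
	"scripts": {
		"build": "webpack --mode=production",
		"postinstall": "npm run build"
	},
	"devDependencies": {
		"webpack": "^4.29.0",
		"webpack-cli": "^3.2.0"
	}
}
```

Scap, CI, Vagrant or a sysadmin could then discover and run the step with npm install && npm run build without needing to know anything repo-specific.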

FAQ

I’m a sysadmin not a coder. I don’t want a build step as this makes development harder.
If you are using MediaWiki’s tar releases, you will not notice anything different. When releases are bundled, we’d do the build step for you.

Won’t https://phabricator.wikimedia.org/T133462 fix this?
That will be extremely useful and provide a more standard way to load files, but it will not go further than that. It will give developers more freedom in how they structure code, but it will not allow things like transpiling TypeScript or modern JavaScript, which would be enabled by the greater freedom of a build step.

Can’t we just recreate the tooling you are using in PHP?
For some of it, yes. In fact, we added some extremely powerful and useful PHP tooling for SVG handling inside MediaWiki's ResourceLoader. This is however costly, as it requires a frontend developer to convince a backend developer to implement said tooling, and there are usually options readily available and better maintained in the npm ecosystem, e.g. https://www.npmjs.com/package/svg-transform-loader. Building this out took time.

While possible, something like transpiling ES6 JavaScript to ES5 would be a little crazy to implement in PHP but is available “off the shelf” in the npm ecosystem.
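To illustrate the kind of transformation involved (this is a hand-written sketch, not actual Babel output), a build step lets developers author the first form while ES5-only browsers are served something like the second:

```
// ES6 as a developer would write it:
const greet = ( name = 'world' ) => `Hello, ${ name }!`;

// Roughly what a transpiler such as Babel would emit for ES5-only browsers
// (an illustrative hand-written equivalent, not real Babel output):
var greetEs5 = function ( name ) {
	if ( name === undefined ) {
		name = 'world';
	}
	return 'Hello, ' + name + '!';
};

console.log( greet(), greetEs5( 'MediaWiki' ) );
```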

Introducing a build step introduces more that can go wrong in our pipeline. What if NPM is down?

Sure, there will definitely be things to work out here, and they will be taken care of on a case-by-case basis. We should be reassured that npm is more resilient these days, and this is not a problem unique to our ecosystem. A build step could be written to fetch a recent stable release of code from another repository if necessary.

I don't want to introduce foreign code from 3rd parties via npm
This is happening already in skins/extensions. Having a central build step won't change this.

What if build scripts pull down malicious packages from npm?
We'll review https://eslint.org/blog/2018/07/postmortem-for-malicious-package-publishes, particularly with regard to package-lock (T179229). We will need some way to manage this while still getting the benefit of automation; for instance, we could limit outgoing HTTP requests and run our own private mirror of the package registries we need.

I don't want to run npm jobs on every git checkout!
It's unlikely you will need to. If vagrant is being used and we had a standard approach, vagrant would be automated to take care of this for you. Built assets, although possibly outdated, are likely to be usable unless incompatible changes have been made in PHP. If you are working on JavaScript code that uses a build step, then you would be using that same build step for development anyway.
One possible compromise to mitigate the pain here could be limiting build steps to skins and extensions (i.e. not core), so that people who want to avoid a build step altogether are not forced to use one.

Event Timeline

Restricted Application added a subscriber: Aklapper. Jul 6 2018, 9:56 PM
Jdlrobson updated the task description. Jul 6 2018, 9:58 PM
Krenair added a subscriber: Krenair. Jul 7 2018, 1:09 AM
sietec added a subscriber: sietec. Jul 7 2018, 11:25 PM
Jdlrobson moved this task from Inbox to Next up on the User-Jdlrobson board. Jul 12 2018, 10:57 PM
nray added a subscriber: nray. Jul 12 2018, 11:25 PM
Jdlrobson updated the task description. Jul 13 2018, 6:07 PM
Jdlrobson added subscribers: Mooeypoo, Krinkle, Catrope and 3 others.

I'm adding these tags to get more eyeballs and hear concerns. Note, I'll be traveling for some of July (like many others) so it may be best to delay any conversations until August. Looking forward to some async/sync conversation!

greg added a subscriber: greg. Jul 18 2018, 6:08 PM
tstarling moved this task from Inbox to Backlog on the TechCom-RFC board. Jul 18 2018, 8:46 PM
Milimetric added a subscriber: Milimetric. (Edited) Jul 18 2018, 9:11 PM

Here's a quick summary of what we do with webpack for Wikistats 2. I realize it may not be relevant, but maybe some of our experience is applicable or at least useful:

  • we develop on master, and have webpack set up to watch and build into /dist-dev as we code (and test with karma), we don't commit /dist-dev
  • we build into /dist for production, and commit that to the repo
  • webpack sticks the git sha1 into the name of the files, like main-aef9966.js or main-aef9966.css, so that client-side caching isn't a problem (a rough sketch of this is at the end of this comment)
  • we have jenkins build and run our karma tests
  • we git merge -X theirs --squash master from master to the release branch
  • we have puppet ensure => latest from the release branch to the production server

Our docs are here: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Wikistats_2#Releasing_a_new_version_to_production
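For the curious, here is a minimal sketch of how per-commit filenames like that can be produced; this is illustrative only (the entry and output paths are invented), not our exact configuration:

```
// webpack.config.js: an illustrative sketch; paths and filenames are invented.
const { execSync } = require( 'child_process' );

// Read the short sha of the current commit so built filenames change per release.
const sha = execSync( 'git rev-parse --short HEAD' ).toString().trim();

module.exports = {
	entry: './src/main.js',
	output: {
		path: __dirname + '/dist',
		filename: `main-${ sha }.js`
	}
};
```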

Anomie added a subscriber: Anomie. Jul 23 2018, 3:16 PM

As a developer, I find myself a bit skeptical of this proposal.

I find I seldom have to run update.php. Schema changes are a big deal, so we don't have many of them.

I also find I seldom have to run composer update. Our policy of version-pinning and the need to update mediawiki/vendor in tandem seems to keep down the number of breaking changes in there too.

But this is proposing something that seems like it'll need to be run after every git checkout. Maybe such repos should use git hooks of some sort to enforce that, with the side effect of urging developers of such extensions to keep the build step fast.

I also can't say I've had much good experience with npm stuff in general, from crazy-deep dependency trees where it takes forever just to check for updates, to oojs-ui (at one point anyway) seeming like it was downloading an entire headless build of chromium for who-knows what purpose in order to run its build, to the fact that https://bugs.debian.org/794890 was open for a long time and only now seems to be being fixed.

I’m a sysadmin not a coder. I don’t want a build step as this makes development harder.
If you are using MediaWiki’s tar releases, you will not notice anything different. When releases are bundled, we’d do the build step for you.

That seems a bit shortsighted, assuming that everyone who's a coder doesn't mind installing build tools for every extension that needs testing and that everyone who's not a coder is happy to use release tarballs.

Jdlrobson updated the task description. Jul 26 2018, 1:42 AM
Jdlrobson updated the task description. Jul 26 2018, 1:46 AM
Jdlrobson updated the task description. Jul 26 2018, 1:51 AM
nray awarded a token. Jul 26 2018, 1:54 AM
nray rescinded a token.
nray awarded a token.

I think it would be easier to discuss this if we dropped the language agnostic part of this and just dealt with a frontend build step, since that's really the main motivation behind this (AIUI) and where most of the problems lie. Technically with composer there are post-install hooks (I think npm might have these too?) so we already run stuff like ComposerVendorHtaccessCreator.php as you run "composer install". But anyways, I don't see any concrete use case for making this language agnostic - let's just discuss frontend.

My most recent deep experience with npm was as part of T107561: MediaWiki support for Composer equivalent for JavaScript packages, and I was left extremely disappointed. Out of the 5-6 libraries that we use, one of them had uploaded something different to npm than what was in git, another was missing git tags, and then another was not distributing any licensing information at all. I think including more npm libraries is going to be more work to make sure we can check off the necessary items before bringing in new external libraries than it is for composer libraries.

So the main concerns I can think of right now:

  • Auditability: how are we going to audit the code we deploy? In PHP we do this by reviewing patches to mediawiki/vendor (an imperfect solution to be clear).
  • Availability of tooling: npm is not available in Debian stable - one of the operating systems we give first class support to. How are we going to deal with that?
  • Extension support: We had to develop the composer-merge-plugin to get extension dependencies to work. How are we going to handle this for npm libraries?
  • Patching: What happens if we need to patch something (potentially a security patch) immediately, without having the upstream maintainer do a new release?

@Legoktm Good points. Also keep in mind that an important factor here is transformation of original code as generated using npm packages, not the inclusion of npm packages themselves. E.g. transpilation with Babel and bundling with Webpack.

That is the main reason why I haven't suggested a more straightforward approach using mediawiki-vendor (or something like it). The transformation of original code is a more difficult problem to solve and would also need to consider lots of principles that we currently uphold with Git but would lose if not carefully considered:

  • Deterministic deployment. Be able to have the same code run locally, in CI, beta, prod, and even after reverting/re-deploying in prod.
  • Fast deployment. – Processing happens ahead of time, or at run-time, not during deployment.
  • Secure deployment. – All run-time code can be inspected in Git and uniquely identified by a commit hash.

I suspect the only feasible way forward that upholds those values is to make this dependent on the ongoing cross-departmental project to deploy MediaWiki using immutable container images that contain all relevant source code within (e.g. using Docker containers, or similar). Those images would need to be securely built from a continuous integration/delivery pipeline, inspectable, testable, and stored somewhere. See T170453 (RelEng/SRE goal) and its many sub tasks.

It might be possible to achieve without CI/CD, but that would require a significant amount of dedicated resources from many teams that I don't think we can spare any time soon – given the number of projects we've already committed to that are prerequisites to other future projects.

I talked to @Samwilson last week and he is also running into similar problems, in that they are committing built assets into a repo. However, since that code is hosted on GitHub, there are a few Travis CI solutions available there which don't exist for code deployed via the train.

The Wikimedia portal uses a build step. Currently, this is taken care of by a bot which deploys built assets prior to deployment. In between these runs, the deploy repo lives in an outdated state.

@Jdrewniak could you provide some notes (via edits to the description) on how the portal repo is setup? I think this would be useful.

I think it would be easier to discuss this if we dropped the language agnostic part of this and just dealt with a frontend build step, since that's really the main motivation behind this (AIUI) and where most of the problems lie. Technically with composer there are post-install hooks (I think npm might have these too?) so we already run stuff like ComposerVendorHtaccessCreator.php as you run "composer install". But anyways, I don't see any concrete use case for making this language agnostic - let's just discuss frontend.

I'm fine with limiting the discussion to frontend.

My most recent deep experience with npm was as part of T107561: MediaWiki support for Composer equivalent for JavaScript packages, and I was left extremely disappointed. Out of the 5-6 libraries that we use, one of them had uploaded something different to npm than what was in git, another was missing git tags, and then another was not distributing any licensing information at all. I think including more npm libraries is going to be more work to make sure we can check off the necessary items before bringing in new external libraries than it is for composer libraries.

I'm not sure I follow here. If we were using npm, we'd have more confidence in the code we're pushing and it would be clearer what version number we are on.
Right now we have to rely on the header of ~/git/core/resources/lib/jquery/jquery.js to know the version, but I have no confidence that is right as there's a chance of human error. If package.json stated "jquery": "3.3.1", that goes away.

That said, while this would be empowered by a build step, it's not necessary for this RFC. We could run our own npm proxy if necessary, or commit the node_modules folder if we really wanted to. A build step might be a Node.js script that simply compresses SVGs, for instance, and would be a good step in the right direction. A script like https://github.com/wikimedia/mediawiki-extensions-MobileFrontend/blob/master/dev-scripts/svg_check.sh for instance could be turned into an SVG minifier that runs on build/deploy rather than an SVG checker that runs on every CI job.
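As a sketch of what that could look like (the directory layout and the use of the svgo npm package are assumptions for illustration, not an agreed convention):

```
// build/minify-svgs.js: a minimal sketch of a deploy-time SVG minifier using the
// svgo npm package. The resources/images and dist/images paths are invented.
const fs = require( 'fs' );
const path = require( 'path' );
const { optimize } = require( 'svgo' );

const srcDir = path.join( __dirname, '..', 'resources', 'images' );
const outDir = path.join( __dirname, '..', 'dist', 'images' );

fs.mkdirSync( outDir, { recursive: true } );
fs.readdirSync( srcDir )
	.filter( ( file ) => file.endsWith( '.svg' ) )
	.forEach( ( file ) => {
		const svg = fs.readFileSync( path.join( srcDir, file ), 'utf8' );
		// multipass re-runs the optimisation until the output stops shrinking
		const { data } = optimize( svg, { path: file, multipass: true } );
		fs.writeFileSync( path.join( outDir, file ), data );
	} );
```

Wired up as npm run build, something like this would remove the need for the check-and-fail approach entirely.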

So the main concerns I can think of right now:

  • Auditability: how are we going to audit the code we deploy? In PHP we do this by reviewing patches to mediawiki/vendor (an imperfect solution to be clear).

With Node 8 we can commit package-lock.json and make use of npm audit. It's my understanding we have no protection right now, so this IMO is a huge improvement.

  • Availability of tooling: npm is not available in Debian stable - one of the operating systems we give first class support to. How are we going to deal with that?

I don't know enough about this problem to comment, but surely other people are encountering this issue and we're not the first.

  • Extension support: We had to develop the composer-merge-plugin to get extension dependencies to work. How are we going to handle this for npm libraries?
  • Patching: What happens if we need to patch something (potentially a security patch) immediately, without having the upstream maintainer do a new release?

We fork the library and patch it. Currently I have no idea what we do for external libraries inside the core/resources/lib - do we add patches on top of that or edit the code itself? I doubt we have any protection there, so this doesn't seem like a new problem to me.

How can we move this conversation forward? Can I request that TechCom schedules a discussion on this RFC with at least the following attendees: @Legoktm @Jdrewniak @Krinkle @Jhernandez @Milimetric @Niedzielski @Samwilson ?

Joe added a subscriber: Joe. Jan 10 2019, 11:51 AM

Please note that the statement

Currently when deploying code we sync a git repo.

is actually incorrect. We create a directory stripped of all the git metadata and we blindly rsync it to the application servers.

The git repositories are synced to the deployment hosts, but we don't use git in any way for our deployment process (and I would say we should).

Joe added a comment. Jan 10 2019, 12:34 PM

I suspect the only feasible way forward that upholds those values is to make this dependent on the ongoing cross-departmental project to deploy MediaWiki using immutable container images that contain all relevant source code within (e.g. using Docker containers, or similar). Those images would need to be securely built from a continuous integration/delivery pipeline, inspectable, testable, and stored somewhere. See T170453 (RelEng/SRE goal) and its many sub tasks.

I don't think there is a direct dependency between the two; I think the efforts might go along in parallel if we want to add a build step to deployments. You can build the artifacts in a container and export them, for instance, and that's well supported by our CI infrastructure AIUI. The problem I see is that implementing a pipeline that upholds the values you cited will require a lot of effort anyways.

Some considerations:

  1. I agree with @Legoktm - it's easier to get to a consensus if we narrow the scope of the proposal to the frontend javascript code
  2. A model similar to what @Milimetric was mentioning might work, but I am missing too many details to give feedback. I still think the RfC should be a little less light on the technical details of the proposal once we've agreed on what we want to achieve.
  3. Reproducibility and auditability are to be guaranteed by a CI pipeline that generates binary artifacts, which will then be fetched (via git) either for deployments via scap or inside containers (in the future). This is a tall goal to achieve with the npm ecosystem, and will probably involve multiple teams.
  4. It might make sense to have a single build pipeline, or to have a separate one per extension and just standardize how we build the artifacts. I would tend to think that the former approach is more manageable.
  5. Managing reverts could be an issue that needs to be better understood.
  6. The binary artifacts build should only happen when needed, or we'd need to change the way we deploy code before we can do it (think of pooling/depooling of databases, or SWAT, when multiple small patches get deployed sequentially and sometimes fast-reverted)

It might be possible to achieve without CI/CD, but that would require a significant amount of dedicated resources from many teams that I don't think we can spare any time soon – given the number of projects we've already committed to that are prerequisites to other future projects.

What I listed above can only happen if a proper CI pipeline is set up, and I suspect this would require a non-negligible amount of time from a few teams, most notably Release Engineering and in general the infrastructure teams (SRE, Core Platform, Performance). So resourcing seems to be a real concern before this proposal can be considered feasible.

Jdlrobson updated the task description. Jan 10 2019, 4:05 PM
Jdlrobson renamed this task from RFC: Add a language agnostic build step to skins/extensions to our deploy process to RFC: Add a frontend build step to skins/extensions to our deploy process. Jan 10 2019, 4:09 PM
Jdlrobson updated the task description.

I just want to reiterate our simpler way of handling build steps in Wikistats 2. In light of the problems described here, I think it deserves a second look. It's pretty standard on other Node.js projects I've looked at. We basically factor out most of our common build config and use it in two kinds of builds:

  1. npm run dev: builds source files and any dependencies. Sets up webpack-devserver to watch for file changes and rebuild. Outputs to a directory that's ignored by git and used during local development.
  2. npm run build: builds source files and any dependencies, including production-oriented transformations like minifying. Outputs to a dist directory that runs in production, and is committed to git. When we do this, we:
    • build, modifying only the dist directory
    • test the new build locally
    • push the new dist to gerrit, by itself with no other source modifications, where CI tests it
    • merge and deploy

With the dev build, we develop in peace without worrying about the build system. When we want to release we go through the process above. The release commits are separate from any other commits and so can be reverted easily. Of course this wasn't meant for the mediawiki ecosystem, but I've sat here thinking about most of the problems that you all raise above and I think solutions exist for each of them. The trickiest bit in my mind is de-duplicating dependencies across multiple extensions and core, but I've got a few ideas and both npm and webpack have dealt with this problem. The only other thing that stood out was npm not existing on Debian 9, which I am at a loss about. Like... why...
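On the de-duplication point, webpack's splitChunks option is one of the mechanisms I had in mind. Here is an illustrative sketch only (the entry names are invented, and this is not Wikistats 2's actual config), showing shared dependencies being split out of multiple entry points within a single build:

```
// webpack.config.js: an illustrative sketch of dependency sharing within one build.
module.exports = {
	entry: {
		// Hypothetical entry points; imagine each importing the same library.
		dashboard: './src/dashboard.js',
		detail: './src/detail.js'
	},
	output: {
		path: __dirname + '/dist',
		filename: '[name].js'
	},
	optimization: {
		splitChunks: {
			// Extract anything shared by two or more entry points into a common chunk.
			chunks: 'all',
			minChunks: 2
		}
	}
};
```

Sharing across separately built extensions and core is the harder part, and that is where the ideas I mentioned would come in.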

With the dev build, we develop in peace without worrying about the build system. When we want to release we go through the process above. The release commits are separate from any other commits and so can be reverted easily. Of course this wasn't meant for the mediawiki ecosystem, but I've sat here thinking about most of the problems that you all raise above and I think solutions exist for each of them.

The problem with this workflow is that then your acceptance/browser tests are running against the last deployed frontend assets. Which is fine, but then once you build you may get browser tests that fail and it is hard to pinpoint which commit broke them.

Another issue is when server code needs the updated frontend code/assets. You risk not running the dist scripts etc. at the right time and deploying broken code in subtle ways.

To avoid these, you would make your integration/acceptance jobs run the dist job locally before running the tests, and at that point we are talking again about having a standard way to run the dist job on CI jobs. Not on deploy, which maybe makes things easier RFC wise, but sort of the same concept.

The trickiest bit in my mind is de-duplicating dependencies across multiple extensions and core, but I've got a few ideas and both npm and webpack have dealt with this problem. The only other thing that stood out was npm not existing on Debian 9, which I am at a loss about. Like... why...

Tricky indeed, but we have the same problem right now without any solution. We can move forward with build systems using the same solution we use now: manual inspection and care from developers. And maybe in the future we can streamline this some other way. Maybe a linter that checks for shared dependencies across extensions, or something low-cost that would help surface duplicate dependencies. Maybe even streamlining frontend build steps into MediaWiki core itself!

The problem with this workflow is that then your acceptance/browser tests are running against the last deployed frontend assets. Which is fine, but then once you build you may get browser tests that fail and it is hard to pinpoint which commit broke them.

You can make both your local tests and CI tests run against the dist-dev version of your build, the one you use locally for development. And you could have a separate CI job testing the dist version for when that's updated on release commits. The CI job running against dist-dev would just run the local build, or even a third build that's very similar but meant for headless testing (so, for example, no webpack-devserver).

Another issue is when server code needs the updated frontend code/assets. You risk not running the dist scripts etc. at the right time and deploying broken code in subtle ways.

You never deploy intermediate commits, only release commits. This can be accomplished by segregating release commits onto a separate branch if needed. So if you are mixing server and frontend code in the same repository, you build both. At the very least you should test both as I was brainstorming above.

To avoid these, you would make your integration/acceptance jobs run the dist job locally before running the tests, and at that point we are talking again about having a standard way to run the dist job on CI jobs. Not on deploy, which maybe makes things easier RFC wise, but sort of the same concept.

I agree, but don't see problems with doing any of the above as part of CI jobs.

Tricky indeed, but we have the same problem right now without any solution. We can move forward with build systems using the same solution we use now: manual inspection and care from developers. And maybe in the future we can streamline this some other way. Maybe a linter that checks for shared dependencies across extensions, or something low-cost that would help surface duplicate dependencies. Maybe even streamlining frontend build steps into MediaWiki core itself!

Hm, I thought ResourceLoader helped a bit with it. But if we have a similar issue, then yes, definitely, let's standardize on webpack or similar. I was just googling around and found two different solutions that are much better than manual inspection.

How can we move this conversation forward? Can I request that TechCom schedules a discussion on this RFC with at least the following attendees: @Legoktm @Jdrewniak @Krinkle @Jhernandez @Milimetric @Niedzielski @Samwilson @Joe ? I think a conversation in real time would be useful here!

VisualEditor uses a build step.

Only the demo of the standalone VisualEditor uses it (the VisualEditor/VisualEditor repo); it mostly involves updating some HTML files to load newly added JS/CSS files. The VisualEditor MediaWiki extension does not have a build step.

OOUI uses a build step.

I really wish it did not. The build process makes testing OOUI patches with MediaWiki a major chore.

The build step only involves concatenating JS files, compiling Less files, and generating colored icon variants, all of which are also available in MediaWiki's ResourceLoader. It could probably be refactored to avoid the build step.

OOUI uses a build step.

I really wish it did not. The build process makes testing OOUI patches with MediaWiki a major chore.
The build step only involves concatenating JS files, compiling Less files, and generating colored icon variants, all of which are also available in MediaWiki's ResourceLoader. It could probably be refactored to avoid the build step.

On the other hand, I would love it if OOUI was more modular and used a configurable build step to adapt to different projects. That would help us adopt it outside of mediawiki.

Jdlrobson updated the task description. Feb 14 2019, 6:30 PM
Jdlrobson moved this task from Backlog to Request IRC meeting on the TechCom-RFC board.
Krinkle updated the task description. Feb 24 2019, 10:39 PM

I've made some changes to the initial problem statement to:

  • more accurately reflect how current deployments work,
  • document in a slightly more generic way what the current pain points are (less biased towards a particular solution),
  • document more clearly what the current problems are as perceived/experienced by Reading Web.
Task description

[..] we are:

  • severely limited in our choice of tooling,
  • [..]

@Jdlrobson Can you elaborate on this point?

I might be misunderstanding it, as I do not see how the current workflow limits your choice of tooling. You can currently choose any programming language, version thereof, and any combination of upstream packages installed from wherever. The only requirement is that your team internally agrees and (ideally) that it is documented in the repo and reproducible by other contributors. It's unlikely we'll be able to meet that level of flexibility in production, but let's suppose we do. What additional flexibility would you like to see in a deployment pipeline?

Jdlrobson updated the task description. Feb 26 2019, 12:44 AM
Jdlrobson moved this task from Next up to Blocked on the User-Jdlrobson board. Feb 28 2019, 8:59 PM

TechCom is hosting a discussion of this on March 20 at 2pm PST (21:00 UTC, 22:00 CET) in #wikimedia-office.

brennen added a subscriber: brennen. Mar 7 2019, 4:14 PM

I don't want to run npm jobs on every git checkout!
It's unlikely you will need to. If vagrant is being used and we had a standard approach, vagrant would be automated to take care of this for you.

This assumes you are using vagrant, which is too slow for many of our developers.

Related to this issue, not having to have a watch script running while I develop is a huge benefit of ResourceLoader. As projects grow, the build step becomes slower (cf. OOUI, where "quick-build" takes well over 10 seconds), resulting in a much slower development process. Quoting Bartosz above: "I really wish it did not. The build process makes testing OOUI patches with MediaWiki a major chore."

IMO, removing/not adding build steps would have a much greater effect on developer productivity than allowing some limited ES6 syntax.

Anomie added a comment. Mar 7 2019, 7:20 PM

Besides what I said earlier in T199004#4445485,

We should be reassured that npm is more resilient these days

And yet npm flakiness making CI fail is not an uncommon occurrence.

I don't want to run npm jobs on every git checkout!
It's unlikely you will need to. If vagrant is being used and we had a standard approach, vagrant would be automated to take care of this for you.

This assumes you are using vagrant, which is too slow for many of our developers.

+1.

Related to this issue, not having to have a watch script running while I develop is a huge benefit of ResourceLoader. As projects grow, the build step becomes slower (cf. OOUI, where "quick-build" takes well over 10 seconds), resulting in a much slower development process. Quoting Bartosz above: "I really wish it did not. The build process makes testing OOUI patches with MediaWiki a major chore."

+1 to this too.

Can we add to the requirements that there must be a way to easily use non-minified assets for development and debugging? Besides the slowness just mentioned, I've several times tracked down a UI bug because I can simply append &debug=1 to the URL and ResourceLoader will serve me JS I can actually read rather than a minified blob I'd immediately give up on. Yes, some browsers these days have "pretty print source" in their debuggers, but at least Firefox 65's is barely adequate at best.

The Popups extension makes use of webpack to build and test complex JavaScript code. It has a test script that forces the developer to run a build step locally and commit the output. This leads to merge conflicts on every single patch in the repository, causing engineers to make tedious and mechanical corrections.

Backported fixes also have a guaranteed conflict, which is not ideal, as changes to old branches are already often dicey without any additional complication.

Various repos make use of SVGO to compress SVG assets. See https://phabricator.wikimedia.org/T185596. This is enforced via an npm job, but could easily be handled by a build step, avoiding the need for a human to run npm install and run it themselves.

This sucks because presently we either have to maintain a human readable SVG and a duplicate optimized one or mentally decipher the optimized version.

In MediaWiki core external libraries are copied and pasted into the resources/lib folder. This is done manually (https://github.com/wikimedia/mediawiki/commit/065b21b4bdd49ffca494ff905de29bfc500ed535). These files do not actively get updated, but if they did, a build step could help them live elsewhere.

I'll also add that it can be difficult to determine what version was copied and pasted (e.g., T205128).

In MediaWiki core external libraries are copied and pasted into the resources/lib folder. This is done manually (https://github.com/wikimedia/mediawiki/commit/065b21b4bdd49ffca494ff905de29bfc500ed535). These files do not actively get updated, but if they did, a build step could help them live elsewhere.

Note that this isn't true for our fourteen most important external library dependencies now (and I'm slowly replacing the rest); we use php maintenance/resources/manageForeignResources.php verify to assert that they match the integrity checks, and …update to pull new versions from upstream.

TheDJ added a subscriber: TheDJ. Mar 14 2019, 8:57 PM

Can we add to the requirements that there must be a way to easily use non-minified assets for development and debugging?

+1 from me on that.

Can we add to the requirements that there must be a way to easily use non-minified assets for development and debugging?

+1 from me on that.

Easy debugging sounds great to me too and already exists in Popups. For example, here's me debugging production enwiki without any special extensions, settings, or source in Chromium:

(.gif available here due to miserly Phabricator upload limits.)

@Niedzielski Debugging with source maps is certainly better than without them, but it is worse than debugging by actually running the unminified code. In particular, my biggest issue is how certain minifications that change the structure of the code affect where you can place breakpoints and how "step over next statement" functionality works.

For example, consider the simple snippet:
if ( foo() ) { bar(); baz(); }
Webpack will compile this into:
foo() && ( bar(), baz() )

The three statements have been changed into one. Because of this, even if you're looking at the original code via a source map:

  • You can't place a breakpoint before bar() or baz(), only before the if()
  • If you have a breakpoint before if(), pausing there and using "step over next statement" will skip over the entire if(){} block, rather than pause before bar() (assuming the if condition is true)

This can even be seen in your screenshot – the lines where the line numbers are greyed out are lines where you can't pause or place a breakpoint. (And in my experience, whether placing a breakpoint on other lines actually works is anyone's guess.)

@matmarex, thanks for this well-worded distinction. Popups only has support for unminified builds locally. We could easily supply unminified builds in addition to minified ones. If I understand correctly though, actually using them in debug mode would double the number of JavaScript ResourceLoader modules in the present ResourceLoader architecture unless we had a way to flip between module files.

You can already specify files that are loaded only in debug mode using debugScripts (see VisualEditor for an example use). You'd need to add another option specifying files to load instead of normal scripts (rather than in addition to them), but that should be a fairly easy change somewhere in ResourceLoaderFileModule.

@matmarex, that's amazing. Let's continue any further discussion in T218410.

Everyone else, sorry for the diversion!

To get back on topic, when ResourceLoader was set up, one of [[ https://www.mediawiki.org/wiki/ResourceLoader/Requirements | the requirements ]] was "Support direct resource inclusion". While we should keep supporting this, essentially what I'm asking for is to "support inclusion of resources that are derived from tools". To do that, I'm keen to understand what needs to happen for that to be possible, for example:

  • security concerns
  • integrating with vagrant/deploy pipeline
  • validation checks
  • whether those tools can access other resources on the web (e.g. npm).

Discussions on minification seem specific to the tools that would be used by such a step and are probably best left to another RFC (around tooling standardisation). I can follow up with that once this particular problem has been discussed.

WMDE-leszek added a subscriber: WMDE-leszek.