
RFC: Add a frontend build step to skins/extensions to our deploy process
Closed, Declined · Public

Assigned To
None
Authored By
Jdlrobson
Jul 6 2018, 9:56 PM

Description

The following intertwined discussions were descoped from this ticket:

  1. T257072: Determine Node package auditing workflows to systematically audit every version of every dependency (end result: libraryupgrader2)
  2. T257061: Evaluate the workflows and workload of managing a Node package repository - the repository storing the audited package versions
  3. T257068: Draft: RFC: Evaluate alternative Node package managers for improved package security - The package manager distributing packages from the repository to CI and developer nodes

Problem

Currently, when deploying MediaWiki code at WMF, Scap creates a single directory with the contents of MediaWiki core and all installed skins/extensions, strips it of all git metadata, and blindly rsyncs it to the application servers.

There is currently no process in place between a commit being approved in Gerrit and it going live on WMF application servers that would allow an extension or skin to build derivative files or inject additional files from third-party sources (e.g. npm packages).
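
To make the gap concrete, here is a minimal sketch of the deploy flow with a build stage inserted. None of these commands are actual Scap internals; the variables and paths are hypothetical illustrations of the idea.

  # Hypothetical sketch; not actual Scap code. Variables and paths are illustrative.
  git clone --depth 1 "$REPO" "$STAGING_DIR"            # assemble MW core + skins/extensions
  rm -rf "$STAGING_DIR/.git"                            # strip git metadata (current behaviour)
  (cd "$STAGING_DIR" && npm install && npm run build)   # the proposed build step, per repo
  rsync -a "$STAGING_DIR/" "appserver:/srv/mediawiki/"  # blind rsync to app servers (current behaviour)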

The absence of such a build step means skins and extensions must work around it by ensuring all code needed in production is committed to the Git repository.

Many tasks developers perform are monotonous and can be automated. However, these automation tasks currently cannot happen as part of the deployment process. Instead, they are run locally or by bots, with their output always being committed to the same repository. This can be confusing to contributors, prone to human error, and demoralising for developers, as it regularly leads to merge conflicts (sometimes on every single commit).

As long as we do not have a build step in our deploy process, we are:

  • discouraged from making use of existing tooling, without modification, that could optimise our code but would change the resulting code on every commit
    • e.g. better minification, dead code elimination, Node.js templating libraries, transpilers, SVG minification (see workarounds)
  • exposed to avoidable errors at all times (see examples below)
  • unable to move fast:
    • forced to commit assets to master and experience merge conflicts with every patchset
    • forced to run scripts manually that are slow and could be delayed until deploy time (e.g. minification of image assets)
  • wasting time reinventing the wheel in PHP/build processes (see ResourceLoaderImageModule and Discovery, Gruntfile tasks, and deploy-bot for examples - see workarounds)

In summary, these build and deploy steps are already occurring in a de facto, informal manner on individual developers' local machines. This is not ideal from the security or developer-productivity perspectives. This RFC is about formalizing these methods and automating the process where possible.

Background

Ten years ago, build steps were rare, particularly in frontend development. Frontend development involved managing a small number of assets providing modest enhancements to a page. Over time, JavaScript has become a more essential part of websites, and MediaWiki itself makes heavy use of it.

The most common use case for a build step ten years ago was to concatenate files - one of the requirements that the original ResourceLoader was built for and does very well. However, as time has passed, build steps have been used for module loading (see http://requirejs.org/) and, more recently, performance optimisations.

Build processes are now commonly used for all sorts of tasks, such as optimising/minifying SVG assets, handling support for older browsers, making use of foreign dependencies via a package management system, dead code elimination, and minification. [1]

Every frontend engineer is expected to understand and make use of build processes; in fact, it is considered a red flag if one doesn't.

Frontend development inside MediaWiki proves far more difficult and can be a frustrating affair. Even a well-experienced frontend developer typically needs three months to get up and running in our ecosystem, given the unfamiliarity of our tooling and code choices.

Since the introduction of Composer in MediaWiki, our code has felt much more manageable and shareable with the outside open source world, and many build steps have been rendered unnecessary (e.g. Wikidata's), but no frontend equivalent exists.

Since we generalised our CI to run npm test, we have empowered developers with more flexibility in what and how we test.

Similarly, by adding a job to deploys and giving developers the power to control the way code gets deployed, I believe we will see similar innovation.

[1] Although ResourceLoader provides minification, the JS implementations of minification tend to result in greater byte savings than our PHP implementation.

Why not having this is a problem

The lack of support for a build step in MediaWiki has led to many imperfect workarounds and clearly broken experiences across our codebase that are in dire need of fixing.

Some examples:

  • The Popups extension makes use of webpack to build and test complex JavaScript code. It has a test script that forces the developer to run a build step locally and commit the output. This leads to merge conflicts on every single patch in the repository, causing engineers to make tedious and mechanical corrections.
    • Despite this pain point, this build process was considered one of the main reasons we successfully shipped this software, as it gave us space to move confidently and quickly.
    • MobileFrontend and the Minerva skin (which power our mobile site) will soon follow the example of Popups and hit the exact same problems.
    • Challenges with Popups are documented in this phabricator task: https://phabricator.wikimedia.org/T158980
  • The Kartographer extension maintains a local copy of Leaflet, which was incorrectly edited by engineers unfamiliar with the code layout and then overwritten when source files were rebuilt via the build script. This has recently been corrected via a Grunt test script run on every commit.
  • Flow makes use of a build script to build server-side templates. Developers modifying Handlebars templates must commit the compiled templates. It is common for these not to be committed and for invalid code to be deployed to our cluster.
  • The Wikimedia portal uses a build step. Currently, this is taken care of by a bot which deploys built assets prior to deployment. In between these runs, the deploy repo lives in an outdated state.
  • Various repos make use of SVGO to compress SVG assets (see https://phabricator.wikimedia.org/T185596). This is enforced via an npm job, but could easily be handled by a build step, avoiding the need for a human to run npm install and run it themselves.
  • In MediaWiki core, external libraries are copied and pasted into the resources/lib folder. This is done manually (https://github.com/wikimedia/mediawiki/commit/065b21b4bdd49ffca494ff905de29bfc500ed535). These files do not get actively updated, but if they did, a build step could help them live elsewhere.
  • The Wikidata Query Service GUI has a build process that compiles dependencies needed to display complex graphs and maps.

From what I can see, there are various other (undocumented?) build processes in the ecosystem. If you are familiar with these and with how a build process might help you, feel free to update these entries and move them into the list above.

  • Wikibase has (or had) a complicated build process, which was moved to Composer, that I am unable to understand.
  • VisualEditor uses a build step.
  • OOjs UI uses a build step.
  • LinkedWiki has a build step and doesn't check in its assets meaning it is not compatible with CI (see T198919)
  • Wikistats2 has a build process and must build and release static assets in additional commits to master (example 1, example 2).

Workarounds

Right now our workaround appears to be "rebuild it in PHP" or "force developers to run build steps locally and enforce with Jenkins". For managing SVG assets, we created ResourceLoaderImageModule; for LESS compilation, we introduced a PHP LESS compiler. For compiling templates we use LightnCandy (but on the client, for lack of a better solution and of a build step, we unnecessarily offload compiling to our users).
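
As an illustration of what deploy-time template compilation could look like instead, the upstream Handlebars npm package ships a precompiler CLI. A minimal sketch, with hypothetical paths:

  # Precompile Handlebars templates at build time instead of in the client.
  # Paths are hypothetical; the handlebars CLI comes from the npm "handlebars" package.
  npm install --save-dev handlebars
  npx handlebars modules/templates/ -f resources/dist/templates.compiled.js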

While this approach is fine when libraries already exist, it is not always practical. In the case of the LESS compiler, we use an outdated compiler which has already needed to be replaced once, and there is a growing risk that such projects become unmaintained, given that many of them are moving to the Node.js ecosystem.

Enforcing with Jenkins, from experience, appears to be difficult to do at scale. In the case of SVG minification, various solutions exist across different repositories; standardisation and generalisation have been tricky and have become blocked due to the inexperience with our CI infrastructure of the many frontend engineers building these tools. The current approach to SVG minification involves adding a check for whether SVGs are minified and committed, rather than simply running the optimisation on the code in the repo.
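
For comparison, running the optimisation itself at build time, rather than checking for it in CI, is nearly a one-liner. A sketch, with a hypothetical folder path:

  # Optimise all SVGs in a folder in place; -f/--folder and --multipass are real svgo flags.
  npm install --save-dev svgo
  npx svgo --multipass -f resources/images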

All of these workarounds create unnecessary slowdown and take focus away from actually building.

Motivation

Having a build step for Node.js would allow us to:
  • Offload optimisations from the developer to automation
  • Simplify software development by empowering use of JavaScript tooling such as rollup, webpack, optimisation scripts (such as SVG minifiers, UglifyJS, or Babel Minify), and up-to-date tools like the canonical LESS compiler.
  • Remove the various "check" scripts in place in a variety of extensions, which would become unnecessary.
  • Avoid the need to commit assets which are built from other components within an extension, making our code more readable. It is often difficult to distinguish built assets from manually written ones, and README files are only somewhat effective.
  • Make it easier for newcomers to understand the code in our projects and participate in development
  • Where templates are used in our projects, compile templates as part of a build step, rather than offloading this task to the user
  • Empower developers with tools to create more efficient and better frontend interfaces by giving them the same kind of tooling and flexibility that the PHP backend leverages

Requirements

  • The build step should allow the running of an npm command prior to deploy.
  • The build step should be discoverable and consistent in every repo:

For instance, I should be able to expect a build process (if available) to live inside npm run build, npm run deploy, or npm run postinstall (see the sketch after this list).

  • Build steps need to be run locally for every single extension and skin that has registered one.
  • The build step must run:
    • In any CI job - currently the Selenium and QUnit jobs will not run any build steps and fall back to whatever code is committed in the repo. While the latter can be worked around using a post-update-cmd inside composer.json, this is not run by the former.
    • Before a deployment, to ensure the correct code is deployed to production.
    • Before a SWAT deploy on repos that have a build process, as content may have changed since the last build.
    • Before bundling a release, to ensure that third parties are not required to install dependencies and run build processes themselves.
  • Vagrant instances would need to be aware of any install steps as part of vagrant git-update.
  • Any errors within a build step should be made visible to whatever runs it. Build steps are expected to signal failure via exit codes.
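
A minimal sketch of what the discoverable convention above could look like, assuming the entry point is npm run build. The package name, script contents, and dependency versions are hypothetical:

  # package.json (hypothetical example for a skin):
  {
    "name": "example-skin",
    "private": true,
    "scripts": {
      "build": "webpack --mode=production"
    },
    "devDependencies": {
      "webpack": "^4.42.0",
      "webpack-cli": "^3.3.0"
    }
  }

  # What CI, Scap, and third-party sysadmins would all run:
  npm install && npm run build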

Note: For sysadmins supporting third-party wikis using Git: given users can be expected to run php maintenance/update.php as part of many installs, it seems reasonable to expect they can also run npm install && npm run build.

FAQ

I’m a sysadmin not a coder. I don’t want a build step as this makes development harder.
If you are using MediaWiki’s tar releases, you will not notice anything different. When releases are bundled, we’d do the build step for you.

Won’t https://phabricator.wikimedia.org/T133462 fix this?
That will be extremely useful and provide a more standard way to load files, but it will not allow anything further than that. It will give users greater freedom in how they build code, but it will not enable things like transpiling TypeScript or transpiling modern JavaScript, which the greater freedom of a build step would.

Can’t we just recreate the tooling you are using in PHP?
For some of it, yes. In fact, we added some extremely powerful and useful PHP tooling for SVG handling inside MediaWiki's ResourceLoader. This is however costly, as it requires a frontend developer to convince a backend developer to implement said tooling, and there are usually options readily available, and better maintained, in the npm ecosystem (e.g. https://www.npmjs.com/package/svg-transform-loader). Building this out took time.

While possible, something like transpiling ES6 JavaScript to ES5 would be a little crazy to implement in PHP, but is available "off the shelf" in the npm ecosystem (see the sketch below).
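
A sketch of that "off the shelf" option using Babel; the directory names are hypothetical:

  # Transpile ES6+ sources to ES5 with Babel; directories are hypothetical.
  npm install --save-dev @babel/core @babel/cli @babel/preset-env
  npx babel resources/src --out-dir resources/dist --presets=@babel/preset-env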

Introducing a build step introduces more that can go wrong in our pipeline. What if NPM is down?

Sure, there will definitely be things to work out here, and they will be taken care of on a case-by-case basis. We can be reassured that npm is more resilient these days, and this is not a problem unique to our ecosystem. A build step could be written to fetch a recent stable release of the code from another repository if necessary.

I don't want to introduce foreign code from 3rd parties via npm
This is happening already in skins/extensions. Having a central build step won't change this.

What if build scripts pull down malicious packages from npm?
We'll review https://eslint.org/blog/2018/07/postmortem-for-malicious-package-publishes to begin with, particularly with regard to package-lock (T179229). We will need some way to manage this while still getting the benefit of automation; for instance, we could limit outgoing HTTP requests and run our own private mirror of the package managers we need (see the sketch below).
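
A sketch of the mirror idea; the mirror URL is hypothetical, while npm ci and --ignore-scripts are real npm features:

  # Point npm at a private, audited mirror instead of the public registry (URL hypothetical).
  npm config set registry https://npm-mirror.wikimedia.example/
  # Install exactly what package-lock.json pins, skipping packages' lifecycle scripts.
  npm ci --ignore-scripts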

I don't want to run npm jobs on every git checkout!
It's unlikely you will need to. If Vagrant is being used and we had a standard approach, Vagrant would be automated to take care of this for you. Built assets, although possibly outdated, are likely to be usable unless incompatible changes have been made in PHP. If you are working on JavaScript code that uses a build step, then you would be using that same build step for development.
One possible compromise to mitigate the pain here could be limiting build steps to skins and extensions (i.e. not core), so that people who want to avoid a build step altogether are not forced to use one.


Event Timeline


In the light of T241180: RFC: Adopt a modern JavaScript framework for use with MediaWiki, it would be of great value to keep exploring this endeavor, both for adoption of a modern JavaScript tool chain in a wider sense and to leverage synergies between teams currently maintaining individual attempts at a solution.

A marginal note with respect to the overall topic of the RFC, but maybe worth clarifying:
In the IRC meeting it is mentioned that "you can't effectively pin npm dependencies", and while "[t]his is what package-lock.json is for" is duly replied, the fear that "upstream can just silently replace a version with a different file" is only countered by "npm will freakout with fatals if the hashes [don't match]", suggesting that intervention would be needed on the consuming side, whereas in practice "you cannot reuse a published version by unpublishing and re-publishing it".

I think a good next step here would be to start narrowing in on specific use cases.

There's a good chance that it will not be feasible in the short term to have in WMF production an arbitrary build step where we just download any unaudited dev package from npm as specified in a git repo and serve the output of executing that code straight to real users' web browsers. (Lock files or perfectly secure containers only help to alleviate some of the concerns about what happens during the build step; they don't change the problem with the input or output itself.)

The task as written enumerates a number of possible problems that could be solved if we go with a specific solution, which is understandable, but it also makes it unclear what is important and what is nice-to-have, e.g. so that other stakeholders can assess what you need and can think along with you within the constraints they might bring to the table.

It's also important, I think, to balance these in terms of what they unblock or make easier. These may be obvious to frontend developers, but not to everyone. E.g. if SVGO is on the important list, one could say that we already get its performance improvement by running it manually, but that running it closer to deployment would save development time: we would not have to learn about, remember to run, and re-run scripts locally during development, and we would not have to maintain a complex CI pipeline that confirms it was run prior to merging in Gerrit.

If ES6 syntax is important, I'm curious to know if this enables new user experiences that would otherwise be impractical in ES5, or if it's something else we'd gain. Also worth knowing whether it's just ES6, or if more recent editions' features are considered essential as well. For example, if we'd audit Babel or another transpiler, what's the minimum fixed configuration it could be deployed with?

Side note: While not affecting the stated security challenge therein, it would be possible to mitigate or even solve the operational challenge of downloading and installing libraries from npm by switching to Yarn and using Yarn 2's zero-install functionality.

If ES6 syntax is important, I'm curious to know if this enables new user experiences that would otherwise be impractical in ES5, or if it's something else we'd gain. Also worth knowing whether it's just ES6, or if more recent editions' features are considered essential as well. For example, if we'd audit Babel or another transpiler, what's the minimum fixed configuration it could be deployed with?

So, first of all, ES6 is not the only feature that a build step (in any shape) would enable; TypeScript would also become possible. Both would help mostly with developer velocity (leading to more UX improvements with the limited engineering resources we have), plus easier on-boarding and actually finding good engineers to hire. TypeScript also helps with debugging and with catching logical errors in the code through type checks.

Another point I feel is important here: better minification. Our current minification is on-the-fly, and because it needs to be fast, it doesn't minify as well as something like UglifyJS (T49437#1782357). A 10% improvement is not much, but multiply it by the total size of the assets we serve and you get a pretty large number.
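
For illustration, this is roughly what deferring minification to a build step could look like (a sketch; the file names are hypothetical):

  # Minify a built bundle with UglifyJS at build/deploy time; file names hypothetical.
  npm install --save-dev uglify-js
  npx uglifyjs resources/dist/index.js --compress --mangle -o resources/dist/index.min.js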

I understand the security concerns. Is it possible to have a dedicated npm registry, and how much work would it be to maintain? We already maintain this kind of registry for Docker, APT, and Composer (through the vendor repo); I have probably missed some more. This can be automated to some degree, with criteria such as: only libraries that pass basic automated security scans, only stable releases, only releases without open critical security issues, and so on. Michael's idea also sounds good to me.

Another point I feel is important here: better minification. Our current minification is on-the-fly, and because it needs to be fast, it doesn't minify as well as something like UglifyJS […]

I won't address minification here. See T49437#6044157 for why.

I understand the security concerns, Is it possible to have a dedicated npm registry […]? We already maintain this kind of registry for docker, apt, and composer (through vendor). […]

Yes. Auditing software and deploying the audited software could be a viable direction for the build step. Definitely worth considering, I think. In terms of effort, it seems feasible to do this auditing for software that applies discipline around its use of dependencies and whose authors (ideally) themselves have a sense of what packages they're depending on.

Unfortunately, while this discipline is the norm in the Composer/PHP, Python, and Debian/Ubuntu ecosystems, it is not so common in the Node.js ecosystem. Case in point:

  webpack@4 dependency graph    eslint@6 dependency graph
  341 packages                  132 packages

[Screenshots: deps webpack_4.42.1.png (1×2 px, 703 KB); Screenshot 2020-04-09 at 23.33.52.png (1×2 px, 407 KB)]

This lack of discipline is never not inconvenient (install time, run time, dependency churn, etc.). But it is fairly tolerable when all you need is the result of a lint or test. For that, one can "just" run it in isolation without network access and limit its disk exposure to the source code it operates on (I wrote a little something about why and how). But when you're talking about making it produce source code to commit back to the repo, or to deploy in production, or to run on your users' devices, that's a different game entirely.

Fortunately, it's not all bad!

Firstly, much of the Node.js community revolves around developer tooling, and when that code is never needed in production, this inconvenience doesn't have to make you or your users vulnerable. Instead, the inconvenience can be limited to merely making the developer environment slow to install and hard to debug. Such execution is easily isolated. (But do contributors with some kind of production access use isolation every time they run npm test?)
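
A minimal sketch of that isolation, assuming Docker is available (the image tag and paths are illustrative): install with network access, then run the tests with networking disabled and only the working tree mounted.

  # Step 1: install dependencies (needs network access to the registry).
  docker run --rm -v "$PWD":/src -w /src node:10 npm ci
  # Step 2: run the tests with no network and only the source tree mounted.
  docker run --rm --network=none -v "$PWD":/src -w /src node:10 npm test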

Secondly, there is also still a lot of good software out there. It might not be the most popular in a given category, but that shouldn't matter.

Side note: While not affecting the stated security challenge therein, it would be possible to mitigate or even solve the operational challenge of downloading and installing libraries from npm by switching to Yarn and using Yarn 2's zero-install functionality.

A bit more detail: Yarn 2 is designed so that the cache can be committed into git and used as a package repository; there is very little documentation on that yet.
Yarn 1 did the same with its offline cache (article).
For the differences, see the migration guide.
The Yarn 2 compatibility table is also of interest.
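
A sketch of what opting a repo into Yarn 2 zero-install looks like; the file names follow the Yarn 2 documentation at the time of writing:

  # Opt the repository in to Yarn 2 and commit the package cache itself.
  yarn set version berry   # switches this repo to Yarn 2
  yarn install             # populates .yarn/cache with an archive of every package
  git add .yarn .pnp.js    # committing the cache makes later installs fully offline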

For completeness, there's also pnpm, which is easier to use than Yarn 2 (just use it in place of npm), but it's a one-person project and thus unlikely to be used by Wikimedia.

So the plan to move forward with this is for @JoeWalsh to collect stakeholder needs so the RFC becomes a specific implementation proposal. It is understood that performance and security are two key aspects of this and need to be considered in whatever is proposed.

@dcipoletti will be the point of contact for the specific implementation proposal moving forward. Requirements from stakeholders and his team's Vue.js work on search will inform the proposal.

dcipoletti changed the task status from Open to Stalled. Jun 26 2020, 3:00 PM

Thank you so much for openly sharing your collective perspectives within the Vue.js Development Workflow RFC [https://phabricator.wikimedia.org/T199004]. It provided a plethora of wonderful discussions and thoroughly spoke to the level of passion and importance we place on everything we do here at the Foundation every day!

Due to the complex discourse expressed within the RFC, there was a clear need to conduct a listening tour during the month of May. During these sessions, we had the opportunity to dive deep and hear the common frustrations with our front-end development environment and thus collect crucial considerations deserving of due experimentation. Over the course of the tour we slowly came to a realization of something much larger: in order to gather the right learnings, holistic perspectives, and an approach that ensures a clean and progressive developer experience, we needed to actually put code to screen and iterate on these ideas before conversations crept into the realm of conjecture.

This is why the Vue.js Development Workflow RFC will be marked as Stalled. We now must allow the Web Team to experiment on these approaches through the Search Component case study they will be owning. So I would like to take this opportunity to inform you of the practical next steps being taken:

  1. In order to unblock the major discovery work that is critically needed on the Search Component case study, the Web Team, in collaboration with members originating from teams across the Foundation, will begin prudently exploring solutions that fit the requirements and scope of the study.
  2. In line with the nature of case studies, the Web Team will be making autonomous technical decisions that contribute to (a) the case study’s larger goal of creating solutions that meet the technical needs of the team and (b) meeting the product acceptance criteria. Regardless, the Web Team will be strongly committed to transparency and will seek the advice of the Foundation’s broader engineering teams while exploring tradeoffs raised during the RFC phase, as needed. Furthermore, the team is committed to paying down the technical debt incurred if its localized approach does not become the canonical approach.
  3. During the case study, regular communication will be provided that will inform the broader Foundation of the status of the case study, learnings, pitfalls and key discoveries made in the Web Team Status Log [https://www.mediawiki.org/wiki/Reading/Web/Desktop_Improvements/Vue.js_case_study/Status_log]. A Learnings and Results Capstone will be provided when the project has reached an accepted functional state that is ready for review.
  4. Now that the initial phase of the RFC has concluded, the next formation of committees and/or broader discussions of the results are best suited to occur following the completion of a component workflow established by the Web Team. Ample time will be provided to address concerns and hold discussions before making a decision on a prospective Search Component release.
  5. A reminder: all work that is being performed will undergo security and performance reviews and procedures as usual. Although functionality and approach may be significantly different, the case study must abide by all standard Foundation release criteria protocols and best practices.

Henceforth, we will be seeking to achieve an excellent Wikipedia search experience for our customers around the globe while also taking the unique opportunity to discover, assemble, and prospectively recommend a sound development workflow for the incredible engineering folks here at the Foundation. We give thanks to the many, many participants on and off the RFC who have contributed to the discussion over the past couple of years, and especially to those who have invested in technical explorations that furthered the art in the years prior. Looking forward to collaboratively creating a better and brighter future and finding the magic that will make every keystroke that much more delightful!

Thank you all! Stay tuned :)

Daniel

@dcipoletti will be the point of contact for the specific implementation proposal moving forward. Requirements from stakeholders and his team's Vue.js work on search will inform the proposal.

Is the Vue.js work on search planning to use a build step, rather than the approach outlined at https://www.mediawiki.org/wiki/Vue.js of shipping the Vue runtime to the browser using ResourceLoader?

The docs outline both approaches. WVUI will use the runtime-only build of Vue and will use a build step.

The docs outline both approaches. WVUI will use the runtime-only build of Vue and will use a build step.

Sorry, this sentence is confusing to me.

Since this ticket is about exploring a frontend build step, and the comment from Daniel says that exploration will be based on the Vue.js search enhancement, I am assuming the Vue.js search experience project will be based on a build step rather than on shipping the Vue runtime to the browser (no build step). Please correct me if I am wrong.

@santhosh, only the runtime will be shipped for search. That's tracked in T252348. The WVUI component library will be compiled (built) and will expect this runtime to be provided. (It will also function fine if the compiler is present, but it doesn't require that additional overhead.)

It seems to me that the discussion on this RFC has been made difficult by the wide range of use cases and possible solutions covered by the umbrella of a "build step". For instance, the idea of pre-compiling Vue templates as described above by @Niedzielski seems relatively uncontroversial. On the other hand, automatically downloading npm packages along with their dependencies seems much more problematic. It is unclear whether the experiment @dcipoletti describes would involve automatic deployment of unaudited third-party code.

With this RFC marked as stalled, when and where would be the time and place to discuss strategies to mitigate the security concerns around third-party npm libraries? When and where should we discuss how a build step would be integrated with the cherry-pick/revert workflows that we use to address production errors?

Hello @daniel! Thank you for bringing the package manager aspect of the conversation back into proper light, as it is indeed a core focus of the Search Component case study. You are certainly correct in the assessment that an approach allowing unaudited packages and their dependencies to be automatically downloaded would be problematic. This understanding is certainly shared and, to be clear, there are no circumstances under which we would entertain an insecure, unaudited approach.

The formal investigation around a package manager solution is taking proper shape as we will not be moving forward without a resolution. The only possibilities being explored are those that include approaches such as hosting a repository of Foundation-audited packages (and those in their dependency tree) that have gone through a full round of security review. In parallel we are also discussing the exploration of sandboxed environments with pinned versions of packages as well.

Conversations of this initiative are assuredly forthcoming. Nothing will be performed in a closed bubble. Communication and collaboration are key to the case study’s credo and its findings. There is actually quite a bit of setup and due diligence on the discovery side that must go into these solutions first, so that the discussions occurring here on planet T199004 thereafter are conducted with an informed perspective. This can only be done if the various teams involved are given the proper time to evaluate the aforementioned solutions and gather a few key results. Deeper discussions and guidance from the Security Team will be taking place in the upcoming weeks. Very exciting steps to ensure a secure approach! The output of those conversations will be shared.

As for the cherry-pick/revert workflow, @Niedzielski will be digging into the finer details with RelEng to construct a proper solution that does not break the expected workflow. A few experiments would do a lot of holistic good for the Foundation and bring valuable learnings to the table. Focussing on the results, with all the “worked”/“yikes, didn’t work” realizations at hand, will be of great worth to the progression of this RFC. I do, though, encourage everyone to continue asking questions and raising concerns as they come along.

So, as you can see, the need to dig into the concerns expressed thus far was indeed a factor in the overall RFC being marked Stalled. I see this bit of breathing room to properly approach the above discussions as highly beneficial for us all. On a most positive note, you can definitely look forward to discussions much richer in context, with a copious amount of amazing learnings, within the upcoming weeks :) Thank you all!

With this RFC marked as stalled, when and where would be the time and place to discuss strategies to mitigate security concerns of third party npm libraries?

As a starter, I've given a brief introduction to the Yarn 2 and pnpm package managers and their impact in my email to wikitech-l and in T199004#6047719.
These package managers solve the issue of controlling what packages and versions are deployed with as simple a solution as a git repository used as a package store. With that, it's only a matter of adapting libraryupgrader2 to track our dependency trees in full depth and keep the package store up to date with audited packages.
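
As a concrete sketch of the git-repository-as-package-store idea, using Yarn 1's offline mirror (the mirror path is hypothetical):

  # Tell Yarn 1 to keep a tarball of every dependency in a directory we commit to git.
  yarn config set yarn-offline-mirror ./npm-packages-offline-cache
  yarn install                          # copies each package tarball into the mirror
  git add npm-packages-offline-cache    # the audited package store, versioned in git
  yarn install --offline                # later installs resolve from the mirror only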

Thank you for the detailed response, @dcipoletti! I'm looking forward to learning more once you have progressed in your experimentation.

As to maintaining a git repo with the fully expanded dependency tree, as @Demian describes: we are doing that for Composer dependencies in the mediawiki-vendor repository. It works, but the process of keeping it in sync with MediaWiki core is unfortunately cumbersome. The RFC from 2014 about this can be found at https://www.mediawiki.org/wiki/Requests_for_comment/Composer_managed_libraries_for_use_on_WMF_cluster.

The formal investigation around a package manager solution is taking proper shape as we will not be moving forward without a resolution. The only possibilities being explored are those that include approaches such as hosting a repository of Foundation-audited packages (and those in their dependency tree) that have gone through a full round of security review. In parallel we are also discussing the exploration of sandboxed environments with pinned versions of packages as well.

Conversations of this initiative are assuredly forthcoming.

@dcipoletti Based on my observations from using these tools for development, I've started drafting RfC: Evaluate alternative Node package managers for improved package security.

I believe the RfC would be conducted on T257068. Please see if this is a good format for an RfC; your comments are welcome.

NOTE: (for everybody) This is a DRAFT, please discuss contents of the RfC on the wiki talkpage.

To clarify the scope: I see three separate topics to discuss. I've created tasks to untangle the discussions in this umbrella ticket. In the sequence of data flow:

  1. T257072: Determine Node package auditing workflows to systematically audit every version of every dependency (end result: libraryupgrader2)
  2. T257061: Evaluate the workflows and workload of managing a Node package repository - the repository storing the audited package versions
  3. T257068: Draft: RFC: Evaluate alternative Node package managers for improved package security - The package manager distributing packages from the repository to CI and developer nodes
kostajh subscribed.

I'm closing this in favor of T279108: Introduce a Front-end Build Step for MediaWiki Skins and Extensions, which is more or less the same proposal, but with the tech-decision-forum instead of TechCom. (If someone disagrees and feels that this task should be re-opened, feel free to do so, but I am not sure why we'd want to.)