Page MenuHomePhabricator

Decide whether we want the package-lock.json to commit or ignore
Open, Needs TriagePublic

Description

npm version 5 adds a package-lock.json [1] to each repository. Do we want to have this committed or ignored?

I would suggest not to commit the file, because composer.lock is also not committed. This file has the same effect and should handled the same.

There are two extensions at the moment having this file committed:

AdvancedSearch - I0f591d75d0a7b8755b446398e753507b36db15b1
WikibaseLexeme - Icc91a224e264c9b33df70254958b0af95a463d03

A decision before many developer using npm5 and committing this would be nice to have a specification how to handle this.
In case of ignore, mediawiki core, all extensions and all skins with node_module in .gitignore should get a line added to its .gitignore. Committed files should be removed.

[1] https://docs.npmjs.com/files/package-lock.json

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptOct 28 2017, 9:50 AM

Just from my experience: The npm team clearly states (in their documentation) that the file "is intended to be committed into source repositories". However, for a customer I worked for, commiting the package-lock.json file to git created several problems with their CI infrastructure (which had no direct internet access, though, just through a proxy). This could be a very specific problem, which does not apply to Wikimedia infrastructure, however, I really don't see a benefit for us when commiting the file (like I do not see a benefit with composer's lock), as the CI should always use the newest compatible version as specified in package.json when testing a commit. Package-lock.json prevents it (that's the purpose).

So my opinion is: Ignore this file :)

If we can use a lock file to pin versions instead of hardcoding in package.json that might be nice. But what happens if a developer is using an older version of npm like v3 or v4? Will that file be respected?

Aklapper renamed this task from Decide wether we want the package-lock.json to commit or ignore to Decide whether we want the package-lock.json to commit or ignore.Oct 29 2017, 10:22 AM

If we can use a lock file to pin versions instead of hardcoding in package.json that might be nice. But what happens if a developer is using an older version of npm like v3 or v4? Will that file be respected?

Nope. But the next npm5-using dev that comes along will notice it immediately.

Of course, if we upgrade CI to use npm5, master will never be broken, which also works.

brion added a subscriber: brion.Jan 16 2018, 10:02 PM

The issue with OS-specific dependencies seems to be a potential dealbreaker? Rumor on https://github.com/npm/npm/issues/17722 is that something in 5.6.0 is supposed to fix it but that it may not fix it in all cases.

Most common OS-specific dep will probably be 'fsevents' for things like grunt that have tools that watch the filesystem. So this may or may not actually be an issue for things that don't need dev deps....

There's now a non-trivial reason to prefer having these in git, albeit blocked by switching CI to use npm 5.7+: http://blog.npmjs.org/post/171556855892/introducing-npm-ci-for-faster-more-reliable

Jhernandez added a comment.EditedMar 7 2018, 6:43 PM

I think we should have them committed as npm recommends, as they give you full reproducibility in CI and in other co-workers machines without resorting to manual version pinning.

Right now people resort to pinning versions in package.json which makes npm update useless. Instead, we should use the provided mechanism in the lock file so that we could rely on npm update and inspecting the diff changes in package-lock.json to see what's changed or being upgraded.

We've recently filed T189805 to discuss the same question for node.js services, but I've merged it in here as a duplicate as we have to come up with a strategy common to both MediaWiki and services.

We've recently filed T189805 to discuss the same question for node.js services, but I've merged it in here as a duplicate as we have to come up with a strategy common to both MediaWiki and services.

I disagree - there's no reason both policies have to be the same. Extensions and services JS policies differ in a lot of ways, actually - style being one of them. Extensions use tabs, services spaces, etc.

I do think services should commit package-lock.json but I don't have strong feelings about extensions. Some pros:

Security issues. We've started adding nsp to Jenkins which has voting privileges. nsp is crippled by the lack of package-lock.json - I currently have packages which pass jenkins *because* there is no package-lock.json committed. These packages fail the nsp check locally where I have a local version if package-lock.json This is MORE CRITICAL for services because this means there may be packages being used in production that have Security issues, but are erroneously being passed by nsp because of the missing package-lock.json. I've filed a ticket with nsp here: https://github.com/nodesecurity/nsp/issues/243 (no response yet) but have not debugged the issue extensively so I'm not entirely sure why. But this is less critical for extensions, where npm and package-lock.json is mostly being used locally for linting purposes and so don't actually make the extensions vulnerable. It's possible that nsp will be fixed to generate all possible trees without the use of package-lock.json but I'm sceptical.

Standards compliance. In services some of our packages are npm packages that are being used by others. Node is recommending that the file be committed. With extensions, these are all being used by mediawiki, so these packages aren't going to be used in different projects, so it's less of an issue. I have already received a pull request by a third party user from one of our packages to add package-lock.json: https://github.com/wikimedia/html-metadata/pull/72. I know "everyone else is doing it" is not the most convincing argument but there you go :).

And kind of a mixed arg:

npm install behaviour
So as of npm ^5.1.x, package.json trumps package-lock.json. Unlike with composer.lock, a repository with a locked package file doesn't change the behaviour of npm install. npm install will install updates with ^ repos, which overwrites the old package-lock.json which then has to be re-committed. So this addresses some of the concerns that we'll have to manually update packages; this is not the case with top-level packages. ✔️

But, I think what it does do is change the dependency tree for *other* packages using the lock file that are included. (@Pchelolo I think I slightly misled you in the other thread... so for preq, which is mostly used as a dependency by other packages, I *think* it would use the locked version of the package when say being used for example by service-runner; so the lock file would have to be periodically re-generated.) https://docs.npmjs.com/files/package-locks#using-locked-packages. As committing package-lock.json becomes more common though that's going to be the case for a lot of our dependencies anyway. But I guess this is partially an argument against using them in our dependencies at least. ❌

The problem with native dependencies seems a pretty strong argument to me to git-ignore it for the time being, and revisit the decision at a later point.

Security issues. We've started adding nsp to Jenkins which has voting privileges. nsp is crippled by the lack of package-lock.json - I currently have packages which pass jenkins *because* there is no package-lock.json committed. These packages fail the nsp check locally where I have a local version if package-lock.json This is MORE CRITICAL for services because this means there may be packages being used in production that have Security issues, but are erroneously being passed by nsp because of the missing package-lock.json. I've filed a ticket with nsp here: https://github.com/nodesecurity/nsp/issues/243 (no response yet) but have not debugged the issue extensively so I'm not entirely sure why. But this is less critical for extensions, where npm and package-lock.json is mostly being used locally for linting purposes and so don't actually make the extensions vulnerable. It's possible that nsp will be fixed to generate all possible trees without the use of package-lock.json but I'm sceptical.

I honestly fail to understand why having package-lock.json changes the security perspective. The check is run upon creating the commit, and taking the dependencies at that point in time happens with or without that file. It seems to me like that's an NSP design flaw rather than a security issue.

Standards compliance. In services some of our packages are npm packages that are being used by others. Node is recommending that the file be committed. With extensions, these are all being used by mediawiki, so these packages aren't going to be used in different projects, so it's less of an issue. I have already received a pull request by a third party user from one of our packages to add package-lock.json: https://github.com/wikimedia/html-metadata/pull/72. I know "everyone else is doing it" is not the most convincing argument but there you go :).

npm install continues to behave the same. Moreover, if packages that include our packages have the file, then they will have locked dependencies either way.

And kind of a mixed arg:

npm install behaviour
So as of npm ^5.1.x, package.json trumps package-lock.json. Unlike with composer.lock, a repository with a locked package file doesn't change the behaviour of npm install. npm install will install updates with ^ repos, which overwrites the old package-lock.json which then has to be re-committed. So this addresses some of the concerns that we'll have to manually update packages; this is not the case with top-level packages. ✔️

Heh, to me this is an argument contra the file, since source repo commits will then start containing the (unrelated) changes to package-lock.json. Now, whether we should start using the file in the deploy repos, that's a different question/issue that should be discussed separately.

But, I think what it does do is change the dependency tree for *other* packages using the lock file that are included. (@Pchelolo I think I slightly misled you in the other thread... so for preq, which is mostly used as a dependency by other packages, I *think* it would use the locked version of the package when say being used for example by service-runner; so the lock file would have to be periodically re-generated.) https://docs.npmjs.com/files/package-locks#using-locked-packages. As committing package-lock.json becomes more common though that's going to be the case for a lot of our dependencies anyway. But I guess this is partially an argument against using them in our dependencies at least. ❌

Exactly. Using locked deps on each level quickly spirals into having many versions of the same package.

Exactly. Using locked deps on each level quickly spirals into having many versions of the same package.

Just to be clear, the recommendation as far as I've read it is to commit the lock file into applications (extension, service, mediawiki, skin) and never commit it into libraries to avoid nullifying the flattening and library sharing behavior of npm.

From the recent attack on eslint packages:

Postmortem: https://eslint.org/blog/2018/07/postmortem-for-malicious-package-publishes

Recommendations
With the hindsight of this incident, we have a few recommendations for npm package maintainers and users in the future:

  • Package maintainers and users should avoid reusing the same password across multiple different sites. A password manager like 1Password or LastPass can help with this.
  • Package maintainers should enable npm two-factor authentication. npm has a guide here.
  • Package maintainers should audit and limit the number of people who have access to publish on npm.
  • Package maintainers should be careful with using any services that auto-merge dependency upgrades.
  • Application developers should use a lockfile (package-lock.json or yarn.lock) to prevent the auto-install of new packages.

And devs and CI should use npm versions that respect the lock files (5+)

This doesn't really apply to our services in production since the only way that newer package versions can get it is by manually triggerring a rebuild of the dependencies (or bumping them in package.json) by the service owner/deployer.

cscott added a subscriber: cscott.EditedJul 13 2018, 3:21 PM

FWIW, Parsoid commits the package-lock.json in our repo. With it, we were able to quickly verify that we never deployed a bad eslint-related package in the last compromise, nor would any of our developers inadvertently have done so if they happened to npm install to work on Parsoid at a vulnerable time. I think that's a compelling argument for continuing to use lock files for any "top-level" (deployable) WMF projects. I'd include extensions, services, and standalone apps in this.

Informally, we've found that it is useful to be able to audit package upgrades in terms of source-level changes to the package-lock, which then go through our usual gerrit review process. On the other hand, we have had great difficulty with older npm versions in regenerating the lock files without dirty diffs. Precisely matching the version of npm seems to be required, and even so unrelated changes creep in. I am cautiously optimistic that using more recent versions of npm help with this issue, but exact line-for-line reproducibility has never seemed to be a fundamental goal of the npm team.

On the other hand, we also maintain a number of npm libraries. Members of our team have commit (and perhaps publish) access to: domino, a fork of pegjs, babybird, prfun, core-js, mediawiki-title, pn, service-runner, eslint-config-wikimedia, jsdoc-wmf-theme, and probably a few others. I think the status of lock files is less clear here -- and the danger of the eslint compromise is illustrated by the fact that (a) many of these use eslint in dev scripts, and (b) and (to quote man npm shrinkwrap): "Since "npm shrinkwrap" is intended to lock down your dependencies for production use, devDependencies will not be included unless you explicitly set the --dev flag when you run npm shrinkwrap."

I think a careful audit is most needed for the ancillary packages. FWIW, I released a new version of domino yesterday, and I chose to include a package-lock.json file in the release. The previous release also had a package-lock.json file---it might be the new default for npm publish, and using a recent version of npm is necessary if you have 2FA enabled (which I hope all of us do!).

EDIT: Note that npm-shrinkwrap.json and package-lock.json appear to be the same thing roughly equivalent (Parsoid actually uses the former, and domino the latter). I dunno what the rationale was for renaming the file, although I assume it is to coexist nicer with non-npm package managers. EDIT3: see @Jhernandez's comment below for diffs.

EDIT2: I implied this, but it's worth being explicit: I am strongly encouraging the use of the --dev option to npm shrinkwrap, since our dev tools are strongly shared among projects and thus constitute an important vulnerability.

@cscott I thought that npm would ignore package-lock.json when publishing libraries, so I've checked and it doesn't seem to be included in the package on purpose (see in unpkg for example).

From the docs:

One key detail about package-lock.json is that it cannot be published, and it will be ignored if found in any place other than the toplevel package. It shares a format with npm-shrinkwrap.json, which is essentially the same file, but allows publication. This is not recommended unless deploying a CLI tool or otherwise using the publication process for producing production packages.

If both package-lock.json and npm-shrinkwrap.json are present in the root of a package, package-lock.json will be completely ignored.

Seems like you'd need the shrinkwrap to get the effect you are looking for.

Ah, so Parsoid (a top-level app) including npm-shrinkwrap.json in its repo while domino (a library) including package-lock.json is entirely correct. That way domino developers are protected from unexpected packages, but upstream users of domino are free to chose alternate package resolutions to maximize library sharing.

I think we should have the same behavior we have for Composer, which is similar to the behavior of package-lock.json. Basically, we should be consistent with what we do with Composer (which is to not commit the lock file).

This really depends if you see MediaWiki as a library to be installed by a higher level project. Or if you see MediaWiki as the project itself. The former would argue no lock file (as it would be ignored anyways) the latter would argue yes lock file.

You cannot change dependencies of libraries, but you can change dependencies of projects (since you own the project). Excluding special plugins (like Composer merge plugin), what is in composer.json or package.json is fixed (unless you fork it). If it's fixed, then it doesn't matter if you include the lock or not, as the package manager will ignore it depending on the context you are using it. If you are using it as a project or for development it will use it, if you are loading it as a library it will not.

Therefore, I think we should commit package lock files (composer.lock and package-lock.json) and let the package manager decide to use it, or ignore it depending on the context.

For MediaWiki, the composer.json files of each extensions are merged in a huge blob of dependencies. We can not use composer.lock files there since there will irremediably be version conflict between extensions. And we can only have one version of each package in PHP.

Instead the merge result is sent to mediawiki/vendor.git which has a composer.lock file. That is the central ivory tower of composer dependencies suitable for deployment to the Wikimedia cluster.

For npm it is slightly different. , I guess each extensions/skins might well ship their own javascript modules locked at a specific value. The node modules are never merged together, and dependencies for each extensions are namespaced via the resource loader (as I understand it).

What I am trying to say, is the JavaScript and PHP dependencies are used differently in MediaWiki. JavaScript ones are namespaced and can be locked at any version and co exist with several versions (afaik). The PHP ones are all in the same global scope and need loose versions, the version matching all loose requirements end up being elected and package locked for deployment.

So it is probably fine to use a package-lock.json for extensions.

AdvancedSearch - I0f591d75d0a7b8755b446398e753507b36db15b1
WikibaseLexeme - Icc91a224e264c9b33df70254958b0af95a463d03

(Note CI uses npm 3.8 and does not run npm install when installing the extension T198919 . The node_modules have to be manually committed).

For npm it is slightly different. , I guess each extensions/skins might well ship their own javascript modules locked at a specific value. The node modules are never merged together, and dependencies for each extensions are namespaced via the resource loader (as I understand it).

Yes, ResourceLoader can provide namespacing. However, that is not relevant to this task because we do not (and cannot) register files from node_modules with MediaWiki.

This task is about production code for Node.js services, and (for MediaWiki core/extensions) only about development tools on the command-line in Jenkins (and locally) with Node.js, e.g. npm test.

Note that for MediaWiki core/extensions there are ones that use npm as the delivery mechanism for libraries that are then registered with resource loader (via some script that copies from node_modules to somewhere else, for example). The video.js example comes to mind. Also Extension:Popups uses npm too to bring Redux from npm. In Popups for example we lock dependencies in package.json not using carets because our CI npm doesn't support the lock file, although we do commit it so that developers get exact versions on subdependencies too.

In those cases, npm is used to bring the library to the project which would end up served by RL, not only for dev tooling, and package-lock.json would be very useful to ensure that all versions are locked tight and upgrades are explicit.

Popups already includes the lock file in the repo so that library upgrades are explicit and devs get the same versions of tools and libraries, but it would be great if CI supported it too, and all developers and projects in Wikimedia followed the same convention (npm 5 or 6+ and lock files committed always).


Slightly unrelated (or maybe not) but note that npm@3.8.9 was published 29/04/2016, more than 2 years ago, and the last 3.X version (3.10.10) from npm was published 2016-11-05 (1 year, 8 months ago), so it may be time to consider upgrading the tooling in our infrastructure.

npm 5 introduced the lock file support and better caching and more reliable and performant installs, and npm 6 has included security audits from nsp into the CLI by default.

Mvolz added a comment.Mar 21 2019, 9:41 AM

@Jdforrester-WMF you said elsewhere this was resolved?