
Guidelines for Rust/Go tools deployment
Open, Low, Public

Description

I started doing some research for T220669. On my shortlist of tools suitable for the job, one is in Rust, the other in Go (with a current preference for the Rust one so far).
I now have some experience with Python packaging via https://wikitech.wikimedia.org/wiki/Git-buildpackage, but I have no idea what I should do for Rust/Go.

That's why I was wondering whether there are any guidelines for deploying Rust/Go applications in production (most likely to a dedicated Ganeti VM). I think this mostly comes down to: how do I package them the proper way?
I'm asking about both languages because the complexity of the packaging will in some way influence the decision (e.g. if one is significantly more difficult).

Maybe those guidelines could be generic, but as for the specifics of my tools, the Rust one can very easily be packaged into a deb using "cargo deb" (see https://github.com/NLnetLabs/routinator/blob/master/Cargo.toml#L44). I've done it on my laptop.

Event Timeline

Joe triaged this task as Low priority. Jun 21 2019, 10:01 AM

We wanted to get into a discussion about Go packaging at the offsite, but we had no time for it.

My current proposal for it would be the following:

  • Use "go mod" under golang >= 1.11 to create a frozen list of dependencies for the Go project:
    • If the project already ships its dependencies vendored, generate go mod output matching those.
    • If not, create the vendor directory using go mod and commit it as part of the sources.
  • This way, the package can simply be built without depending on packaged Go source libraries (see the sketch below).
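For illustration, a minimal sketch of that workflow, assuming Go >= 1.11 and a hypothetical project "example-tool" (the module path and names are illustrative):

  cd example-tool
  go mod init example.org/example-tool  # only needed if there is no go.mod yet
  go mod tidy                           # pin the full dependency graph in go.mod/go.sum
  go mod vendor                         # copy the pinned dependencies into ./vendor
  git add go.mod go.sum vendor/ && git commit -m "Vendor dependencies"
  go build -mod=vendor ./...            # later builds use only the committed vendor tree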

For Rust I have no real insight; maybe @MoritzMuehlenhoff has some experience with it?

+1 to what @Joe said. There are some challenges with that approach: some Go projects and libraries require the very latest Go version, so it could add the prerequisite of packaging golang itself for use as a build dependency.

Rust has a tool similar to go mod: https://doc.rust-lang.org/nightly/cargo/commands/cargo-vendor.html
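For illustration, a hedged sketch of how cargo vendor is typically used (assuming a reasonably recent cargo; the exact configuration snippet is printed by the command itself):

  cargo vendor                  # download all crates into ./vendor
  # cargo vendor prints a snippet for .cargo/config that redirects
  # crates.io to the local "vendor" directory; commit both
  git add vendor/ .cargo/config
  cargo build --offline         # build using only the vendored crates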

There is also a cargo plugin that generates a Debian package: https://docs.rs/cargo-deb/1.18.0/cargo_deb/
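As a quick sketch of how that plugin is used (Debian-specific metadata such as the maintainer and extra assets goes into a [package.metadata.deb] table in Cargo.toml, as in the routinator example linked in the description):

  cargo install cargo-deb   # one-time install of the helper
  cargo deb                 # builds target/debian/<name>_<version>_<arch>.deb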

+1 on using modules. Since go mod vendor is even an option, and it does some basic hash checking, it seems sensible. I agree it requires a recent golang, but previous versions were packaging chaos in general. Since Go usually produces a statically linked binary, it could be deployed either from a "scratch" Docker image (in k8s), which contains literally nothing else, or as a deb package, which would not be terribly hard to generate compared to packaging an interpreted program or a dynamically linked binary.
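For illustration, a minimal sketch of the "scratch" approach as a multi-stage Dockerfile (the image tag and binary name are hypothetical, and it assumes the vendored-modules layout discussed above):

  FROM golang:1.12 AS build
  WORKDIR /src
  COPY . .
  # CGO_ENABLED=0 forces a fully static binary suitable for "scratch"
  RUN CGO_ENABLED=0 go build -mod=vendor -o /example-tool .

  FROM scratch
  COPY --from=build /example-tool /example-tool
  ENTRYPOINT ["/example-tool"]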

Rust has a strong offering in the cargo model, which happens to fall outside our rules. Using crates is mostly a build-time issue unless you are dynamically linking (which is more common in Rust than in Go, where this is almost entirely a build-time discussion). I suspect that using cargo-deb would be both necessary and sensible for actually deploying things here. Since Rust usually wants at least a runtime (though static linking is possible), I imagine that would streamline producing containers and packages for our consumption. Crates are not normally shipped as OS packages, so to use open-source Rust code you really cannot rely entirely on debs during the build without a lot of work, from what I recall of my work in that area.

Part of what I think the question here is: is having CI pull a crate or check out something for a Go module equivalent to installing things outside of the rules? Or do you have to do that locally and then commit it to your gerrit repo, which is functionally identical but better obeys the letter of the rules? What do we do for PHP libraries?


Composer's package management system doesn't provide any signing or strong guarantees of reproducibility, so we have invented a process to make things a bit more traceable/auditable for Wikimedia Foundation production MediaWiki deploys. Composer-managed PHP libraries are manually submitted to the mediawiki/vendor.git repo for review before being available for use in the beta cluster and beyond in our deployment progression. The process is documented on mediawiki.org. It is not beautiful, but it is serviceable. It came in part from the RFC that is now Manual:Developing_libraries and in part from implementation details that @hashar worked out when we were implementing that RFC in 2014.

In the CI environment, composer libraries are also pulled from other locations as defined in the composer.json files for projects under test. @hashar can probably give a lot more detail about that process if anyone wants to know how it works.

What Bryan said: for PHP we snapshot the dependencies in a git repository we control and prevent the build/deployment from downloading random code from the unknown. For MediaWiki deployment to production, we have a mediawiki/vendor.git repository which contains a composer.json and composer.lock that we manually update with something like:

  • composer require foobar/foobar:1.2.3
  • composer update
  • <review code>
  • git add && git commit

Production then uses that snapshot from mediawiki/vendor.git and never, ever runs composer update.

For Python we have a few different systems; I am not sure we have a best practice set in stone. ORES has its dependencies built as Python wheels, which are added to the git repository using Git LFS to offload the big files elsewhere. The list of requirements is frozen in a dependency file that the pip package installer uses to download the proper versions. pip does have support for SHA checksums to validate the downloaded material, but I do not think we enforce it. For Zuul, I went with a Debian package that uses a virtualenv containing all the dependencies at the proper versions, but that is the only software I know of using that method; all the others use committed dependencies in a git repo and are deployed via scap.
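For what it is worth, a hedged sketch of pip's hash-checking mode, should we want to enforce it (the package name and digest are illustrative):

  pip download requests==2.22.0 -d wheels/              # fetch the artifact
  pip hash wheels/requests-2.22.0-py2.py3-none-any.whl  # prints the sha256 to pin
  # requirements.txt then carries the pin, e.g.:
  #   requests==2.22.0 --hash=sha256:<digest>
  pip install --require-hashes -r requirements.txt      # refuses anything unpinned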

For the mediawiki/services, we used to commit the node_modules to a deployment repository containing both the committed node_modules and a submodule pointing to the source code. An example is mediawiki/services/parsoid. Most of those services are now being migrated to a deployment pipeline which runs npm install when generating a Docker container, which is a way to snapshot the dependencies. A package-lock.json is committed in the repository to freeze the dependencies, but beyond that I don't think there is any specific validation.
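As a sketch of the freeze-versus-validate distinction there (assuming npm >= 5.7):

  npm install   # resolves dependencies and may update package-lock.json
  npm ci        # installs exactly what package-lock.json records and
                # fails if it disagrees with package.json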

We went through those systems because maintaining Debian packages for various languages and a myriad of dependencies is rather challenging. There are issues when different pieces of software have conflicting dependencies and end up installed on the same host, and there are apt repos/preferences to be added which could also conflict or suddenly upgrade a package. So it is easier to just commit what we need. The Docker filesystem snapshot goes a step further by also committing the system libraries.


For Go, the deployment pipeline does have support for Go projects. The first project taking advantage of it is Blubber (part of the Release Pipeline), and its deployment pipeline configuration is at https://phabricator.wikimedia.org/source/blubber/browse/master/.pipeline/ . If the utility is suitable for deployment to Kubernetes, I highly recommend using that system. Note that the Go dependencies are committed to the repository under a ./vendor directory; I think that is due to the lack of a proper module manager in golang at the time we started Blubber. Dan would know more but he is out for a few weeks; regardless, I am sure we can provide assistance.

For Rust, back in 2016 I managed to create a Debian package using cargo. The devil is that Rust was not available in Jessie at the time, so I had to compile against a toolchain from outside the distribution. I then ran cargo directly when creating the package (https://gerrit.wikimedia.org/r/#/c/operations/debs/geckodriver/+/294293/3/debian/rules, T137797), though that was without any control over what was downloaded. Anyway, as others said above, there is a Debian helper for cargo nowadays.
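For illustration, a minimal debian/rules along those lines (a hedged sketch, not the exact linked change; make recipes need tab indentation, and a real package also needs the usual debian/control and friends):

  #!/usr/bin/make -f
  %:
  	dh $@
  override_dh_auto_build:
  	cargo build --release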


The summary, in my opinion, is that both Go and Rust are packageable in one way or another, and you should just pick the best tool for your needs regardless of its implementation language. I would tend to favor a Go implementation since we have prior experience with it, but that should not prevent us from picking a Rust-based one if that software better suits your project requirements.

Then, depending on the deployment target:

  • Kubernetes: the chosen software should go through the deployment pipeline and use Blubber to generate a Docker container that contains the built binary.
  • Regular install: build a Debian package with the dependencies committed in the source tree, with the Debian package build producing the binary.

I don't think we can use scap: it does not really have support for a build/compile step, and I don't think the binary should be committed to the repository. Scap is more appropriate for interpreted languages such as Python or PHP.