Support hosting Rust tools on Toolforge
Open, Low, Public

Description

I have experimented with writing small Wikidata tools using Rust. I was wondering what a good way of hosting Rust tools on Toolforge is.

It would be nice if this ticket could result in a guide on how best to deploy Rust tools, and on whether they need a specialized container or some special configuration of an existing container (e.g. whether cargo or Rust nightly should be available).

The example is here:

https://crates.io/crates/wikibase_rs_rocket_example

https://gitlab.com/tobias47n9e/wikibase_rs_rocket_example

Event Timeline

Are you looking for guidance on a webservice tool, or a command line script?

Bstorm added a subscriber: Bstorm.

Current tooling would seem to allow anyone to use rust in a tool's home dir. Establishing some kind of supported procedure for launching a rust service is another matter.

Bstorm moved this task from Doing to Inbox on the cloud-services-team (Kanban) board.
Legoktm renamed this task from Decide on how to host Rust tools on Toolforge to Support hosting Rust tools on Toolforge.Jun 2 2020, 6:11 AM
Legoktm added a subscriber: Sigma.

I spent today playing around with writing a Rust tool and trying to deploy it onto Toolforge (and failed). I'm still pretty new to Rust, so please correct me if I'm wrong. This is aimed at deploying rust webservices on k8s (though the "Getting rust" part is applicable to grid jobs too).

Getting rust

Debian buster/stretch has rustc 1.34.2 and cargo 0.35.0 packaged. This will probably work for simple rust applications, but depending on dependencies, some tools might want a newer version of rust, or even a nightly version. (The web framework I picked, rocket.rs, only supports nightly rust versions :|)

The official/endorsed way to get rust is via rustup, a rust application that is typically downloaded via curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh. It downloads the rust toolchain for whatever version you want and sticks it in your home directory, and allows for different toolchain versions for different directories. It's worth noting that currently rustup has no security/integrity checking of binaries that it downloads besides HTTPS.

I think the ideal scenario is in the middle as described at https://github.com/rust-lang/rustup#working-with-distribution-rust-packages - we provide rustup, but by default it just uses the packaged rust/cargo. If people want a different version, they can use rustup to download it (e.g. rustup toolchain install nightly). My current thinking is that we bundle the rustup bash script with the docker container, and on first run, set everything up. Maybe via a webservice-rust-bootstrap script like Python.

Building applications

Usually done via cargo build --release, but each dependency will need to be compiled from scratch. This turned out to be incredibly slow on Toolforge. I did it on dev.tools.wmflabs.org aka tools-sgebastion-08.

toolforge-rust-slow.png (250×800 px, 19 KB)

A quick look via top showed that each individual rustc process never hit more than 5% CPU - maybe that's a Toolforge limit? We might want to consider bumping it for rust somehow. I left for dinner and when I came back, it still hadn't finished building.

Deploying a webservice

rust applications run their own HTTP server. Some frameworks like rocket allow configuring ports/hostnames/etc via environment variables but others like hyper seem to prefer hardcoding them. I think we can use a framework-neutral environment variable (e.g. RUST_PORT) or something to communicate that info to applications. Maybe a Toolforge rust crate provides the glue for frameworks we use.
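A minimal sketch of what that glue could look like, using only the standard library. RUST_PORT is the hypothetical variable name floated above, not something Toolforge sets today, and parse_port / bind_addr_from_env are made-up helper names. The fallback of 8000 is an assumption, chosen because that is the port mentioned later in this thread for Kubernetes webservices.

```rust
use std::env;
use std::net::{Ipv4Addr, SocketAddr};

/// Fallback port (assumption: 8000, used when the variable is unset or invalid).
const DEFAULT_PORT: u16 = 8000;

/// Parse a port from an optional string, falling back to DEFAULT_PORT
/// when the value is missing or is not a valid u16.
fn parse_port(raw: Option<&str>) -> u16 {
    raw.and_then(|s| s.trim().parse::<u16>().ok())
        .unwrap_or(DEFAULT_PORT)
}

/// Build the address to bind to, reading the hypothetical RUST_PORT variable.
/// Binds on all interfaces so the container's proxy can reach the service.
fn bind_addr_from_env() -> SocketAddr {
    let raw = env::var("RUST_PORT").ok();
    SocketAddr::from((Ipv4Addr::UNSPECIFIED, parse_port(raw.as_deref())))
}
```

A framework that takes a SocketAddr (or a port number) could then be handed the result of bind_addr_from_env() at startup instead of a hardcoded value.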

Then start the web server via cargo run --release.

I would suggest we put applications in ~/www/rust, so you'd have ~/www/rust/Cargo.toml. I don't think we need a src subdirectory like Python since most rust code is already in src.

Is there a movement towards Rust in any significant segment of the Wikimedia movement? Are there clear benefits for Toolforge tools to be written and maintained in Rust?

I ask these questions not to be snarky or mean, but from a place of genuine concern for our collective ability to support yet another language runtime and associated tool chain in the Toolforge project. Each runtime we add has a long term support cost. Most users probably do not see that cost, but having spent the last 4 years supporting Toolforge and Cloud VPS as my full time job I certainly have. Things like this may become easier at some point in the future when Toolforge has support for build packs or some similar system for creating custom Docker images. Until then, I do not see a strong benefit for tool maintainers in adding official support of Rust that outweighs the costs of providing that support.

If someone wants to go outside the officially supported language runtimes, we will not do anything to stop you. But you will be on your own primarily for figuring out problems and working around them. By reading the source code of the webservice command and its related Python packages you can figure out which Kubernetes containers use the "GenericWebService" runner class at startup. GenericWebService is a runner that leaves everything up to the extra arguments passed to webservice start (or stored in $HOME/service.template). Using this one could either directly or through a small helper script execute the cargo run --release described by @Legoktm. The expected port will always be 8000 for Kubernetes webservices.
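For illustration, here is a rough standard-library-only sketch of the kind of program such a generic webservice could launch: it listens on the port it is given (8000 per the paragraph above) and answers every request with a fixed plain-text body. The function names are made up for this example, and a real tool would more likely use a web framework than hand-written HTTP.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

/// Build a minimal HTTP/1.1 plain-text response for the given body.
fn http_response(body: &str) -> String {
    format!(
        "HTTP/1.1 200 OK\r\nContent-Type: text/plain; charset=utf-8\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(),
        body
    )
}

/// Read (and ignore) the start of the request, then reply with a fixed body.
fn handle(mut stream: TcpStream) -> std::io::Result<()> {
    let mut buf = [0u8; 1024];
    let _ = stream.read(&mut buf)?;
    stream.write_all(http_response("Hello from Rust\n").as_bytes())
}

/// Listen on all interfaces on `port` and serve connections one at a time.
fn serve(port: u16) -> std::io::Result<()> {
    let listener = TcpListener::bind(("0.0.0.0", port))?;
    for stream in listener.incoming() {
        // A single failed connection should not take the service down.
        if let Err(err) = stream.and_then(handle) {
            eprintln!("connection error: {}", err);
        }
    }
    Ok(())
}
```

Calling serve(8000) from main would be enough to have something answering on the expected port; the binary itself is then what gets passed to webservice start.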

Usually done via cargo build --release, but each dependency will need to be compiled from scratch. This turned out to be incredibly slow on Toolforge. I did it on dev.tools.wmflabs.org aka tools-sgebastion-08.

A quick look via top showed that each individual rustc process never hit more than 5% CPU - maybe that's a Toolforge limit? We might want to consider bumping it for rust somehow. I left for dinner and when I came back, it still hadn't finished building.

IRC discussion was that this was caused by NFS limitations. @Bstorm said that the scratch NFS is faster/on newer hardware and suggested trying to build there

Is there a movement towards Rust in any significant segment of the Wikimedia movement?

I know of a few developers currently (Magnus, Sigma and Enterprisey). Personally I'm just trying to learn Rust. I would expect it would see more usage than the current golang (2 tools) and jdk8 (3 tools) images.

Are there clear benefits for Toolforge tools to be written and maintained in Rust?

Just the normal reasons people seem to really like Rust.

I ask these questions not to be snarky or mean, but from a place of genuine concern for our collective ability to support yet another language runtime and associated tool chain in the Toolforge project. Each runtime we add has a long term support cost. Most users probably do not see that cost, but having spent the last 4 years supporting Toolforge and Cloud VPS as my full time job I certainly have. Things like this may become easier at some point in the future when Toolforge has support for build packs or some similar system for creating custom Docker images. Until then, I do not see a strong benefit for tool maintainers in adding official support of Rust that outweighs the costs of providing that support.

I agree and don't think we're ready yet for Rust either. There are multiple show-stopper bugs from my POV, such as the NFS limitations. My timeline is implementing this in a year or two if I'm the primary person working on this and no new sudden need emerges.

If someone wants to go outside the officially supported language runtimes, we will not do anything to stop you. But you will be on your own primarily for figuring out problems and working around them. By reading the source code of the webservice command and its related Python packages you can figure out which Kubernetes containers use the "GenericWebService" runner class at startup. GenericWebService is a runner that leaves everything up to the extra arguments passed to webservice start (or stored in $HOME/service.template). Using this one could either directly or through a small helper script execute the cargo run --release described by @Legoktm. The expected port will always be 8000 for Kubernetes webservices.

I had some similar ideas at least for prototyping. Going forward though, I'd like to see if we can have a Rust image and webservice support available, but not "officially supported" by the Toolforge admins/WMCS team. The main benefits being that we can track usage through the k8s-images dashboard and it allows for experimentation/prototyping. Of course, I realize it's not actually possible to have an image that adds 0 extra work for everyone, which might make it not feasible.

Is there a movement towards Rust in any significant segment of the Wikimedia movement? Are there clear benefits for Toolforge tools to be written and maintained in Rust?

Speaking just for myself, I have some Rust tools in production:

  • PetScan
  • QuickStatements (the "run in background" bot doing the major gruntwork for now, but eventually all of the server-side code)
  • SourceMD (currently deactivated, needs some work, time- rather than tech-constrained)

I also wrote a Rust MediaWiki API crate (https://github.com/magnusmanske/mediawiki_rust ) and co-maintain a Wikibase crate (https://gitlab.com/tobias47n9e/wikibase_rs ).
That's both "backbone-level" tools, and Rust infrastructure to make it easier for others to interact with MediaWiki/Wikimedia installations.

Advantages of Rust:

  • speed
  • memory consumption
  • (memory) safety
  • MediaWiki API/SPARQL wrappers available and maintained

I ask these questions not to be snarky or mean, but from a place of genuine concern for our collective ability to support yet another language runtime and associated tool chain in the Toolforge project. Each runtime we add has a long term support cost. Most users probably do not see that cost, but having spent the last 4 years supporting Toolforge and Cloud VPS as my full time job I certainly have. Things like this may become easier at some point in the future when Toolforge has support for build packs or some similar system for creating custom Docker images. Until then, I do not see a strong benefit for tool maintainers in adding official support of Rust that outweighs the costs of providing that support.

I don't think special (as in, big time investment) support would be necessary. Two things that would help come to mind:

  • A paragraph in the docs (https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web ), to sum up what's required to get a (web-facing) tool up-and-running.
  • A way to get bearable compilation times on Toolforge. I read somewhere that the toolforge-dev servers are better for that kind of thing, but haven't tried. Document that way in the docs as above.

If someone wants to go outside the officially supported language runtimes, we will not do anything to stop you. But you will be on your own primarily for figuring out problems and working around them. By reading the source code of the webservice command and its related Python packages you can figure out which Kubernetes containers use the "GenericWebService" runner class at startup. GenericWebService is a runner that leaves everything up to the extra arguments passed to webservice start (or stored in $HOME/service.template). Using this one could either directly or through a small helper script execute the cargo run --release described by @Legoktm. The expected port will always be 8000 for Kubernetes webservices.

That's what I mean. Instead of "read our source code and guess what's what", one paragraph in the docs would suffice. (Yes, I saw Other / generic web servers).

I started https://wikitech.wikimedia.org/wiki/User:Legoktm/Rust_on_Toolforge - please add/edit or let me know if anything you'd like to know is missing. We can link to it from the main Toolforge/Web documentation.

<snip>

  • A way to get bearable compilation times on Toolforge. I read somewhere that the toolforge-dev servers are better for that kind of thing, but haven't tried. Document that way in the docs as above.

My attempt to compile on dev.toolforge.org was just as slow (I gave up after an hour). Putting everything on /dev/scratch, which uses a different set of NFS servers AIUI, didn't seem to help much either.

Is there any reason not to run a compile via the grid with jsub? The reason the compile is slow is the cgroups. Once you are using a certain amount of CPU and RAM, you get niced and throttled by them. Grid nodes are up to spec for that kind of job as long as it requests a large amount of RAM, and I seriously doubt a compile in rust-land is IO constrained or would be problematic for NFS. I could test the theory by running a build on a non-bastion node if anyone has a good example, but I'm pretty sure I will not be surprised if I do. Once it's built, then you'll hit network and IO bottlenecks during the actual running of the app, but there are still benefits in processing power and memory safety I could imagine.

I know when I'm messing with go, I never compile inside Toolforge directly. I just rsync things after cross-compiling on my Mac. I haven't tried that with rust. The best guide I am aware of is: https://wiki.archlinux.org/index.php/Rust#Cross_compiling . I imagine that would be the best option for everyone in Toolforge. LLVM is a monster.

Is there a movement towards Rust in any significant segment of the Wikimedia movement?

I'm currently using Rust in the Templatehoard tool, both to generate dumps of template instances and to serve the non-static website. The template dumps as well as indices of the language headers found in each entry (currently generated in the Templatehoard home directory) are in turn used by the Digero tool.

On my own computer, I use another subcommand in the template-dumping program that creates various lists of headers used in English Wiktionary entries, and another Rust program that allows me to process the XML page dumps using Lua scripts. I had written the template, header, and Lua script programs in C, but it was painful to add new features and since I couldn't find a wikitext parser written in C, the template dumping was not completely syntactically accurate and was not able to dump the template parameters to make it easier for other scripts to process the template instances. So I switched over to Rust, which has a wikitext parser (parse_wiki_text), is easier to write and debug, and is in the same range of speed as C.

At the moment the Templatehoard server is served using webservice --backend=gridengine generic. But I can try webservice golang111, as recommended on @Legoktm's page, the next time I restart it.

Regarding compile times, I just compile the Rust programs using jsub and it takes longer than on my computer, but I think never more than an hour. Once or twice I tried compiling directly in the terminal, but aborted it. It doesn't seem a good idea because cargo usually takes a lot of CPU and a decent amount of memory. On my computer compiling the programs can almost saturate all 4 CPUs when it's compiling multiple libraries at once (though this is configurable) and still takes several minutes when starting from a clean slate.

I don't really understand Kubernetes, but it would be convenient if we could build and run Rust programs there: invoking cargo run in the Rust project directory there, rather than invoking cargo build in the Grid Engine and then sending the path to the resulting binary to webservice. But I don't know if this makes sense.

Regarding the installation directory of Rust programs, I'd propose using ~/.cargo/bin rather than ~/rust because that's where cargo install normally puts a binary.

Is there a movement towards Rust in any significant segment of the Wikimedia movement?

In Toolforge, we currently have at least four distinct tool maintainers (7 tools) that have at least attempted to run rust in one or more tool folders (based on the existence of the .cargo directory). Besides that I don't think the internal study group has deployed anything yet, so the Foundation itself is probably not running it yet. That might place it as more popular than ruby and perhaps less popular than java? I might have that backwards.

(Just lurking here ^_^)

I know when I'm messing with go, I never compile inside Toolforge directly. I just rsync things after cross-compiling on my Mac. I haven't tried that with rust.

Just thinking aloud − could it make sense to have the compilation done by Jenkins as a post-merge step? (Although per T157893 I understand that at the moment the build nodes can’t communicate to the Toolforge environment so that’s probably moot)

If you are clever enough with CI, you could probably make something like travis do that as well, which can access things when set up right. Then compiling doesn't bother Toolforge a bit :)

Is there any reason not to run a compile via the grid with jsub?

Thanks, that worked pretty well when I tried it, it was only 2x slower than my laptop, which is fast enough IMO all things considered. I updated https://wikitech.wikimedia.org/wiki/User:Legoktm/Rust_on_Toolforge with a compiling section.

I don't really understand Kubernetes, but it would be convenient if we could build and run Rust programs there: invoking cargo run in the Rust project directory there, rather than invoking cargo build in the Grid Engine and then sending the path to the resulting binary to webservice. But I don't know if this makes sense.

That was my original idea (I proposed it above) but now I'm less convinced. Using cargo run means we need to include cargo/rustc/etc. in the container. Also, because it could take minutes for compilation to finish, your webservice would be down in the meantime.

My current idea is a two stage process in which we first build and then deploy. Here are some imaginary commands:

webservice rust build
> mv .cargo/bin/$TOOL .cargo/bin/$TOOL.bak # to support quick rollback
> cargo install --path ~/www/rust --locked # via grid or k8s or whatever is fast/recommended

webservice rust restart
> ./.cargo/bin/$TOOL

Maybe add some syntactic sugar like webservice rust restart --build that combines it all into one command. I think we want to make it straightforward for people to deploy locally built binaries but still provide a convenient workflow for people who want to build on Toolforge.
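To make the build-then-deploy idea a bit more concrete, here is a rough sketch of the rollback-friendly part in Rust itself. Everything here is hypothetical: stash_current, rollback and install_command are made-up names, and the cargo invocation simply mirrors the imaginary commands above. It only shows the shape of the workflow (move the old binary aside, build a new one, restore the old one if needed), not how webservice would actually do it.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Path of the backup kept next to the binary, e.g. `tool` -> `tool.bak`.
fn backup_path(bin: &Path) -> PathBuf {
    let mut name = bin.as_os_str().to_os_string();
    name.push(".bak");
    PathBuf::from(name)
}

/// Move the current binary aside (if there is one) to allow a quick rollback.
/// Returns true when a backup was made.
fn stash_current(bin: &Path) -> io::Result<bool> {
    if bin.exists() {
        fs::rename(bin, backup_path(bin))?;
        Ok(true)
    } else {
        Ok(false)
    }
}

/// Restore the previous binary from its backup.
fn rollback(bin: &Path) -> io::Result<()> {
    fs::rename(backup_path(bin), bin)
}

/// Prepare (but do not run) `cargo install --path <project> --locked`.
fn install_command(project: &Path) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("install").arg("--path").arg(project).arg("--locked");
    cmd
}
```

A build step would call stash_current, then run install_command(...).status(), and call rollback if the build fails or the new binary misbehaves. Keeping the command construction separate from running it also leaves room for submitting the build to the grid or k8s instead of running it inline.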

(Sidenote: I'm not a go person, but would that make sense for it as well? I feel like the current webservice setup is really geared toward interpreted languages, not compiled ones.)

Regarding the installation directory of Rust programs, I'd propose using ~/.cargo/bin rather than ~/rust because that's where cargo install normally puts a binary.

Good point, updated my documentation.


I'm going to put out a call for comments on cloud-l for people who are using/tried using Rust to share their experience/pain points, as well as document what tools people have written in Rust on-wiki.

To be honest, I don't really understand why we need special docker images for Rust (or any compiled language for that matter). Cross-compiling works, per-user installation of rustup + jsub seems to work fine as well.

As for the benefits of Rust: I am using it right now for regex-searching through article texts of a dewiki dump file at 500 megabytes per second on my laptop and will most likely use it for most of my tools instead of Python in the future.

To be honest, I don't really understand why we need special docker images for Rust (or any compiled language for that matter). Cross-compiling works, per-user installation of rustup + jsub seems to work fine as well.

For the grid, that works just fine. But for kubernetes webservices, it does need to run in a docker image.

Also eventually I think we should look into providing a system-wide installation of rust/cargo so we don't have multi GB ~/.rustup directories in a bunch of tools with basically all the same content (of course, nothing will prevent users from using rustup if they have special needs).

I got around to this finally, see my announcement and the documentation.


At this point I think I'd like to finish the rust-hello-world tool I started last year, move https://wikitech.wikimedia.org/wiki/User:Legoktm/Rust_on_Toolforge to Help:Toolforge/Rust and link it in a few places, and then I think we can call this ticket resolved. Or are there other things people would like to see done first?

These likely need cross-linking via some other Toolforge pages.

The main thing I'm not satisfied with is the use of the golang111 (or any other generic webservice) to run these. Now that we have a toolforge-bullseye-std image for the new toolforge-jobs service, could we have a standalone webservice container? IIRC (from months ago), we would either need Python inside the container, or reimplement webservice-runner in Go or Rust so it could be in the container with minimal footprint.

IIRC (from months ago), we would either need Python inside the container, or reimplement webservice-runner in Go or Rust so it could be in the container with minimal footprint.

Yeah, or edit webservice somehow to not require it. I created T293552 some time ago to get it done.