
[minIO] Investigate packaging, install, security monitoring.
Open, Medium, Public

Description

MinIO distributes source code, prepackaged .deb files, etc. They do not have an apt repository. The packages are hosted on an HTTPS site alongside SHA-256 checksum files. The site is: https://dl.min.io/server/minio/release/linux-amd64

One simple option is to write a script to fetch the .deb, verify the checksum, and add the .deb to our local repository. A prototype for that is already done.
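For illustration only, here is a minimal sketch of the fetch-and-verify step. It is not the prototype mentioned above. It assumes the checksum file uses the usual `sha256sum` layout (hex digest, then the file name); the function names and the URLs passed in are hypothetical.

```python
import hashlib
import urllib.request


def parse_sha256_file(text: str) -> str:
    """Extract the hex digest from a sha256sum-style line: '<hex>  <filename>'."""
    parts = text.split()
    if not parts:
        raise ValueError("empty checksum file")
    digest = parts[0].lower()
    if len(digest) != 64 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("not a SHA-256 hex digest")
    return digest


def verify_deb(deb_bytes: bytes, checksum_text: str) -> bool:
    """Return True only if the package bytes match the published digest."""
    expected = parse_sha256_file(checksum_text)
    actual = hashlib.sha256(deb_bytes).hexdigest()
    return actual == expected


def fetch(url: str) -> bytes:
    """Download a URL over HTTPS and return the raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_verified_deb(deb_url: str, checksum_url: str) -> bytes:
    """Fetch the .deb and its checksum file; refuse to return unverified bytes."""
    deb_bytes = fetch(deb_url)
    checksum_text = fetch(checksum_url).decode("ascii")
    if not verify_deb(deb_bytes, checksum_text):
        raise RuntimeError(f"checksum mismatch for {deb_url}")
    return deb_bytes
```

Note that a checksum fetched from the same site as the package only guards against corrupted downloads, not against a compromised host.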

Next step is to investigate how we monitor for security issues.

Event Timeline

More on vulnerability tracking. This isn't awesome but:

From https://dl.min.io/server/minio/release/linux-amd64 we can scrape the git tag of the latest package. We're scraping this already to fetch the deb for our internal apt repo. From https://api.github.com/repos/minio/minio/releases/tags/{tag} we can retrieve a JSON blob of release notes. Looking at prior release notes, in some cases they set a different "name" when it's a security update; in other cases they don't, although the body mentions fixes for CVEs. So I guess we could notify by email for keywords like "cve" or "security" in name or body, so we'd have a heads up when a security update becomes available.
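A rough sketch of the keyword check described above, assuming the release JSON has the `name` and `body` fields of the GitHub releases API. The function names are hypothetical, the match is a plain case-insensitive substring search, and the email step is left out.

```python
import json
import re
import urllib.request

# URL template taken from the comment above; {tag} is the scraped git tag.
RELEASE_URL = "https://api.github.com/repos/minio/minio/releases/tags/{tag}"

# Substring match so that "CVEs" and "CVE-..." both count as hits.
KEYWORDS = re.compile(r"cve|security", re.IGNORECASE)


def fetch_release(tag: str) -> dict:
    """Retrieve the release-notes JSON for one tag."""
    req = urllib.request.Request(
        RELEASE_URL.format(tag=tag),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def security_hits(release: dict) -> list[str]:
    """Return which of 'name' and 'body' mention a security keyword."""
    hits = []
    for field in ("name", "body"):
        text = release.get(field) or ""  # either field may be missing or null
        if KEYWORDS.search(text):
            hits.append(field)
    return hits
```

A non-empty result from `security_hits(fetch_release(tag))` would be the trigger for a notification. Substring matching will produce some false positives (for example "insecurity"), which seems acceptable for a heads-up.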

I feel like I'm missing something obvious...

There's a script in the internal frack "packages" repository that fetches the package and reports the portion of changelog associated with the latest package. The deb package is then added to the frack internal repository using "reprepro includedeb".
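As a sketch of the last step only (not the actual script), the `reprepro includedeb` call could be wrapped like this. The base directory, codename, and file name are placeholders, not the real frack values.

```python
import subprocess


def includedeb_command(basedir: str, codename: str, deb_path: str) -> list[str]:
    """Build the reprepro command line; -b selects the repository base directory."""
    return ["reprepro", "-b", basedir, "includedeb", codename, deb_path]


def add_to_repo(basedir: str, codename: str, deb_path: str) -> None:
    """Add the .deb to the local repository, raising if reprepro fails."""
    subprocess.run(includedeb_command(basedir, codename, deb_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the package path.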

This script could be used to automate alerting for security-related changes, but it's not clear yet whether that's the right approach.

Changelog review, fetching, and packaging have been automated in a new FR admin tool called package_getter.

Still to do: vulnerability and update tracking.

Jgreen renamed this task from "Investigate minio packaging, install, security monitoring." to "Investigate MinIO packaging, install, security monitoring." (Oct 4 2024, 2:07 PM)
Jgreen triaged this task as Medium priority.
Jgreen added a project: fundraising-tech-ops.
Jgreen moved this task from Triage to In Progress on the fundraising-tech-ops board.
Jgreen renamed this task from "Investigate MinIO packaging, install, security monitoring." to "[minIO] Investigate packaging, install, security monitoring." (Oct 18 2024, 7:26 PM)

Initial host build completed in T377641. Still needs minio software install and iptables/pfw config completed.

Hi @Jgreen @Dwisehaupt - I've been reading along with this project a little and I think it's really exciting, but I have a question for you. Hope that's OK.

Have you considered using Ceph in place of minIO for the S3 storage layer? We have considerable experience now within the Foundation in managing Ceph clusters of various shapes and sizes, so this might be an opportunity for us to collaborate and share this experience, rather than bring in another new product.

I note that the architectural proposal you received stated:

For storage, we recommend minIO, an open-source object storage technology that is API-compatible with AWS S3.
This API compatibility makes minIO a drop-in replacement for S3, and it is commonly used in on-premise environments that need an object storage solution.

Well, similarly, the Ceph Object Gateway is an S3 compatible, open-source storage technology.
Ceph as a project also has a thriving user community and broad adoption within the industry.

WMF already has both Debian packages and container images of the latest stable version (codename: reef), which we are already using.
One of the clusters uses the new container-based tooling called cephadm to deploy the services, along with podman as a container runtime.
This approach might offer you a relatively easy way to try things out.

Here is a guide on how to run a Ceph cluster on a single server, in case you wanted to try this out.
Although, for a production service, I would always suggest starting with three servers so that you can aim for high-availability.

We have also started a Ceph SIG (special interest group) recently, which might be another channel for collaboration on this.

If you would like to learn a little more about the DPE Ceph cluster, feel free to look through this presentation I put together recently:

We have recently enabled the S3 interface on this cluster, so we can demonstrate how it works, if you are interested.
I'm also happy to try to answer any questions that you might have.

On the other hand, if you're happy to proceed with minIO that's fine too. I just thought it might be useful to discuss the option of collaborating on this part.

Initial puppet manifest and templates made for minio and deployed. Verified it is seeing the 3 hosts and 9 drives. Just rough manifests and configs for now, we'll definitely want to add more as we move along but we are in a state where we can test.

minio cli packaged and installed via puppet on the analytics_io hosts.