Migrate extdist to production and a better architecture
Open, Needs Triage, Public

Description

Splitting from T277249: Migrate extdist.wmflabs.org to Debian Buster because that task is going to be declined.

And maybe move it to production? (A Ganeti VM might be an easier start.)

It looks like we used to use Gerrit/Gitweb to generate the tarballs. When Gitweb was replaced by Gitiles, the easy-to-abuse tar feature was no longer exposed, so we (temporarily) switched to GitHub until T70122 in 2014, when we moved to a wmflabs.org VM. But given that we link to these tarballs from mediawiki.org and that people are expected to execute code based on them, it doesn't seem great to run in wmflabs long-term. Proxying would avoid the PII issue but not the contents issue.

What would it take to run this in prod instead?

Short answer:

  • figuring out whether it's acceptable to run composer in a prod VM
  • figuring out whether we're OK cloning submodules from GitHub (or any other Git host I suppose), see comments starting with T143969#2647761

Long answer:

ExtensionDistributor's key functions are: it includes submodules (necessary primarily for VisualEditor, but also for some others), it installs composer dependencies, and it generates gitinfo.json files for Special:Version.
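
As a rough illustration of those three steps, here is a minimal Python sketch. It is not the actual extdist code: the function names, the repository URL in the usage note, and the exact flag choices are assumptions for illustration, and the gitinfo.json key names are from memory of MediaWiki's GitInfo class and should be checked against it before relying on them.

```python
import json
from pathlib import Path


def build_commands(repo_url: str, branch: str, dest: Path,
                   has_composer_json: bool) -> list[list[str]]:
    """Return the argv lists a tarball builder might run, in order.

    Hypothetical helper: it only builds the command lines, it does not run them.
    """
    cmds = [
        # Clone the requested branch together with its submodules
        # (needed for extensions such as VisualEditor).
        ["git", "clone", "--branch", branch, "--recurse-submodules",
         repo_url, str(dest)],
    ]
    if has_composer_json:
        # Install runtime composer dependencies only, without prompting.
        cmds.append(["composer", "install", "--no-dev", "--no-interaction",
                     "--working-dir", str(dest)])
    return cmds


def gitinfo_json(head_sha: str, commit_date: int, branch: str,
                 remote_url: str) -> str:
    """Serialize Git metadata for Special:Version.

    The key names below are an assumption about the gitinfo.json layout.
    """
    return json.dumps({
        "head": head_sha,
        "headSHA1": head_sha,
        "headCommitDate": commit_date,
        "branch": branch,
        "remoteURL": remote_url,
    })
```

For example, `build_commands("https://example.org/r/VisualEditor", "REL1_39", Path("/tmp/VisualEditor"), True)` would yield one `git clone` command followed by one `composer install` command. Both of the open questions in the short answer above correspond to a line in this sketch: the submodule clone may reach out to a third-party Git host, and `composer install` downloads and unpacks third-party code.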

The architecture is pretty bad: both the MW extension and extdist.wmflabs.org independently talk to Gerrit, so when a new commit is pushed, the two can get out of sync. We also don't provide long-term stable URLs for curl ..., which people regularly ask for.

I wrote up a better proposal with Yuvi's help in Feb 2015 at https://www.mediawiki.org/wiki/Extension:ExtensionDistributor/tardist but it stalled for reasons I don't remember. Today, if we wanted to deploy an API like this, it would go in k8s, except k8s doesn't have proper disk storage, so we'd use Swift instead, and it just becomes a much more complicated thing than I have time to develop/maintain.

If the two bullets are an acceptable risk, then I'd be down to move the current setup into a Ganeti VM; it's already all puppetized anyway.

Event Timeline

There are basically two ways forward:

  • Moving current setup to Ganeti: needs OK from security team and SRE that we can run composer in prod and can clone from arbitrary Git URLs
  • Moving to k8s as a proper service: needs a plan, review by SRE (serviceops probably) and then someone to actually write the code