Connect Phabricator to swift for storage of git-lfs and file uploads.
Closed, Declined (Public)

Description

Phabricator currently supports the git-lfs service; however, to set it up properly we need a storage back-end that is more scalable than MySQL tables or local disk on the phabricator nodes.

Since phabricator has a storage engine for Amazon S3, and swift has an API similar to that of S3, the most obvious way forward for implementing lfs is to connect it to swift.

So I would like to request credentials for swift to be used for phabricator's lfs storage backend.
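For illustration, here is a minimal sketch of what pointing phabricator's S3 file engine at an S3-compatible endpoint might look like. The option names follow phabricator's documented S3 storage settings as far as I know, so please check them against the installed version. The helper function, hostname, region, bucket name and credentials are all hypothetical placeholders, and this assumes the swift cluster has its S3 compatibility layer enabled.

```python
import json


def build_s3_style_storage_config(endpoint, region, bucket, access_key, secret_key):
    """Return phabricator config settings for the S3 file storage engine.

    Hypothetical helper: it only assembles a settings dictionary and
    does not talk to phabricator or to swift.
    """
    return {
        "storage.s3.bucket": bucket,         # bucket (swift container) for file data
        "amazon-s3.endpoint": endpoint,      # S3-compatible endpoint hostname
        "amazon-s3.region": region,          # region name used for request signing
        "amazon-s3.access-key": access_key,  # the credentials requested in this task
        "amazon-s3.secret-key": secret_key,
    }


if __name__ == "__main__":
    # Every value below is a placeholder, not a real host or credential.
    config = build_s3_style_storage_config(
        endpoint="swift.example.org",
        region="us-east-1",
        bucket="phabricator-files",
        access_key="ACCESS_KEY_PLACEHOLDER",
        secret_key="SECRET_KEY_PLACEHOLDER",
    )
    print(json.dumps(config, indent=2))
```

Each printed key/value pair could then be applied with `bin/config set <key> <value>` or kept in the local config file, with the secret managed outside version control.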

Event Timeline

Do you have any numbers wrt how much disk space and requests git-lfs is supposed to take? swift at the moment is "shared" in the sense that we have a single cluster in production, mainly for media storage and I wanted to gauge how much space and resources we might be expecting.

@fgiunchedi I don't have a very clear picture of disk usage given that it will be growing but I would expect that request volume will be pretty low. This is primarily to be used for our python projects which have large wheels repos that need to use lfs instead of plain git. Requests would be limited to developers committing, cloning and deploying. Given that all of our git repos in phab currently total ~35GB I wouldn't expect our lfs storage to exceed that amount at least for quite some time.

Thanks for the context! We have a swift deployment-prep cluster you could use for experiments, provided there's also a phabricator in deployment-prep (?) or in a similar testing place.

We have a phabricator instance in its own project; however, I've never managed to maintain one in deployment-prep. Can we test it across multiple cloud projects?

Good question, I don't know if swift in deployment-prep is reachable now from other projects, worth a try though! An alternative would be to have even a single-machine swift cluster in phab's project.

Ottomata triaged this task as Medium priority. Jan 16 2018, 7:35 PM

Ping. Checking in on this since it's blocking our transition to git-lfs in ORES. Is someone working on this?

mmodell renamed this task from "Requesting access to swift for Phabricator's git-lfs storage" to "Connect Phabricator to swift for storage of git-lfs and file uploads." Feb 26 2018, 11:31 AM
mmodell claimed this task.
mmodell edited projects, added Phabricator; removed SRE.
mmodell raised the priority of this task from Medium to High. Mar 21 2018, 11:26 PM

I finally made some good progress on the code last week. I should have something ready to play with this week.

Change 432528 had a related patch set uploaded (by 20after4; owner: 20after4):
[operations/puppet@production] Configuration for phabricator to use swift storage.

https://gerrit.wikimedia.org/r/432528

I think we're ready to try swift for phabricator in eqiad. I'm not opposed to trying this, though I have a few other questions/considerations:

  • You mentioned more scalable storage than MySQL or local disks; what are the requirements? I take it files are stored locally now on Phab* hosts. Those machines have spinning disks but we can easily replace those with SSDs if more I/O is needed.
  • Replication across datacenters would need to be addressed if the trial is successful, not sure what we're doing now for local files or what Phabricator capabilities are in this regard.

@fgiunchedi:

  • Currently all file uploads get stored in MySQL database tables, not locally on the phab host. There shouldn't be a significant amount of traffic to the phabricator file store - it's mostly used for screenshots and whatnot.
  • Replication across datacenters would indeed be a good thing. We aren't quite multi-datacenter with phabricator yet but that is a goal for the near future.

status update:

As we are no longer pushing for phabricator to become our primary git service, git-lfs support is no longer a high priority. Nonetheless, swift storage for phabricator file uploads is desirable as it's much better than storing files in the database. So I still want to get this done but it's gotten pushed to the back burner for now.

mmodell lowered the priority of this task from High to Medium. Aug 6 2018, 4:21 PM
mmodell changed the task status from Open to Stalled. Jan 16 2019, 10:53 PM

There is a patch to phabricator to make this work; however, the production puppet to deploy it is untested and unfinished. It's not a team priority for Release-Engineering-Team right now and I don't think the acl*sre-team have any current plans to work on this either. So this is indefinitely stalled.

greg subscribed.

So this really isn't "externally blocked" anymore, I guess this could fall under some prioritized list for the "align with SRE best practices" goal we have for the year.

I will need significant assistance from SRE to bring this back to life.

Change 432528 abandoned by 20after4:
Configuration for phabricator to use swift storage.

Reason:
obsolete

https://gerrit.wikimedia.org/r/432528

mmodell changed the task status from Open to Stalled. Dec 5 2019, 10:17 PM
hashar subscribed.

Sounds like that was related to the migration to Differential for code hosting and supporting the ORES repositories there. They are nowadays still hosted on GitHub with LFS enabled there.

There was maybe a suggestion of using it for files uploaded to phab?

Ideally maybe, but it seems storing files in MySQL is still good enough for now. If that eventually ends up being a problem, sure, we could look at migrating them to S3, but I don't think it is at all necessary right now :)