
Support Openstack Swift APIs via the radosgw
Closed, Resolved · Public

Description

The Trove docs assume the presence of an object store (Swift). It looks like Swift is only used for backups, but I also haven't found any clear evidence that Trove can run in a setup without an object store.

This is a good excuse to get Swift up and running, with Trove as its initial user.

Event Timeline

Apparently we can continue to put all our eggs in the Ceph basket for Swift as well: https://keithtenzer.com/2017/03/30/openstack-swift-integration-with-ceph/

That's what I was thinking, although it's a bit questionable to have the primary databases on Ceph-hosted VMs and then 'back them up' to Ceph-backed Swift when it's the same Ceph cluster underneath.

A separate pool is one level of separation, but it's not physical separation. A second Ceph cluster for some uses seems like it would make sense for that; it would keep our skill requirements down without sacrificing redundancy.

I don't think we need to actually run Swift in WMCS; radosgw supports the Swift API.

https://docs.ceph.com/en/latest/radosgw/index.html

And it can consume Keystone for auth:

https://docs.ceph.com/en/latest/radosgw/keystone/

Of course VMs can't actually talk to the Ceph APIs directly, but maybe with haproxy in front that doesn't matter?
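
For a sense of what that would look like from a client's side, here is a minimal sketch of talking to radosgw's Swift-compatible API through Keystone auth with python-swiftclient. The endpoint URL, user, project, and container names are placeholders, not actual cluster values:

```python
# Minimal sketch: use radosgw's Swift-compatible API via Keystone v3 auth.
# Every endpoint/credential value here is a placeholder, not real cluster config.
from swiftclient.client import Connection

conn = Connection(
    authurl="https://keystone.example.wmcloud.org:5000/v3",  # hypothetical Keystone URL
    user="trove-backup",                                     # hypothetical service user
    key="REDACTED",
    os_options={
        "project_name": "trove",
        "user_domain_name": "default",
        "project_domain_name": "default",
    },
    auth_version="3",
)

# Create a container and upload an object through radosgw.
conn.put_container("database-backups")
conn.put_object(
    "database-backups",
    "backup-2021-04-28.tar.gz",
    contents=b"...",
    content_type="application/octet-stream",
)

# List the container to confirm the Swift API round-trips.
_headers, objects = conn.get_container("database-backups")
for obj in objects:
    print(obj["name"], obj["bytes"])
```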

Change 682317 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cloud-vps ceph: initial config for adding radosgw to control nodes

https://gerrit.wikimedia.org/r/682317

Change 682317 merged by Andrew Bogott:

[operations/puppet@production] cloud-vps ceph: initial config for adding radosgw to control nodes

https://gerrit.wikimedia.org/r/682317

Change 682729 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Ceph: Try to re-centralize radosgw keyrings

https://gerrit.wikimedia.org/r/682729

Change 682729 merged by Andrew Bogott:

[operations/puppet@production] Ceph: Try to re-centralize radosgw keyrings

https://gerrit.wikimedia.org/r/682729

Andrew renamed this task from Swift before Trove? to Support Openstack Swift APIs via the radosgw. Apr 28 2021, 4:18 PM
Andrew updated the task description.

We discussed this in our weekly meeting today. Issues to sort out before radosgw goes in eqiad1 are:

  1. Ceph PG audit to make sure we can support the new pools that radosgw needs (rough sizing sketch after this list)
  2. Moving haproxy (and possibly radosgw) onto separate hardware so that the Swift bandwidth doesn't overwhelm the other OpenStack APIs
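
As a rough illustration of the PG audit item, here is the old pgcalc-style rule of thumb (around 100 PGs per OSD across all pools, split by expected data share, divided by replica or k+m size, rounded to a power of two). The OSD count, pool names, and shares below are made-up placeholders, not our real values:

```python
# Rough pgcalc-style sizing sketch: ~100 PGs per OSD cluster-wide, split across
# pools by expected data share, divided by replica (or k+m) size, rounded to a
# power of two. All numbers and pool names below are illustrative placeholders.
import math

def suggested_pg_num(num_osds, pool_share, pool_size, target_per_osd=100):
    raw = (num_osds * target_per_osd * pool_share) / pool_size
    return max(1, 2 ** round(math.log2(raw))) if raw >= 1 else 1

NUM_OSDS = 120  # placeholder, not the real eqiad1 OSD count

# radosgw's data pool dominates; the index/meta pools stay tiny.
pools = {
    "eqiad1.rgw.buckets.data":  (0.20,  3),  # (share of cluster data, replica size)
    "eqiad1.rgw.buckets.index": (0.01,  3),
    "eqiad1.rgw.meta":          (0.001, 3),
    ".rgw.root":                (0.001, 3),
}

for name, (share, size) in pools.items():
    print(f"{name}: pg_num ~ {suggested_pg_num(NUM_OSDS, share, size)}")
```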

I recommend we look at erasure coding the pools for Swift, since I'm not sure I've said this directly in relation to this task yet: https://ceph.io/planet/erasure-coding-in-ceph/

That would give us more storage for the number of OSDs involved. Erasure coding for RBD was relatively new when we deployed Ceph, so we went with the safer replicated setup, but it's quite mature and commonly used in radosgw configurations.
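
To put rough numbers on the "more storage for the same OSDs" point, a quick back-of-the-envelope comparison of usable capacity under 3x replication versus a couple of common EC profiles (the raw capacity figure is a placeholder, and the profiles are examples rather than a recommendation):

```python
# Back-of-the-envelope usable-capacity comparison: 3x replication vs common
# erasure-coding profiles. Pure arithmetic, no Ceph involved; raw TB is made up.
def usable_fraction_replicated(size):
    return 1 / size

def usable_fraction_ec(k, m):
    return k / (k + m)

raw_tb = 1000  # placeholder raw capacity

schemes = {
    "replicated size=3": usable_fraction_replicated(3),
    "EC 4+2":            usable_fraction_ec(4, 2),
    "EC 8+3":            usable_fraction_ec(8, 3),
}

for name, frac in schemes.items():
    print(f"{name}: {frac:.0%} usable -> ~{raw_tb * frac:.0f} TB of {raw_tb} TB raw")
```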

> I recommend we look at erasure coding the pools for Swift, since I'm not sure I've said this directly in relation to this task yet: https://ceph.io/planet/erasure-coding-in-ceph/

I think that's pretty simple -- radosgw creates the pools but you can convert them to erasure after the fact. Do you have an opinion about what ratio we should use?

Nope, I'd probably google around for what "seems good", unfortunately.

aborrero triaged this task as Medium priority. May 11 2021, 4:21 PM
aborrero moved this task from Soon! to Blocked on the cloud-services-team (Kanban) board.

[I was pointed at this task from IRC. I'm new on the Data Persistence team, and I used to do quite a bit of Ceph at the Sanger.]

I think 4+2, 8+2, or 8+3 are relatively commonly used. CERN were using 8+3 for their EC pools last time I heard, and report a 2x CPU and 1.5x RAM overhead [0] compared to replicated storage. AIUI our Ceph cluster is still quite small (in terms of number of hosts), so I'm not sure if we can do this sort of thing without compromising on host as a failure domain?
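
To spell out the failure-domain concern: with host as the failure domain, a k+m EC pool places one chunk per host, so it needs at least k+m hosts (and ideally some headroom to recover after a host failure). A trivial check, with a made-up host count:

```python
# With failure domain = host, a k+m EC pool needs at least k+m hosts just to
# place one chunk per host; more is better for recovery headroom.
profiles = [(4, 2), (8, 2), (8, 3)]
num_hosts = 8  # placeholder; substitute the actual Ceph host count

for k, m in profiles:
    needed = k + m
    verdict = "fits" if num_hosts >= needed else "does NOT fit"
    print(f"EC {k}+{m}: needs >= {needed} hosts, have {num_hosts} -> {verdict}")
```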

We kept meaning to move to EC for some of our pools at my last job, but it never quite made it to the top of the list, and we were using HDDs with NVMe for block.db, so disk capacity was "cheap".

There isn't really much QoS available yet to e.g. stop your RGW service from using all the available IO (there might be some coming in Quincy?); at Sanger we needed haproxy in front of our RGW service anyway, at which point that provides a mechanism for per-IP limits and suchlike. I talked a bit about this in a past Cephalocon presentation, so I probably still have some slides on it somewhere...

[0] https://indico.cern.ch/event/941278/contributions/4104604/attachments/2147359/3619781/ErasureCoding20201120.pdf

Change 962707 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add radosgw apis to eqiad1

https://gerrit.wikimedia.org/r/962707

Change 962709 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[labs/private@master] Add fake radosgw eqiad1 key data

https://gerrit.wikimedia.org/r/962709

Change 962709 merged by Andrew Bogott:

[labs/private@master] Add fake radosgw eqiad1 key data

https://gerrit.wikimedia.org/r/962709

Change 962713 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] eqiad1: add caps for radosgw user

https://gerrit.wikimedia.org/r/962713

Change 962713 merged by Andrew Bogott:

[operations/puppet@production] eqiad1: add caps for radosgw user

https://gerrit.wikimedia.org/r/962713

Change 962707 merged by Andrew Bogott:

[operations/puppet@production] Add radosgw apis to eqiad1

https://gerrit.wikimedia.org/r/962707

Change 962743 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] radosgw: include a few missing pieces for eqiad1

https://gerrit.wikimedia.org/r/962743

Change 962743 merged by Andrew Bogott:

[operations/puppet@production] radosgw: include a few missing pieces for eqiad1

https://gerrit.wikimedia.org/r/962743

Change 962744 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Remove profile::cloudceph::client::rbd_glance

https://gerrit.wikimedia.org/r/962744

Change 962744 merged by Andrew Bogott:

[operations/puppet@production] Remove profile::cloudceph::client::rbd_glance

https://gerrit.wikimedia.org/r/962744

Change 965549 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] ceph radosgw: don't allow the 'reader' role to create/delete objects

https://gerrit.wikimedia.org/r/965549

Change 965549 merged by Andrew Bogott:

[operations/puppet@production] ceph radosgw: don't allow the 'reader' role to create/delete objects

https://gerrit.wikimedia.org/r/965549

Change 965670 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] radosgw: Enforce header name and CSP

https://gerrit.wikimedia.org/r/965670

Change 965670 merged by Majavah:

[operations/puppet@production] radosgw: Enforce header name and CSP

https://gerrit.wikimedia.org/r/965670

taavi claimed this task.