
Test the S3 and swift interfaces of rgw.eqiad.dpe.anycast.wmnet
Closed, Resolved · Public

Description

We have installed a Ceph Storage cluster for the Data-Platform group in T324660

The Ceph Object Gateway has been enabled in T330152: Deploy ceph radosgw processes to data-engineering cluster and the load-balancing configuration for it has been enabled in T330153: Configure Anycast load-balancing ceph radosgw services on the data-engineering cluster

We can now proceed to testing the functionality by creating users and sending some sample API requests.

Users are managed as per the documentation here: https://docs.ceph.com/en/reef/radosgw/admin/#user-management

We have the s3cmd and swift CLI tools already available for testing on the stat servers.

Event Timeline

My first attempt to create a user with radosgw-admin didn't work as expected.

btullis@cephosd1004:~$ sudo radosgw-admin user create --uid=btullis --display-name="Ben Tullis" --email=btullis@wikimedia.org
2024-09-10T13:53:51.512+0000 7feb5e6a2d40  0 period (a36cfe74-11ce-488a-9f1b-03fe5c58b212 does not have zone f2a063b8-decc-4444-bca6-18f4a297bd5e configured
Please run the command on master zone. Performing this operation on non-master zone leads to inconsistent metadata between zones
Are you sure you want to go ahead? (requires --yes-i-really-mean-it)

It looks like there may be an issue with the multizone configuration.

Ah, maybe I didn't set the zonegroup to be master, as per: https://docs.ceph.com/en/latest/radosgw/multisite/#create-a-master-zonegroup
I will quickly review the commands I issued in T330152#10077357 and immediately afterwards.

I created the realm called dpe and set it to be default:

sudo radosgw-admin realm create --rgw-realm=dpe --default

I created a zonegroup called dpe_zg and set it to be default:

sudo radosgw-admin zonegroup create --rgw-zonegroup=dpe_zg --default

I created a zone called eqiad and set it to be master and the default:

sudo radosgw-admin zone create --rgw-zonegroup=dpe_zg --rgw-zone=eqiad --master --default --endpoints=https://rgw.eqiad.dpe.anycast.wmnet

I confirmed that the zonegroup had not been set as master:

btullis@cephosd1004:~$ sudo radosgw-admin zonegroup get
{
    "id": "5705578a-fc34-45fe-ab85-9cbd39e3aff5",
    "name": "dpe_zg",
    "api_name": "dpe_zg",
    "is_master": false,
    "endpoints": [],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "f2a063b8-decc-4444-bca6-18f4a297bd5e",
<snip>
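
This kind of check can also be scripted. A minimal sketch (the JSON shape and values are taken from the output above; in practice you would read the command's stdout, e.g. via subprocess, rather than a hard-coded string):

```python
import json

# Excerpt of the fields returned by `radosgw-admin zonegroup get`,
# using the values shown above.
zonegroup_json = """
{
    "id": "5705578a-fc34-45fe-ab85-9cbd39e3aff5",
    "name": "dpe_zg",
    "api_name": "dpe_zg",
    "is_master": false,
    "master_zone": "f2a063b8-decc-4444-bca6-18f4a297bd5e"
}
"""

zonegroup = json.loads(zonegroup_json)
if not zonegroup["is_master"]:
    print(f"zonegroup {zonegroup['name']} is NOT master")
```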

I executed the following to promote it to a master zonegroup:

btullis@cephosd1004:~$ sudo radosgw-admin zonegroup modify --rgw-zonegroup=dpe_zg --master
{
    "id": "5705578a-fc34-45fe-ab85-9cbd39e3aff5",
    "name": "dpe_zg",
    "api_name": "dpe_zg",
    "is_master": true,
    "endpoints": [],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "f2a063b8-decc-4444-bca6-18f4a297bd5e",
<snip>

...which seemed to work.
I then updated the period, as explained here: https://docs.ceph.com/en/latest/radosgw/multisite/#updating-the-period
I'll copy the full output this time.

btullis@cephosd1004:~$ sudo radosgw-admin period update --commit
2024-09-10T14:23:26.023+0000 7f941e3d3d40  0 period (a36cfe74-11ce-488a-9f1b-03fe5c58b212 does not have zone f2a063b8-decc-4444-bca6-18f4a297bd5e configured
{
    "id": "a7c404c6-bf8f-4f06-9aa4-b381f79cd52a",
    "epoch": 1,
    "predecessor_uuid": "a36cfe74-11ce-488a-9f1b-03fe5c58b212",
    "sync_status": [],
    "period_map": {
        "id": "a7c404c6-bf8f-4f06-9aa4-b381f79cd52a",
        "zonegroups": [
            {
                "id": "5705578a-fc34-45fe-ab85-9cbd39e3aff5",
                "name": "dpe_zg",
                "api_name": "dpe_zg",
                "is_master": true,
                "endpoints": [],
                "hostnames": [],
                "hostnames_s3website": [],
                "master_zone": "f2a063b8-decc-4444-bca6-18f4a297bd5e",
                "zones": [
                    {
                        "id": "f2a063b8-decc-4444-bca6-18f4a297bd5e",
                        "name": "eqiad",
                        "endpoints": [
                            "https://rgw.eqiad.dpe.anycast.wmnet"
                        ],
                        "log_meta": false,
                        "log_data": false,
                        "bucket_index_max_shards": 11,
                        "read_only": false,
                        "tier_type": "",
                        "sync_from_all": true,
                        "sync_from": [],
                        "redirect_zone": "",
                        "supported_features": [
                            "compress-encrypted",
                            "resharding"
                        ]
                    }
                ],
                "placement_targets": [
                    {
                        "name": "default-placement",
                        "tags": [],
                        "storage_classes": [
                            "STANDARD"
                        ]
                    }
                ],
                "default_placement": "default-placement",
                "realm_id": "350a37b5-d907-4b0b-a680-b51cce916b02",
                "sync_policy": {
                    "groups": []
                },
                "enabled_features": [
                    "resharding"
                ]
            }
        ],
        "short_zone_ids": [
            {
                "key": "f2a063b8-decc-4444-bca6-18f4a297bd5e",
                "val": 2156933611
            }
        ]
    },
    "master_zonegroup": "5705578a-fc34-45fe-ab85-9cbd39e3aff5",
    "master_zone": "f2a063b8-decc-4444-bca6-18f4a297bd5e",
    "period_config": {
        "bucket_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        },
        "user_quota": {
            "enabled": false,
            "check_on_raw": false,
            "max_size": -1,
            "max_size_kb": 0,
            "max_objects": -1
        },
        "user_ratelimit": {
            "max_read_ops": 0,
            "max_write_ops": 0,
            "max_read_bytes": 0,
            "max_write_bytes": 0,
            "enabled": false
        },
        "bucket_ratelimit": {
            "max_read_ops": 0,
            "max_write_ops": 0,
            "max_read_bytes": 0,
            "max_write_bytes": 0,
            "enabled": false
        },
        "anonymous_ratelimit": {
            "max_read_ops": 0,
            "max_write_ops": 0,
            "max_read_bytes": 0,
            "max_write_bytes": 0,
            "enabled": false
        }
    },
    "realm_id": "350a37b5-d907-4b0b-a680-b51cce916b02",
    "realm_name": "dpe",
    "realm_epoch": 2
}

Now my create user command worked.

(screenshot attached: image.png, 68 KB)

I configured s3cmd with s3cmd --configure. I added:

  • Access key
  • Secret key
  • Default Region: dpe
  • S3 Endpoint: rgw.eqiad.dpe.anycast.wmnet
  • DNS-style bucket+hostname:port template for accessing a bucket: n
  • Use HTTPS protocol: Yes

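For reference, these answers end up as keys in ~/.s3cfg. A sketch of the relevant section (these are real s3cmd option names, but the exact host_bucket value is an assumption based on answering "n" to the DNS-style template question; credentials elided):

```ini
# Relevant excerpt of ~/.s3cfg after `s3cmd --configure`
access_key = REDACTED
secret_key = REDACTED
bucket_location = dpe
host_base = rgw.eqiad.dpe.anycast.wmnet
host_bucket = rgw.eqiad.dpe.anycast.wmnet
use_https = True
```
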
At the end of the configuration I got:

Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

I could run the following to list buckets and it returned without an error:

btullis@stat1008:~$ s3cmd ls
btullis@stat1008:~$

However, the command to make a bucket did not work:

btullis@stat1008:~$ s3cmd mb s3://test
ERROR: S3 error: 400 (InvalidLocationConstraint): The specified location-constraint is not valid

With a little research, I found that I was able to make a bucket with the slightly modified command:

btullis@stat1008:~$ s3cmd --bucket-location=":default-placement" mb s3://test
Bucket 's3://test/' created

I was then able to list the bucket with:

btullis@stat1008:~$ s3cmd ls
2024-09-10 15:40  s3://test

So I think that I need to set this default placement target correctly in the zonegroup.

After the creation of the bucket, I could see that we have a new pool on the cluster, called eqiad.rgw.buckets.index

btullis@cephosd1004:~$ sudo ceph osd pool ls
.mgr
rbd-metadata-ssd
rbd-metadata-hdd
rbd-data-ssd
rbd-data-hdd
dse-k8s-csi-ssd
.rgw.root
eqiad.rgw.log
eqiad.rgw.control
eqiad.rgw.meta
eqiad.rgw.buckets.index

I was able to upload a text file to the bucket.

btullis@stat1008:~$ echo -e "Mary had a little lamb\nits fleece was white as snow." > mary.txt
btullis@stat1008:~$ s3cmd put mary.txt s3://test
upload: 'mary.txt' -> 's3://test/mary.txt'  [1 of 1]
 53 of 53   100% in    2s    18.38 B/s  done

After uploading this small file, we could see that there was another pool created: eqiad.rgw.buckets.data

btullis@cephosd1004:~$ sudo ceph osd pool ls
.mgr
rbd-metadata-ssd
rbd-metadata-hdd
rbd-data-ssd
rbd-data-hdd
dse-k8s-csi-ssd
.rgw.root
eqiad.rgw.log
eqiad.rgw.control
eqiad.rgw.meta
eqiad.rgw.buckets.index
eqiad.rgw.buckets.data

I created a swift subuser for my account with: sudo radosgw-admin subuser create --uid=btullis --subuser=btullis:swift --access=full
This was the output.

(screenshot attached: image.png, 83 KB)

I was able to list the containers with the swift CLI and show the contents of the test container.

swift -A https://rgw.eqiad.dpe.anycast.wmnet/auth/1.0 -U btullis:swift -K 'REDACTED' --os-storage-url=https://rgw.eqiad.dpe.anycast.wmnet/swift/v1 list test
mary.txt

I could make it a little simpler with:

export ST_AUTH=https://rgw.eqiad.dpe.anycast.wmnet/auth/1.0
export ST_USER=btullis:swift
export ST_KEY=REDACTED
swift --os-storage-url=https://rgw.eqiad.dpe.anycast.wmnet/swift/v1 download test mary.txt -o -
Mary had a little lamb
its fleece was white as snow.

Interestingly, I still had to add the --os-storage-url=https://rgw.eqiad.dpe.anycast.wmnet/swift/v1 option; this is because the radosgw process does not know about the Envoy proxy in front of it.

When I ran it in debug mode without that option, I could see that the x-storage-url header was pointing to the HTTP service.

DEBUG:swiftclient:RESP HEADERS: {'x-storage-url': 'http://rgw.eqiad.dpe.anycast.wmnet/swift/v1'

It might be more efficient for us to take Envoy out of the picture and use the built-in HTTPS support of Beast:
https://docs.ceph.com/en/reef/radosgw/frontends/#options

However, I don't think that's urgent for now. We know how to specify the option, and we have no immediate plans to use the swift interface anyway.

I would like to know why the make-bucket (mb) command from s3cmd required the --bucket-location=":default-placement" option.

I have configured all of the radosgw-related pools to use the SSDs, except eqiad.rgw.buckets.data

btullis@cephosd1004:~$ sudo ceph osd pool set .rgw.root crush_rule ssd
set pool 8 crush_rule to ssd

btullis@cephosd1004:~$ sudo ceph osd pool set eqiad.rgw.log crush_rule ssd
set pool 9 crush_rule to ssd

btullis@cephosd1004:~$ sudo ceph osd pool set eqiad.rgw.control crush_rule ssd
set pool 10 crush_rule to ssd

btullis@cephosd1004:~$ sudo ceph osd pool set eqiad.rgw.meta crush_rule ssd
set pool 11 crush_rule to ssd

btullis@cephosd1004:~$ sudo ceph osd pool set eqiad.rgw.buckets.index crush_rule ssd
set pool 12 crush_rule to ssd

This was a configuration decision discussed with @MatthewVernon here: https://phabricator.wikimedia.org/T326945#9041446

The reason for this is that we want all of the metadata about the objects stored to be on the fastest and most reliable storage devices (i.e. the SSDs), but the default placement of the data itself should be on the HDDs.

We can add different rules for custom placement of objects later on, if we feel that we need to be able to provide S3 buckets on SSD.

So for now the only things we expect to be on the HDDs are the contents of the S3 buckets.

btullis@cephosd1004:~$ for p in $(sudo ceph osd pool ls) ; do echo $p; sudo ceph osd pool get $p crush_rule ; echo "---" ; done
.mgr
crush_rule: ssd
---
dse-k8s-csi-ssd
crush_rule: ssd
---
.rgw.root
crush_rule: ssd
---
eqiad.rgw.log
crush_rule: ssd
---
eqiad.rgw.control
crush_rule: ssd
---
eqiad.rgw.meta
crush_rule: ssd
---
eqiad.rgw.buckets.index
crush_rule: ssd
---
eqiad.rgw.buckets.data
crush_rule: hdd
---

I have now fixed the issue that required the --bucket-location=":default-placement" workaround when creating buckets with s3cmd.

btullis@stat1008:~$ s3cmd mb s3://test2
Bucket 's3://test2/' created
btullis@stat1008:~$

The issue seems to have been that our zonegroup was called dpe_zg instead of dpe. The original decision to name it that was made in T330152#10077166

When I ran the command: s3cmd -d mb s3://test2 I noticed the following in the output.

DEBUG: bucket_location: <CreateBucketConfiguration><LocationConstraint>dpe</LocationConstraint></CreateBucketConfiguration>

I think that it can only have been getting this from the hostname component, because it hadn't made any HTTPS calls to the service by this time.

When I ran the command with the workaround: s3cmd -d --bucket-location=":default-placement" mb s3://test2 I saw that this was copied into the same constraint.

DEBUG: bucket_location: <CreateBucketConfiguration><LocationConstraint>:default-placement</LocationConstraint></CreateBucketConfiguration>
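
For illustration, the request body that s3cmd is printing here can be reproduced with a few lines of Python (a sketch of what an S3 client sends, not s3cmd's actual code):

```python
def create_bucket_configuration(bucket_location: str) -> str:
    """Build the XML body that S3 clients send with a create-bucket
    request when a location constraint is configured."""
    return (
        "<CreateBucketConfiguration>"
        f"<LocationConstraint>{bucket_location}</LocationConstraint>"
        "</CreateBucketConfiguration>"
    )

# radosgw accepts the request when the constraint matches the
# zonegroup's api_name (or names a placement target after a colon).
print(create_bucket_configuration("dpe"))
```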

I checked the documentation here: https://docs.ceph.com/en/latest/radosgw/placement/#s3-bucket-placement

Normally, the LocationConstraint must match the zonegroup’s api_name:

I decided that it would be fine to rename our zonegroup to try this out. I issued the command:

sudo radosgw-admin zonegroup rename --rgw-zonegroup=dpe_zg --zonegroup-new-name=dpe

However, that didn't change the api_name field.

btullis@cephosd1001:~$ sudo radosgw-admin zonegroup get
{
    "id": "5705578a-fc34-45fe-ab85-9cbd39e3aff5",
    "name": "dpe",
    "api_name": "dpe_zg",
    "is_master": true,
    "endpoints": [],
    "hostnames": [],
<snip>

So I saved the JSON to a local file, edited it, then updated the settings with:

btullis@cephosd1001:~$ sudo radosgw-admin zonegroup set --infile dpe_zonegroup.json
{
    "id": "5705578a-fc34-45fe-ab85-9cbd39e3aff5",
    "name": "dpe",
    "api_name": "dpe",
    "is_master": true,
    "endpoints": [],
    "hostnames": [],
<snip>

I then updated the period with:

sudo radosgw-admin period update --commit

And restarted all five radosgw services (although I'm not certain that this was necessary).

Then my test started working.

When I re-ran the command s3cmd -d mb s3://test2 I saw the same value in the output.

DEBUG: bucket_location: <CreateBucketConfiguration><LocationConstraint>dpe</LocationConstraint></CreateBucketConfiguration>

Oh, in fact it was simpler than that: I had the following line in my .s3cfg file, which I must have set during the initial configuration.

bucket_location = dpe

Oh well, I'm happy with the renamed zonegroup anyway.

I think we can call this done.