After a change to the autoscale setting, the cluster started adapting to a new pg_num and reporting slow operations on osd.44.
The cluster stabilized at HEALTH_WARN, with some PGs unable to get allocated and osd.44 misbehaving.
I tried restarting the osd.44 service on cloudcephosd1005 and ended up with the service down due to:
● ceph-osd@44.service - Ceph object storage daemon osd.44
Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: enabled)
Active: active (running) since Wed 2020-11-25 08:37:24 UTC; 5min ago
Process: 7686 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id 44 (code=exited, status=0/SUCCESS)
Main PID: 7690 (ceph-osd)
Tasks: 59
Memory: 1.7G
CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@44.service
└─7690 /usr/bin/ceph-osd -f --cluster ceph --id 44 --setuser ceph --setgroup ceph
Nov 25 08:37:24 cloudcephosd1005 systemd[1]: Starting Ceph object storage daemon osd.44...
Nov 25 08:37:24 cloudcephosd1005 systemd[1]: Started Ceph object storage daemon osd.44.
Nov 25 08:37:30 cloudcephosd1005 ceph-osd[7690]: 2020-11-25 08:37:30.314 7f56c8a01c80 -1 osd.44 106484 log_to_monitors {default=true}
Nov 25 08:37:30 cloudcephosd1005 ceph-osd[7690]: 2020-11-25 08:37:30.322 7f56c8a01c80 -1 osd.44 106484 mon_cmd_maybe_osd_create fail: 'osd.44 has already bound to class 'ssd', can not reset class to 'hdd'; use 'ceph osd crush rm-device-class <id>' to remove old class first': (16) Device or resource busy

The hdd class does not really exist in the cluster (afaics):
root@cloudcephosd1005:/var/lib/ceph/osd/ceph-44# ceph osd crush class ls
[
"ssd"
]

And the osd.44 is already in the ssd class:
root@cloudcephosd1005:/var/lib/ceph/osd/ceph-44# ceph osd crush get-device-class osd.44
ssd

I tried removing the class and re-adding it for that OSD, with no change:
root@cloudcephosd1005:/var/lib/ceph/osd/ceph-44# ceph osd crush rm-device-class osd.44
done removing class of osd(s): 44
root@cloudcephosd1005:/var/lib/ceph/osd/ceph-44# ceph osd crush set-device-class ssd osd.44
set osd(s) 44 to class 'ssd'
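
For re-checking after each attempt, the class lookup used above could be wrapped in a small shell helper. This is only a sketch: `expect_device_class` is a made-up name (not part of Ceph), and it assumes `ceph osd crush get-device-class` prints just the class name, as in the output above.

```shell
# Sketch: compare an OSD's CRUSH device class with the class we expect.
# expect_device_class is a hypothetical helper, not a Ceph command.
expect_device_class() {
    osd="$1"        # e.g. osd.44
    expected="$2"   # e.g. ssd

    # Ask the cluster which class the OSD is currently bound to;
    # return 2 if the ceph CLI itself fails.
    actual="$(ceph osd crush get-device-class "$osd")" || return 2

    if [ "$actual" = "$expected" ]; then
        echo "$osd: class '$actual' matches"
        return 0
    fi

    echo "$osd: class '$actual', expected '$expected'"
    return 1
}
```

Calling `expect_device_class osd.44 ssd` on the OSD host should report a match for the state shown above. It only reads the CRUSH map, so it would not by itself explain why the daemon tries to reset the class to hdd on start.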