
cloudcephosd1017 /dev/sdg (osd.132) failed
Closed, Resolved · Public

Description

[37746786.414774] scsi 0:0:6:0: rejecting I/O to dead device
[37746786.414775] print_req_error: I/O error, dev sdg, sector 175978944
Mar 01 21:00:00 cloudcephosd1017 systemd[1]: Started Ceph object storage daemon osd.132.
Mar 01 21:00:00 cloudcephosd1017 ceph-osd[2959]: 2024-03-01T21:00:00.362+0000 7f2983904e00  0 set uid:gid to 499:499 (ceph:ceph)
Mar 01 21:00:00 cloudcephosd1017 ceph-osd[2959]: 2024-03-01T21:00:00.362+0000 7f2983904e00  0 ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable), process ceph-osd, pid 2959
Mar 01 21:00:00 cloudcephosd1017 ceph-osd[2959]: 2024-03-01T21:00:00.362+0000 7f2983904e00  0 pidfile_write: ignore empty --pid-file
Mar 01 21:00:00 cloudcephosd1017 ceph-osd[2959]: 2024-03-01T21:00:00.370+0000 7f2983904e00 -1 bluestore(/var/lib/ceph/osd/ceph-132/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-132/block: (5) Input/output error
Mar 01 21:00:00 cloudcephosd1017 ceph-osd[2959]: 2024-03-01T21:00:00.370+0000 7f2983904e00 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-132: (2) No such file or directory
Mar 01 21:00:00 cloudcephosd1017 ceph-osd[2959]: 2024-03-01T21:00:00.370+0000 7f2983904e00 -1 bluestore(/var/lib/ceph/osd/ceph-132/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-132/block: (5) Input/output error
Mar 01 21:00:00 cloudcephosd1017 ceph-osd[2959]: 2024-03-01T21:00:00.370+0000 7f2983904e00 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-132: (2) No such file or directory
Mar 01 21:00:00 cloudcephosd1017 systemd[1]: ceph-osd@132.service: Main process exited, code=exited, status=1/FAILURE

Event Timeline

taavi triaged this task as Low priority.
taavi raised the priority of this task from Low to High. Mar 2 2024, 9:25 AM
dcaro changed the task status from Open to In Progress. Mar 4 2024, 12:35 PM
dcaro claimed this task.

I manually destroyed the OSD:

ceph osd destroy 132
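For reference, `ceph osd destroy` removes the OSD's authentication key and marks it destroyed while keeping its id and CRUSH position reserved for reuse; recent Ceph releases also require an explicit confirmation flag. A hedged sketch of the destroy step plus read-only checks (the exact flag requirement depends on the Ceph release; do not run this against a healthy OSD):

```shell
# Mark the failed OSD out so data rebalances away from it,
# then destroy it, keeping osd.132's id for later reuse.
# Newer releases require the confirmation flag shown here.
ceph osd out 132
ceph osd destroy 132 --yes-i-really-mean-it

# Read-only checks: osd.132 should now show as "destroyed" in the tree.
ceph osd tree | grep 'osd\.132'
ceph -s
```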

So it will need to be recreated when the disk is fixed (or the new one arrives):

ceph-volume lvm zap /dev/sdX
ceph-volume lvm create --osd-id 132 --data /dev/sdX
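Once the replacement disk is in, the zap/create pair above wipes any leftover LVM and partition metadata and rebuilds the OSD under the same id. A hedged sketch of verifying that osd.132 came back up and rejoined the cluster, assuming the standard systemd unit name seen in the log above:

```shell
# Confirm the recreated OSD daemon started cleanly.
systemctl status ceph-osd@132.service

# osd.132 should now be up and in.
ceph osd tree | grep 'osd\.132'

# Watch backfill/recovery progress until the cluster returns to HEALTH_OK.
ceph -s
ceph pg stat
```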

Back up and running