While investigating the performance of the cluster, I noticed that some OSDs go
down for a couple of seconds at a time and then come back up.
For example:
root@cloudcephosd1001:~# while true; do ceph osd tree | grep down; sleep 1; done
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
47 ssd 1.74609 osd.47 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
12 ssd 1.74609 osd.12 down 1.00000 1.00000
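
To get a rough idea of how long each OSD stays down, the captured output can be tallied per OSD. The following is only a sketch: `summarize_down` is a hypothetical helper name, and it assumes the column layout shown above (OSD name in the fourth column, status in the fifth), which may differ with other `ceph osd tree` layouts.

```shell
# Hypothetical helper: count how many sampled lines show each OSD as "down".
# Assumes the layout above: column 4 is the OSD name, column 5 the status.
summarize_down() {
  awk '$5 == "down" { count[$4]++ }
       END { for (osd in count) print osd, count[osd] }' | sort
}
```

Saving the loop output to a file and running `summarize_down < file` prints one line per OSD with the number of samples in which it was down. With a one-second sleep between samples, each count is only a lower bound on the seconds spent down, since the `ceph osd tree` call itself takes time.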