
We have noticed that 14 of our ceph hosts are reporting sector errors on all of their hard drives:
{P52904}

These errors are increasing over time, so the drives appear to be degrading (see the `changed` lines):
{P52903}

All of these machines were bought in two batches, and they make up the whole of both batches, so this might be a bad hard drive batch:
{T291987}
{T283888}

These hosts are in service and we can't take them all out at the same time, so we'll have to coordinate to replace/debug them.

I'll fill in the details with the logs/debugging from the wiki in a bit.

Thanks!

List of affected hosts:
cloudcephosd1021 - back online
cloudcephosd1022 - back online
cloudcephosd1023 - back online
cloudcephosd1024 - back online
cloudcephosd1025 - back online
cloudcephosd1026 - drained, ready for upgrade
cloudcephosd1027 - drained, ready for upgrade
cloudcephosd1028 - drained, ready for upgrade
cloudcephosd1029 - to be drained
cloudcephosd1030 - to be drained
cloudcephosd1031 - to be drained
cloudcephosd1032 - to be drained
cloudcephosd1033 - to be drained
cloudcephosd1034 - to be drained
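
For reference, the "errors are increasing over time" check described above could be sketched roughly as follows. This is only an illustration, not the script that produced the pastes: the function name `growing_error_counts`, the drive names, and the counts are all hypothetical, and it assumes the per-drive sector error counts have already been collected into two snapshots by whatever tooling is in use.

```python
def growing_error_counts(before, after):
    """Return {drive: (old, new)} for drives whose error count increased.

    `before` and `after` map a drive name to its sector error count at two
    points in time. A drive missing from `before` is treated as having had
    zero errors (an assumption made for this sketch).
    """
    changed = {}
    for drive, new_count in after.items():
        old_count = before.get(drive, 0)
        if new_count > old_count:
            changed[drive] = (old_count, new_count)
    return changed


# Hypothetical sample data, NOT taken from the affected hosts.
before = {"sda": 8, "sdb": 0, "sdc": 16}
after = {"sda": 24, "sdb": 0, "sdc": 16, "sdd": 8}

for drive, (old, new) in sorted(growing_error_counts(before, after).items()):
    print(f"{drive}: {old} -> {new}")
```

Drives whose count stays flat are left out, so only the ones that actually changed between snapshots are listed.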