
labstore1002 not mounting all LVs after reboot
Closed, Resolved (Public)

Description

After a high-load alarm on labstore1002 and a subsequent reboot, some labstore LVs can't be mounted:

root@labstore1002:~# pvs
  /dev/sdaw: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844081664: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844139008: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  PV             VG       Fmt  Attr PSize  PFree  
  /dev/md0       os       lvm2 a--   1.82t   1.71t
  /dev/md124     labstore lvm2 a--  10.91t 934.25g
  /dev/md126     labstore lvm2 a--  10.91t   2.91t
  /dev/md127     backup   lvm2 a--  18.19t   2.16t
  unknown device labstore lvm2 a-m  10.91t 959.24g
  unknown device labstore lvm2 a-m  10.91t      0 
root@labstore1002:~#
root@labstore1002:~# lvs
  /dev/sdaw: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844081664: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844139008: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  LV                   VG       Attr       LSize   Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  backup               backup   Vwi-a-tz--  40.00t space        25.76                                  
  journal              backup   -wi-a----- 128.00m                                                     
  jun9                 backup   Vri-a-tz--  40.00t space backup 25.64                                  
  safety               backup   Vri-a-tz--  40.00t space backup 22.17                                  
  space                backup   twi-a-tz--  16.00t              65.59  33.68                           
  maps                 labstore owi-aos---   6.00t                                                     
  maps20150820040006   labstore swi-a-s---   1.00t       maps   3.11                                   
  maps20150825040007   labstore swi-a-s---   1.00t       maps   1.60                                   
  maps20150826040006   labstore swi-a-s---   1.00t       maps   1.26                                   
  maps20150827040006   labstore swi-a-s---   1.00t       maps   0.95                                   
  others               labstore owi---s-p-  10.91t                                                     
  others20150826030006 labstore swi---s---   1.00t       others                                        
  others20150827030006 labstore swi---s---   1.00t       others                                        
  others20150828103405 labstore swi---s-p-   1.00t       others                                        
  others20150829030006 labstore swi---s---   1.00t       others                                        
  others20150830030044 labstore swi---s---   1.00t       others                                        
  scratch              labstore -wi-----p- 999.00g                                                     
  tools                labstore owi---s-p-   8.00t                                                     
  tools20150827020009  labstore swi---s---   1.00t       tools                                         
  tools20150828170740  labstore swi---s---   1.00t       tools                                         
  tools20150829020007  labstore swi---s---   1.00t       tools                                         
  tools20150830020008  labstore swi---s---   1.00t       tools                                         
  root                 os       -wi-ao----  18.62g                                                     
  var                  os       -wi-a-----  93.13g                                                     
root@labstore1002:~#

Event Timeline

fgiunchedi raised the priority of this task to Needs Triage.
fgiunchedi updated the task description.
fgiunchedi subscribed.

Trying to activate labstore/tools fails:

root@labstore1002:~# lvchange -ay labstore/tools
  /dev/sdaw: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844081664: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844139008: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  Refusing activation of partial LV labstore/tools.  Use '--activationmode partial' to override.

(We have backups on labstore2001 as of 2015-08-30T01:59:35.787Z.)

The LVM metadata archive /etc/lvm/archive/labstore_00158-1828888375.vg shows the UUIDs of the missing PVs:

physical_volumes {

         pv0 {
                 id = "nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw"
                 device = "/dev/md125"   # Hint only

                 status = ["ALLOCATABLE"]
                 flags = []
                 dev_size = 23434094592  # 10.9124 Terabytes
                 pe_start = 6144
                 pe_count = 2860606      # 10.9123 Terabytes
         }

         pv1 {
                 id = "mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u"
                 device = "/dev/md124"   # Hint only

                 status = ["ALLOCATABLE"]
                 flags = []
                 dev_size = 23434100736  # 10.9124 Terabytes
                 pe_start = 6144
                 pe_count = 2860607      # 10.9123 Terabytes
         }
root@labstore1002:/etc/lvm/archive# pvdisplay /dev/md124
  /dev/sdaw: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844081664: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844139008: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  --- Physical volume ---
  PV Name               /dev/md124
  VG Name               labstore
  PV Size               10.91 TiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              2860607
  Free PE               239167
  Allocated PE          2621440
  PV UUID               R9jxHi-c6jP-JNb1-ErMi-vKX3-yCSx-omAdhQ
   
root@labstore1002:/etc/lvm/archive# pvdisplay /dev/md125
  /dev/sdaw: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844081664: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844139008: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  Failed to find physical volume "/dev/md125"
root@labstore1002:/etc/lvm/archive# pvdisplay /dev/md126
  /dev/sdaw: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844081664: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844139008: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  --- Physical volume ---
  PV Name               /dev/md126
  VG Name               labstore
  PV Size               10.91 TiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              2860607
  Free PE               763455
  Allocated PE          2097152
  PV UUID               rsI6Vw-CNBV-YPcE-G6aS-Xog3-XeGu-4VpSeU
   
root@labstore1002:/etc/lvm/archive# pvdisplay /dev/md127
  /dev/sdaw: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844081664: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 1999844139008: Input/output error
  /dev/sdaw: read failed after 0 of 4096 at 4096: Input/output error
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  --- Physical volume ---
  PV Name               /dev/md127
  VG Name               backup
  PV Size               18.19 TiB / not usable 8.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              4767678
  Free PE               565150
  Allocated PE          4202528
  PV UUID               fMhPRR-4bJ9-Am0o-AdKG-9o3m-E2CL-QMJQlZ
   
root@labstore1002:/etc/lvm/archive#
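
A minimal sketch for pulling the PV UUIDs out of an archive file so they can be compared with what pvdisplay reports. The function name archive_pv_ids is made up for illustration, and the parsing assumes the physical_volumes layout shown above (an id line followed by a device line inside each pvN block). The device value is only a hint, as the comment in the archive itself says, so the md numbers may not match after a reboot.

```shell
# Print "pvN <uuid> <device-hint>" for each PV block in LVM metadata text on stdin.
# Hypothetical helper; assumes "id" precedes "device" inside each pvN { ... } block.
archive_pv_ids() {
  awk '
    $1 ~ /^pv[0-9]+$/ && $2 == "{" { pv = $1; next }   # entering a pvN block
    pv != "" && $1 == "id" && $2 == "=" {
      id = $3; gsub(/"/, "", id)                       # strip the quotes
    }
    pv != "" && $1 == "device" && $2 == "=" {
      dev = $3; gsub(/"/, "", dev)
      print pv, id, dev                                # device is a hint only
      pv = ""
    }
  '
}
```

Possible usage: archive_pv_ids < /etc/lvm/archive/labstore_00158-1828888375.vg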

Since the labstore1002 controller has been flaky, @yuvipanda and I tried a hard reboot, but without success. Now /dev/md126 shows no LVM metadata:

root@labstore1002:~# mdadm --detail /dev/md126
/dev/md126:
        Version : 1.2
  Creation Time : Tue Jul  7 15:16:11 2015
     Raid Level : raid10
     Array Size : 11717050368 (11174.25 GiB 11998.26 GB)
  Used Dev Size : 1952841728 (1862.38 GiB 1999.71 GB)
   Raid Devices : 12
  Total Devices : 12
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Sun Aug  2 17:14:57 2015
          State : clean 
 Active Devices : 12
Working Devices : 12
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : labstore:shelf44
           UUID : 178c44d8:10de2002:baca784b:6f1d26bb
         Events : 11504

    Number   Major   Minor   RaidDevice State
       0       8      192        0      active sync set-A   /dev/sdm
       1       8      208        1      active sync set-B   /dev/sdn
       2       8      224        2      active sync set-A   /dev/sdo
       3       8      240        3      active sync set-B   /dev/sdp
       4      65        0        4      active sync set-A   /dev/sdq
       5      65       16        5      active sync set-B   /dev/sdr
       6      65       32        6      active sync set-A   /dev/sds
       7      65       48        7      active sync set-B   /dev/sdt
       8      65       64        8      active sync set-A   /dev/sdu
       9      65       80        9      active sync set-B   /dev/sdv
      10      65       96       10      active sync set-A   /dev/sdw
      11      65      112       11      active sync set-B   /dev/sdx
root@labstore1002:~# pvdisplay /dev/md126
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  Failed to find physical volume "/dev/md126"
root@labstore1002:~# mdadm --detail /dev/md124
/dev/md124:
        Version : 1.2
  Creation Time : Tue Jul  7 15:13:44 2015
     Raid Level : raid10
     Array Size : 11717050368 (11174.25 GiB 11998.26 GB)
  Used Dev Size : 1952841728 (1862.38 GiB 1999.71 GB)
   Raid Devices : 12
  Total Devices : 12
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Sun Aug 30 11:17:12 2015
          State : clean 
 Active Devices : 12
Working Devices : 12
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : labstore:shelf23
           UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
         Events : 89262

    Number   Major   Minor   RaidDevice State
       0      65      224        0      active sync set-A   /dev/sdae
       1      66       64        1      active sync set-B   /dev/sdak
       2      65      240        2      active sync set-A   /dev/sdaf
       3      66       80        3      active sync set-B   /dev/sdal
       4      66        0        4      active sync set-A   /dev/sdag
       5      66       96        5      active sync set-B   /dev/sdam
       6      66       16        6      active sync set-A   /dev/sdah
       7      66      112        7      active sync set-B   /dev/sdan
       8      66       32        8      active sync set-A   /dev/sdai
       9      66      128        9      active sync set-B   /dev/sdao
      10      66       48       10      active sync set-A   /dev/sdaj
      11      66      144       11      active sync set-B   /dev/sdap
root@labstore1002:~# pvdisplay /dev/md124
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  --- Physical volume ---
  PV Name               /dev/md124
  VG Name               labstore
  PV Size               10.91 TiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              2860607
  Free PE               239167
  Allocated PE          2621440
  PV UUID               R9jxHi-c6jP-JNb1-ErMi-vKX3-yCSx-omAdhQ
   
root@labstore1002:~#
root@labstore1002:~# mdadm --detail /dev/md125
/dev/md125:
        Version : 1.2
  Creation Time : Tue Jul  7 15:14:44 2015
     Raid Level : raid10
     Array Size : 11717050368 (11174.25 GiB 11998.26 GB)
  Used Dev Size : 1952841728 (1862.38 GiB 1999.71 GB)
   Raid Devices : 12
  Total Devices : 12
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Sun Aug 30 09:57:25 2015
          State : clean 
 Active Devices : 12
Working Devices : 12
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : labstore:shelf32
           UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
         Events : 18493

    Number   Major   Minor   RaidDevice State
       0      66      160        0      active sync set-A   /dev/sdaq
       1      65      128        1      active sync set-B   /dev/sdy
       2      66      176        2      active sync set-A   /dev/sdar
       3      65      144        3      active sync set-B   /dev/sdz
       4      66      192        4      active sync set-A   /dev/sdas
       5      65      160        5      active sync set-B   /dev/sdaa
       6      66      208        6      active sync set-A   /dev/sdat
       7      65      176        7      active sync set-B   /dev/sdab
       8      66      224        8      active sync set-A   /dev/sdau
       9      65      192        9      active sync set-B   /dev/sdac
      10      66      240       10      active sync set-A   /dev/sdav
      11      65      208       11      active sync set-B   /dev/sdad
root@labstore1002:~# pvdisplay /dev/md125
  Couldn't find device with uuid nda3CC-Lx5b-j3QA-seRB-vj2t-3vPI-ZYCfrw.
  Couldn't find device with uuid mun2PU-aeD8-AE0L-Zof6-LUvc-E6i5-U3jP0u.
  --- Physical volume ---
  PV Name               /dev/md125
  VG Name               labstore
  PV Size               10.91 TiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              2860607
  Free PE               763455
  Allocated PE          2097152
  PV UUID               rsI6Vw-CNBV-YPcE-G6aS-Xog3-XeGu-4VpSeU
root@labstore1002:~# cat /proc/mdstat 
Personalities : [raid1] [raid6] [raid5] [raid4] [raid0] [raid10] 
md124 : active (auto-read-only) raid10 sdaj[10] sdai[8] sdan[7] sdal[3] sdae[0] sdam[5] sdag[4] sdak[1] sdao[9] sdaf[2] sdap[11] sdah[6]
      11717050368 blocks super 1.2 512K chunks 2 near-copies [12/12] [UUUUUUUUUUUU]
      bitmap: 0/88 pages [0KB], 65536KB chunk

md125 : active (auto-read-only) raid10 sdau[8] sdar[2] sdav[10] sdaq[0] sdat[6] sdaa[5] sdas[4] sdab[7] sdy[1] sdac[9] sdz[3] sdad[11]
      11717050368 blocks super 1.2 512K chunks 2 near-copies [12/12] [UUUUUUUUUUUU]
      bitmap: 0/88 pages [0KB], 65536KB chunk

md126 : active (auto-read-only) raid10 sdv[9] sdp[3] sdu[8] sdr[5] sdt[7] sds[6] sdq[4] sdo[2] sdn[1] sdm[0] sdx[11] sdw[10]
      11717050368 blocks super 1.2 512K chunks 2 near-copies [12/12] [UUUUUUUUUUUU]
      bitmap: 0/88 pages [0KB], 65536KB chunk

md127 : active raid0 sdh[5] sdg[4] sdi[6] sdj[7] sde[1] sdc[3] sdl[9] sdd[0] sdf[2] sdk[8]
      19528417280 blocks super 1.2 512k chunks
      
md3 : active (auto-read-only) raid6 sday1[0] sdax1[11] sdaw1[10] sdbh1[9] sdbg1[8] sdbf1[7] sdbe1[6] sdbd1[5] sdbc1[4] sdbb1[3] sdba1[2] sdaz1[1]
      19528391680 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/12] [UUUUUUUUUUUU]
      
md0 : active raid1 sda1[0] sdb1[1]
      1952839680 blocks super 1.2 [2/2] [UU]
      bitmap: 1/15 pages [4KB], 65536KB chunk

unused devices: <none>

I've tried to assemble the missing raid slices manually:

mdadm --assemble /dev/md/slice51 --uuid 0747643d:b89b36ff:57156095:c33694fc --verbose
mdadm --assemble /dev/md/slice15 --uuid 294ea55f:e0fe2c16:eba6f8a7:c4025a70 --verbose

That worked and all PVs are now visible; however, the new arrays are missing disks:

root@labstore1002:/dev# cat /proc/mdstat 
Personalities : [raid1] [raid6] [raid5] [raid4] [raid0] [raid10] 
md122 : active (auto-read-only) raid10 sdbo[1] sdbt[11] sdbs[9] sdbr[7] sdbq[5] sdbp[3]
      11717050368 blocks super 1.2 512K chunks 2 near-copies [12/6] [_U_U_U_U_U_U]
      bitmap: 0/88 pages [0KB], 65536KB chunk

md123 : active (auto-read-only) raid10 sdbi[1] sdbn[11] sdbm[9] sdbl[7] sdbk[5] sdbj[3]
      11717047296 blocks super 1.2 512K chunks 2 near-copies [12/6] [_U_U_U_U_U_U]
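
The [12/6] field is what marks these arrays as degraded: 12 raid devices expected, 6 active. A small sketch for listing such arrays from mdstat-formatted text follows. degraded_arrays is a hypothetical helper name, and it assumes the status line carrying [total/active] follows the array's "mdN :" line, as in the output above.

```shell
# List arrays whose active device count is below the expected count.
# Reads /proc/mdstat-formatted text on stdin; prints "mdN active/total".
degraded_arrays() {
  awk '
    $1 ~ /^md/ && $2 == ":" { name = $1; next }        # remember the array name
    name != "" && match($0, /\[[0-9]+\/[0-9]+\]/) {
      # the matched text looks like [12/6]: drop the brackets, split on the slash
      split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
      if (n[2] + 0 < n[1] + 0) print name, n[2] "/" n[1]
      name = ""
    }
  '
}
```

Possible usage: degraded_arrays < /proc/mdstat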

List of array UUIDs found by examining all disks currently visible:

root@labstore1002:/dev# for i in sd* ; do mdadm --examine $i | grep -e ^sd -e UUID -e Name ; done
sda:
sda1:
     Array UUID : 805a4496:1ab39c3a:6c326e92:3fba8980
           Name : labstore1002:0  (local to host labstore1002)
    Device UUID : 6ebefebd:95110bbd:7db27d8b:54ca8563
sdaa:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 43933455:a20d249a:abc816ab:3912df57
sdab:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 4569c3fa:b5409b89:06a84932:ef90f85b
sdac:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 91de860e:99f16eb6:d4e7af20:0c6bce44
sdad:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : aeda8cf0:5053b27c:1ef76480:c5b52616
sdae:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : 73aa3390:3ff554b3:f1fcc2ff:13a93c31
sdaf:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : eed00a82:0302ae02:7b113037:b57a5e07
sdag:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : feaf6b49:bd174416:50291d81:cfad783b
sdah:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : e552fd44:a6b600a8:f4d58985:78300b72
sdai:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : f9f48a9e:eec2f30c:1b8e4857:d402399a
sdaj:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : 85385f80:f46781d9:5122ebd3:89c220f1
sdak:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : 43ba0f2c:f278a398:5dc73a9e:d85e0aa6
sdal:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : 368283cd:7041465e:fdb31293:5f9c7243
sdam:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : 2c82fafe:ea2a434a:df0decb5:10f7993a
sdan:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : 9a9f22d7:7a7ab234:0db17d43:0bf7009c
sdao:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : 7269036b:a73f94b3:4fac8ce0:2959495d
sdap:
     Array UUID : 196dc9b1:955b6999:4bf3c8cf:185814d2
           Name : labstore:shelf23
    Device UUID : bde7f39a:35edd252:f30fb5a0:65886df6
sdaq:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 1e739ec8:15db4b1e:67e008c4:9e1a1b8d
sdar:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : debc10c4:d15c2321:f9ee7e87:7054d311
sdas:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 80daa7e6:9a997735:77491e0d:d9d0d12e
sdat:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 699d3807:3235eef6:c9bca196:989170e6
sdau:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 0d363a85:074a42ec:54c430f1:7ed75434
sdav:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 00efb780:c3145b06:df3f0d56:514b826c
sdaw:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : 87d4a1ec:66354727:eace889e:b51f26bd
sdaw1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 25f42065:6944f5b4:6ef222b2:31f12155
sdax:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : fd6edcc5:e934c6dd:8e58021c:a8a28efe
sdax1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : e49fd35f:483414e7:a5648cf9:dd93ae40
sday:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : e622c212:3a06d1b5:bf5f66ff:fd04c058
sday1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 4af7f202:125280bf:ef53694b:0a9cd4df
sdaz:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : 153ecf37:e54cdce4:28c14a3b:172cd7c3
sdaz1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 73fcf692:f8b0c1d9:f7107444:0a27ccad
sdb:
sdb1:
     Array UUID : 805a4496:1ab39c3a:6c326e92:3fba8980
           Name : labstore1002:0  (local to host labstore1002)
    Device UUID : eb757ba4:742dd71d:48d74ed5:5dce087e
sdba:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : 9205bb3e:c58beea9:6209eced:da1566ba
sdba1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 0ed27278:43dcc25a:b90e21fb:5ab64c4e
sdbb:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : 87eb906f:871deac2:ff0fde63:1b56c7a7
sdbb1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 6d9a9039:0913a74d:0f31992a:4a7478ba
sdbc:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : 522d63ec:48d77c6c:5c84d30a:f640e7c3
sdbc1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 36183187:f8509ff6:2fe974c5:de5ab485
sdbd:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : 246ca0f0:b6248dc7:90d9bbff:08c8e8a1
sdbd1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 650fd5d8:af4d1d86:1f4c7293:7220b931
sdbe:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : fa846e6f:869094eb:9ed3731a:65ba6f45
sdbe1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 0b3ea75d:d42fbb18:a90c9fec:ecc6bcd4
sdbf:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : 0028ffd0:9c65eb0e:a1564291:71a8842b
sdbf1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 8bab521c:f9db3c58:625ea962:b99be05c
sdbg:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : 1d52b645:7a1eac4e:a8b28717:443f64e3
sdbg1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : af1c9f9e:e20f41c5:5ac0b7c0:933dec44
sdbh:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : 1b3e6255:68340f30:fbf65d35:6dac7c03
sdbh1:
     Array UUID : 433e6763:852c530e:a56becea:e8074bed
           Name : labstore1001:3
    Device UUID : 0d5bd9c6:32c550d0:19aace07:3f271c76
sdbi:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : d8f62679:987ed01d:53a7fa04:50f2f73a
sdbj:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : 02ac30a6:a36598bc:a62167ba:c9289aa8
sdbk:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : b0c69dfe:2ecc9011:bef69a1e:7c918493
sdbl:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : c063b8a6:411f4046:3d5620af:5cd85c59
sdbm:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : 3eb5c268:56bea2f4:048e0914:680f13e7
sdbn:
     Array UUID : 0747643d:b89b36ff:57156095:c33694fc
           Name : labstore1001:slice51
    Device UUID : b3703f52:47ca35fd:4c685c24:2d9ae0c8
sdbo:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : 90286d33:7821d6e7:c3a566bc:97e1f52f
sdbp:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : 7a3aa2d1:baf2e72e:6a77231a:6de3f549
sdbq:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : 98aeffa1:a2ad6d6e:f4d604ef:f2dd5532
sdbr:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : d061733e:07ed5dda:f7e8eec4:2c49bd78
sdbs:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : b1fdc291:afcde370:e890aa26:05f3f9a0
sdbt:
     Array UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
           Name : labstore1002:slice15  (local to host labstore1002)
    Device UUID : 9b7090b1:1ee3bc06:ccdace88:ccc3b40e
sdc:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 2707fb8b:3b09238b:803168c1:66828afb
sdd:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 13286cba:686026a6:a9c653f2:82ac9aeb
sde:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 48b9a396:3d9bd4a4:26113c6e:2c32d1f8
sdf:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 77d6e9d7:6478d4b4:e45d8a78:86c56205
sdg:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 912bf3b1:24d17c92:832e9043:c1355be4
sdh:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 20416282:9f821212:dc0b0cb4:278987f4
sdi:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 2d07b210:c83ed071:65554027:c3beedc6
sdj:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : dd0cea8c:7101c81a:54a9f56d:ffc45eaf
sdk:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 14eefde0:05bab714:7dc0a646:e9698899
sdl:
     Array UUID : 1748672f:ddf47a81:9db19a1b:33c1e793
           Name : labstore1002:backup  (local to host labstore1002)
    Device UUID : 52344615:bcff2ada:71b9063c:4b0629e7
sdm:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : c121aa21:a8f26b5c:ed3c519b:8ea1d07b
sdn:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 9b04692c:48d4cd2b:16579d35:9a578551
sdo:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 9069b551:0c69b9a7:f7e7361c:17a667dd
sdp:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 30319781:5085cc2d:ec7ade39:f019ce30
sdq:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 24bfb82a:7a595bc0:2c117ae6:ed85228c
sdr:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 1b39fe42:e4998a03:433b3031:77e2000e
sds:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : d3335df2:cdefadfa:de879778:87566b27
sdt:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 78a249ca:8d87aa9b:90c5339f:fa75f437
sdu:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 03782b7a:afa24619:21d825a4:5213ef78
sdv:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 819554d7:12da7fed:08e14281:1ee0b687
sdw:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 1d1298ce:6cf90f44:80c93cd6:49aaab3f
sdx:
     Array UUID : 178c44d8:10de2002:baca784b:6f1d26bb
           Name : labstore:shelf44
    Device UUID : 7c1f51bf:88265703:99d0440d:0a5974ff
sdy:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : 5b6708cb:7fa31527:423fc41e:fe4246e8
sdz:
     Array UUID : 9da853c9:4fecf6b8:3d68ba7d:c1d49909
           Name : labstore:shelf32
    Device UUID : dcb12e15:2128044f:b6b5030a:7c76efe3
root@labstore1002:/dev#
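
With this many disks the listing is hard to scan by eye. A rough sketch that counts how many devices carry each Array UUID, so the totals can be compared with the Raid Devices value from mdadm --detail. members_per_array is a hypothetical name, and the field positions assume the "Array UUID : ..." lines exactly as mdadm --examine prints them above.

```shell
# Count devices per array UUID from "mdadm --examine" output on stdin.
# Prints "<count> <array-uuid>", sorted by UUID.
members_per_array() {
  awk '$1 == "Array" && $2 == "UUID" { count[$4]++ }
       END { for (u in count) print count[u], u }' | LC_ALL=C sort -k2
}
```

Possible usage: for i in /dev/sd*; do mdadm --examine "$i"; done 2>/dev/null | members_per_array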

The new raid10 arrays each show half of their disks missing, possibly on purpose:

root@labstore1002:/dev# mdadm  --detail /dev/md122
/dev/md122:
        Version : 1.2
  Creation Time : Thu Jun 18 12:46:18 2015
     Raid Level : raid10
     Array Size : 11717050368 (11174.25 GiB 11998.26 GB)
  Used Dev Size : 1952841728 (1862.38 GiB 1999.71 GB)
   Raid Devices : 12
  Total Devices : 6
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Sun Aug 30 09:57:25 2015
          State : clean, degraded 
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : labstore1002:slice15  (local to host labstore1002)
           UUID : 294ea55f:e0fe2c16:eba6f8a7:c4025a70
         Events : 37284

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1      68       32        1      active sync set-B   /dev/sdbo
       4       0        0        4      removed
       3      68       48        3      active sync set-B   /dev/sdbp
       8       0        0        8      removed
       5      68       64        5      active sync set-B   /dev/sdbq
      12       0        0       12      removed
       7      68       80        7      active sync set-B   /dev/sdbr
      16       0        0       16      removed
       9      68       96        9      active sync set-B   /dev/sdbs
      20       0        0       20      removed
      11      68      112       11      active sync set-B   /dev/sdbt
root@labstore1002:/dev# mdadm  --detail /dev/md123
/dev/md123:
        Version : 1.2
  Creation Time : Tue Jun 16 18:21:10 2015
     Raid Level : raid10
     Array Size : 11717047296 (11174.25 GiB 11998.26 GB)
  Used Dev Size : 1952841216 (1862.37 GiB 1999.71 GB)
   Raid Devices : 12
  Total Devices : 6
    Persistence : Superblock is persistent

    Update Time : Sun Aug 30 09:57:30 2015
          State : clean, degraded 
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

           Name : labstore1001:slice51
           UUID : 0747643d:b89b36ff:57156095:c33694fc
         Events : 17165

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1      67      192        1      active sync set-B   /dev/sdbi
       4       0        0        4      removed
       3      67      208        3      active sync set-B   /dev/sdbj
       8       0        0        8      removed
       5      67      224        5      active sync set-B   /dev/sdbk
      12       0        0       12      removed
       7      67      240        7      active sync set-B   /dev/sdbl
      16       0        0       16      removed
       9      68        0        9      active sync set-B   /dev/sdbm
      20       0        0       20      removed
      11      68       16       11      active sync set-B   /dev/sdbn

Assuming the drives are missing on purpose, activating the LVs worked with lvchange -ay labstore/others and lvchange -ay labstore/tools (along with removing some 'others' snapshots so that activation is faster).

After that: mount all filesystems, run sync-exports to bind-mount them to /exp, and finally run start-nfs to let traffic in.
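
The order of those steps matters, since NFS should not start before the bind mounts exist. A hedged sketch of the sequence as a dry-run wrapper: run and recover_labstore are hypothetical names, mount -a is an assumption about how the filesystems were mounted, and sync-exports and start-nfs are the site tools named above, called here without arguments. By default it only prints the commands.

```shell
# Print each command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Each step runs only if the previous one succeeded, so start-nfs is
# never reached unless sync-exports has completed.
recover_labstore() {
  run lvchange -ay labstore/others &&
  run lvchange -ay labstore/tools &&
  run mount -a &&        # assumption: the filesystems are listed in fstab
  run sync-exports &&
  run start-nfs
}
```

Calling recover_labstore prints the plan; setting DRY_RUN=0 first would actually execute it.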

Aklapper triaged this task as Unbreak Now! priority. Aug 30 2015, 1:30 PM

Actionables:

  • start-nfs doesn't seem to have launched or checked sync-exports, so the bind mounts weren't present when NFS was first started
  • I couldn't find an equivalent stop-nfs
  • the degraded raid10 arrays didn't get assembled upon reboot; manual assembly worked
  • possibly related: the arrays' metadata contains a mixture of "HOMEHOST" settings
root@labstore1002:~# mdadm --detail /dev/md122 |grep Name
           Name : labstore1002:slice15  (local to host labstore1002)
root@labstore1002:~# mdadm --detail /dev/md123 |grep Name
           Name : labstore1001:slice51
root@labstore1002:~# mdadm --detail /dev/md124 |grep Name
           Name : labstore:shelf23
root@labstore1002:~# mdadm --detail /dev/md125 |grep Name
           Name : labstore:shelf32
root@labstore1002:~# mdadm --detail /dev/md126 |grep Name
           Name : labstore:shelf44
root@labstore1002:~# mdadm --detail /dev/md127 |grep Name
           Name : labstore1002:backup  (local to host labstore1002)
root@labstore1002:~# mdadm --detail /dev/md3 |grep Name
           Name : labstore1001:3
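
A small sketch for spotting which arrays carry a homehost prefix other than the local host. foreign_arrays is a hypothetical helper, and it assumes the "Name : host:array" format shown above; whether this mismatch is really what prevented assembly at boot is not established here.

```shell
# Print array names whose homehost prefix differs from the host given as $1.
# Reads "mdadm --detail" (or --examine) output on stdin.
foreign_arrays() {
  awk -v host="$1" '
    $1 == "Name" && $2 == ":" {
      n = split($3, part, ":")                  # part[1] is the homehost prefix
      if (n >= 2 && part[1] != host) print $3
    }
  '
}
```

Possible usage: mdadm --detail /dev/md123 | foreign_arrays "$(hostname -s)"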
yuvipanda claimed this task.

Sorted now.