### Overview
We have two sets of cloudelastic nodes that were provisioned at different times:
- `cloudelastic100[1-4]` is the oldest group
- `cloudelastic100[5-6]` were added more recently
Our pre-existing `elasticsearch-readahead` udev rule is failing on `cloudelastic100[5-6]`. This may be because the two groups are partitioned differently. See the final section, `Additional context on disk layout`, for more info.
### Proposed Solution
Figure out how to change the udev rule to make it more robust for different device configurations, or (ugh) change the partitioning on the newer instances to match the old instances, if it turns out that the partitioning is the problem.
It looks like https://github.com/wikimedia/puppet/blob/dfa5f9795ec7cc681fa7bbd5c90da7b3dd4823aa/modules/profile/manifests/elasticsearch/cirrus.pp#L16 feeds into https://github.com/wikimedia/puppet/blob/dfa5f9795ec7cc681fa7bbd5c90da7b3dd4823aa/modules/profile/manifests/elasticsearch/cirrus.pp#L119-L121, so we should just need different hiera values for `cloudelastic100[1-4]` vs `cloudelastic100[5-6]`.
### Additional context on disk layout
Putting this at the bottom of the ticket writeup since it takes up a lot of space.
```
user@Ryans-MacBook-Pro-3 ~/wmf/puppet [production]% ssh cloudelastic1005.wikimedia.org
Linux cloudelastic1005 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1 (2020-01-20) x86_64
Debian GNU/Linux 9.13 (stretch)
cloudelastic1005 is a elasticsearch cloud elastic cirrus (elasticsearch::cloudelastic)
The last Puppet run was at Fri Oct 16 01:01:14 UTC 2020 (13 minutes ago).
Last puppet commit: (dfa5f9795e) Dzahn - base::environment: remove lint-ignore that ignores nothing
Debian GNU/Linux 9 auto-installed on Tue May 5 14:54:52 UTC 2020.
Last login: Thu Oct 8 23:30:58 2020 from 2620:0:863:1:198:35:26:6
ryankemper@cloudelastic1005:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
├─sda1 8:1 0 285M 0 part
└─sda2 8:2 0 1.8T 0 part
└─md0 9:0 0 5.2T 0 raid10
├─vg0-swap 253:0 0 976M 0 lvm [SWAP]
├─vg0-root 253:1 0 74.5G 0 lvm /
└─vg0-srv 253:2 0 5.2T 0 lvm /srv
sdb 8:16 0 1.8T 0 disk
├─sdb1 8:17 0 285M 0 part
└─sdb2 8:18 0 1.8T 0 part
└─md0 9:0 0 5.2T 0 raid10
├─vg0-swap 253:0 0 976M 0 lvm [SWAP]
├─vg0-root 253:1 0 74.5G 0 lvm /
└─vg0-srv 253:2 0 5.2T 0 lvm /srv
sdc 8:32 0 1.8T 0 disk
├─sdc1 8:33 0 285M 0 part
└─sdc2 8:34 0 1.8T 0 part
└─md0 9:0 0 5.2T 0 raid10
├─vg0-swap 253:0 0 976M 0 lvm [SWAP]
├─vg0-root 253:1 0 74.5G 0 lvm /
└─vg0-srv 253:2 0 5.2T 0 lvm /srv
sdd 8:48 0 1.8T 0 disk
├─sdd1 8:49 0 285M 0 part
└─sdd2 8:50 0 1.8T 0 part
└─md0 9:0 0 5.2T 0 raid10
├─vg0-swap 253:0 0 976M 0 lvm [SWAP]
├─vg0-root 253:1 0 74.5G 0 lvm /
└─vg0-srv 253:2 0 5.2T 0 lvm /srv
sde 8:64 0 1.8T 0 disk
├─sde1 8:65 0 285M 0 part
└─sde2 8:66 0 1.8T 0 part
└─md0 9:0 0 5.2T 0 raid10
├─vg0-swap 253:0 0 976M 0 lvm [SWAP]
├─vg0-root 253:1 0 74.5G 0 lvm /
└─vg0-srv 253:2 0 5.2T 0 lvm /srv
sdf 8:80 0 1.8T 0 disk
├─sdf1 8:81 0 285M 0 part
└─sdf2 8:82 0 1.8T 0 part
└─md0 9:0 0 5.2T 0 raid10
├─vg0-swap 253:0 0 976M 0 lvm [SWAP]
├─vg0-root 253:1 0 74.5G 0 lvm /
└─vg0-srv 253:2 0 5.2T 0 lvm /srv
ryankemper@cloudelastic1005:~$ exit
logout
Connection to cloudelastic1005.wikimedia.org closed.
user@Ryans-MacBook-Pro-3 ~/wmf/puppet [production]% ssh cloudelastic1004.wikimedia.org
Linux cloudelastic1004 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u3 (2019-06-16) x86_64
Debian GNU/Linux 9.13 (stretch)
cloudelastic1004 is a elasticsearch cloud elastic cirrus (elasticsearch::cloudelastic)
The last Puppet run was at Fri Oct 16 01:03:03 UTC 2020 (12 minutes ago).
Last puppet commit: (dfa5f9795e) Dzahn - base::environment: remove lint-ignore that ignores nothing
Debian GNU/Linux 9 auto-installed on Fri Feb 8 16:29:47 UTC 2019.
Last login: Fri Oct 16 01:14:15 2020 from 2620:0:863:1:198:35:26:6
ryankemper@cloudelastic1004:~$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 46.6G 0 part
│ └─md0 9:0 0 139.6G 0 raid10 /
└─sda3 8:3 0 1.7T 0 part
└─md1 9:1 0 5.1T 0 raid10
└─cloudelastic1004--vg-data 253:0 0 5.1T 0 lvm /srv
sdb 8:16 0 1.8T 0 disk
├─sdb1 8:17 0 1M 0 part
├─sdb2 8:18 0 46.6G 0 part
│ └─md0 9:0 0 139.6G 0 raid10 /
└─sdb3 8:19 0 1.7T 0 part
└─md1 9:1 0 5.1T 0 raid10
└─cloudelastic1004--vg-data 253:0 0 5.1T 0 lvm /srv
sdc 8:32 0 1.8T 0 disk
├─sdc1 8:33 0 1M 0 part
├─sdc2 8:34 0 46.6G 0 part
│ └─md0 9:0 0 139.6G 0 raid10 /
└─sdc3 8:35 0 1.7T 0 part
└─md1 9:1 0 5.1T 0 raid10
└─cloudelastic1004--vg-data 253:0 0 5.1T 0 lvm /srv
sdd 8:48 0 1.8T 0 disk
├─sdd1 8:49 0 1M 0 part
├─sdd2 8:50 0 46.6G 0 part
│ └─md0 9:0 0 139.6G 0 raid10 /
└─sdd3 8:51 0 1.7T 0 part
└─md1 9:1 0 5.1T 0 raid10
└─cloudelastic1004--vg-data 253:0 0 5.1T 0 lvm /srv
sde 8:64 0 1.8T 0 disk
├─sde1 8:65 0 1M 0 part
├─sde2 8:66 0 46.6G 0 part
│ └─md0 9:0 0 139.6G 0 raid10 /
└─sde3 8:67 0 1.7T 0 part
└─md1 9:1 0 5.1T 0 raid10
└─cloudelastic1004--vg-data 253:0 0 5.1T 0 lvm /srv
sdf 8:80 0 1.8T 0 disk
├─sdf1 8:81 0 1M 0 part
├─sdf2 8:82 0 46.6G 0 part
│ └─md0 9:0 0 139.6G 0 raid10 /
└─sdf3 8:83 0 1.7T 0 part
└─md1 9:1 0 5.1T 0 raid10
└─cloudelastic1004--vg-data 253:0 0 5.1T 0 lvm /srv
```
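Note that the disk configurations of the two instance groups differ substantially. On `cloudelastic1004`, `/` sits directly on `md0` and `/srv` is the only LVM volume (on `md1`). On `cloudelastic1005`, a single `md0` array backs the `vg0` volume group, which holds swap, root, and `/srv`.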
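To make the difference concrete, here is a minimal Python sketch that picks out the device mounted at `/srv` from a few rows transcribed from the `lsblk` output above and derives its kernel name. It assumes that major number 253 is device-mapper on these hosts (as the `lvm` rows suggest) and that device-mapper nodes are named `dm-<minor>`. The helper `srv_device` and the row tuples are illustrative only and are not part of the puppet code.

```python
# Rows transcribed from the lsblk output in this ticket: (NAME, MAJ:MIN, MOUNTPOINT).
OLD_ROWS = [  # cloudelastic1004
    ("md0", "9:0", "/"),
    ("cloudelastic1004--vg-data", "253:0", "/srv"),
]
NEW_ROWS = [  # cloudelastic1005
    ("vg0-swap", "253:0", "[SWAP]"),
    ("vg0-root", "253:1", "/"),
    ("vg0-srv", "253:2", "/srv"),
]

# Assumption: 253 is the device-mapper major on these hosts.
DM_MAJOR = "253"


def srv_device(rows, mountpoint="/srv"):
    """Return (lsblk name, kernel name) of the device mounted at `mountpoint`."""
    for name, maj_min, mnt in rows:
        if mnt == mountpoint:
            major, minor = maj_min.split(":")
            # Device-mapper nodes appear in the kernel as dm-<minor>.
            kernel = f"dm-{minor}" if major == DM_MAJOR else name
            return name, kernel
    return None


if __name__ == "__main__":
    print("old:", srv_device(OLD_ROWS))
    print("new:", srv_device(NEW_ROWS))
```

With these rows, `/srv` resolves to `dm-0` on `cloudelastic1004` but to `dm-2` on `cloudelastic1005`. If the udev rule matches a fixed device name (an assumption worth checking against the rule itself), that alone could explain why it does not apply on the newer hosts.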
```
ryankemper@cloudelastic1003:~$ sudo lvdisplay
--- Logical volume ---
LV Path /dev/cloudelastic1003-vg/data
LV Name data
VG Name cloudelastic1003-vg
LV UUID eOF5d7-D7wn-yMWZ-EJx0-V9Yg-udiq-ifoLXh
LV Write Access read/write
LV Creation host, time cloudelastic1003, 2018-08-02 21:36:10 +0000
LV Status available
# open 1
LV Size 5.10 TiB
Current LE 1337704
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 32
Block device 253:0
ryankemper@cloudelastic1003:~$ exit
logout
Connection to cloudelastic1003.wikimedia.org closed.
user@Ryans-MacBook-Pro-3 ~/wmf/puppet [production]% ssh cloudelastic1005.wikimedia.org
Linux cloudelastic1005 4.9.0-12-amd64 #1 SMP Debian 4.9.210-1 (2020-01-20) x86_64
Debian GNU/Linux 9.13 (stretch)
cloudelastic1005 is a elasticsearch cloud elastic cirrus (elasticsearch::cloudelastic)
The last Puppet run was at Fri Oct 16 01:01:14 UTC 2020 (16 minutes ago).
Last puppet commit: (dfa5f9795e) Dzahn - base::environment: remove lint-ignore that ignores nothing
Debian GNU/Linux 9 auto-installed on Tue May 5 14:54:52 UTC 2020.
Last login: Fri Oct 16 01:14:22 2020 from 2620:0:863:1:198:35:26:6
ryankemper@cloudelastic1005:~$ sudo lvdisplay
--- Logical volume ---
LV Path /dev/vg0/swap
LV Name swap
VG Name vg0
LV UUID NPf84w-vdtD-vAm7-H1gV-mpfK-Hpg2-1HOAVP
LV Write Access read/write
LV Creation host, time cloudelastic1005, 2020-05-05 14:50:16 +0000
LV Status available
# open 2
LV Size 976.00 MiB
Current LE 244
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 32
Block device 253:0
--- Logical volume ---
LV Path /dev/vg0/root
LV Name root
VG Name vg0
LV UUID VIZSuW-AtjS-G0sQ-j8rm-ptJo-8DlN-34ogQe
LV Write Access read/write
LV Creation host, time cloudelastic1005, 2020-05-05 14:50:16 +0000
LV Status available
# open 1
LV Size 74.50 GiB
Current LE 19073
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 6144
Block device 253:1
--- Logical volume ---
LV Path /dev/vg0/srv
LV Name srv
VG Name vg0
LV UUID KjyN08-wQ0A-ZTl2-veuS-bcsa-swOy-O8wrRl
LV Write Access read/write
LV Creation host, time cloudelastic1005, 2020-05-05 14:50:16 +0000
LV Status available
# open 1
LV Size 5.16 TiB
Current LE 1353937
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 6144
Block device 253:2
```
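The read-ahead values in the `lvdisplay` output are reported in sectors. As a rough sketch, assuming 512-byte sectors (the unit LVM uses for read-ahead), the following Python pulls the current value for each logical volume from an excerpt of the output above and converts it to KiB. The function names are hypothetical helpers for illustration.

```python
SECTOR_BYTES = 512  # assumption: read-ahead is counted in 512-byte sectors

# Trimmed excerpt of the lvdisplay output shown in this ticket.
EXCERPT = """\
LV Path /dev/cloudelastic1003-vg/data
Read ahead sectors auto
- currently set to 32
LV Path /dev/vg0/srv
Read ahead sectors auto
- currently set to 6144
"""


def sectors_to_kib(sectors):
    """Convert a sector count to KiB."""
    return sectors * SECTOR_BYTES // 1024


def parse_readahead(text):
    """Map each 'LV Path' to its 'currently set to' read-ahead, in sectors."""
    result = {}
    path = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("LV Path"):
            path = line.split()[-1]
        elif line.startswith("- currently set to") and path is not None:
            result[path] = int(line.split()[-1])
    return result


if __name__ == "__main__":
    for lv, sectors in parse_readahead(EXCERPT).items():
        print(f"{lv}: {sectors} sectors = {sectors_to_kib(sectors)} KiB")
```

By this conversion the `data` volume on `cloudelastic1003` is at 16 KiB while `srv` on `cloudelastic1005` is at 3072 KiB, which is consistent with the rule not taking effect on the newer host.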