[infra] Upgrade Toolforge K8s etcd nodes to Bullseye
Closed, Resolved (Public)

Event Timeline

taavi triaged this task as Medium priority. Jan 24 2024, 2:02 PM
dcaro renamed this task from Upgrade Toolforge K8s etcd nodes to Bookworm to [k8s] Upgrade Toolforge K8s etcd nodes to Bookworm. Mar 5 2024, 4:12 PM
dcaro renamed this task from [k8s] Upgrade Toolforge K8s etcd nodes to Bookworm to [infra] Upgrade Toolforge K8s etcd nodes to Bookworm. Mar 5 2024, 5:15 PM
taavi renamed this task from [infra] Upgrade Toolforge K8s etcd nodes to Bookworm to [infra] Upgrade Toolforge K8s etcd nodes to Bullseye. Mar 28 2024, 2:29 PM

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T14:42:46Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_hiera (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T14:42:51Z] <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_hiera (exit_code=0) (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T14:43:14Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_cluster (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T14:46:21Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_hiera (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T14:46:24Z] <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_hiera (exit_code=0) (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T14:48:10Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.etcd.add_node_to_cluster (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T15:19:12Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.etcd.remove_node_from_hiera (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T15:19:18Z] <andrew@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.etcd.remove_node_from_hiera (exit_code=0) (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T15:29:11Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T15:35:55Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T15:49:01Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T15:53:44Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T16:09:10Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T16:24:16Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-03-28T17:42:00Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Change #1015363 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] etcd:v3: Don't rely on 'etcd' group or user before installing etcd package

https://gerrit.wikimedia.org/r/1015363

Change #1015363 merged by Andrew Bogott:

[operations/puppet@production] etcd:v3: Don't rely on 'etcd' group or user before installing etcd package

https://gerrit.wikimedia.org/r/1015363

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-01T14:30:35Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-01T14:59:59Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

@Andrew it seems the last patch was not enough; the ownership of the certificate files needs to be sorted out too:

root@toolsbeta-test-k8s-etcd-23:~# run-puppet-agent
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud
Info: Applying configuration version '(330baf3915) Majavah - cloud puppetservers: remove hooks preventing local commit/merge/rebase'
Error: Could not find user etcd
Info: Unknown failure using insync_values? on type: File[/etc/etcd/ssl/toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud.pem] / property: owner to compare values ["etcd"] and 0
Error: /Stage[main]/Profile::Wmcs::Kubeadm::Etcd/File[/etc/etcd/ssl/toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud.pem]/owner: change from 'root' to 'etcd' failed: Could not find user etcd
Error: Could not find group etcd
Info: Unknown failure using insync_values? on type: File[/etc/etcd/ssl/toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud.pem] / property: group to compare values ["etcd"] and 0
Error: /Stage[main]/Profile::Wmcs::Kubeadm::Etcd/File[/etc/etcd/ssl/toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud.pem]/group: change from 'root' to 'etcd' failed: Could not find group etcd
Error: Could not find user etcd
Info: Unknown failure using insync_values? on type: File[/etc/etcd/ssl/toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud.priv] / property: owner to compare values ["etcd"] and 0
Error: /Stage[main]/Profile::Wmcs::Kubeadm::Etcd/File[/etc/etcd/ssl/toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud.priv]/owner: change from 'root' to 'etcd' failed: Could not find user etcd
Error: Could not find group etcd
Info: Unknown failure using insync_values? on type: File[/etc/etcd/ssl/toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud.priv] / property: group to compare values ["etcd"] and 0
Error: /Stage[main]/Profile::Wmcs::Kubeadm::Etcd/File[/etc/etcd/ssl/toolsbeta-test-k8s-etcd-23.toolsbeta.eqiad1.wikimedia.cloud.priv]/group: change from 'root' to 'etcd' failed: Could not find group etcd
Error: Could not find user etcd
Info: Unknown failure using insync_values? on type: File[/etc/etcd/ssl/ca.pem] / property: owner to compare values ["etcd"] and 0
Error: /Stage[main]/Profile::Wmcs::Kubeadm::Etcd/File[/etc/etcd/ssl/ca.pem]/owner: change from 'root' to 'etcd' failed: Could not find user etcd
Error: Could not find group etcd
Info: Unknown failure using insync_values? on type: File[/etc/etcd/ssl/ca.pem] / property: group to compare values ["etcd"] and 0
Error: /Stage[main]/Profile::Wmcs::Kubeadm::Etcd/File[/etc/etcd/ssl/ca.pem]/group: change from 'root' to 'etcd' failed: Could not find group etcd

Those resources are explicitly ordered before the package, though, so changing that might need some testing:

modules/profile/manifests/wmcs/kubeadm/etcd.pp

    file { $etcd_cert_pub:
        ensure => present,
        source => "file://${puppet_cert_pub}",
        owner  => 'etcd',
        group  => 'etcd',
        before => [Service['etcd'], Package['etcd-server']],
    }

    file { $etcd_cert_priv:
        ensure    => present,
        source    => "file://${puppet_cert_priv}",
        owner     => 'etcd',
        group     => 'etcd',
        mode      => '0640',
        show_diff => false,
        before    => [Service['etcd'], Package['etcd-server']],
    }

Yeah, there's a lot of self-contradictory explicit ordering in this code. I wish I knew if it was put there to solve anything.
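
For illustration only, here is a minimal sketch of one way the ordering conflict in the excerpt above could be untangled. It assumes the 'etcd' user and group are created when the etcd-server package is installed, and that Package['etcd-server'] and Service['etcd'] are declared elsewhere, as the excerpt's references imply. It is not necessarily what the merged change does.

```puppet
# Hypothetical reordering, not the merged patch: the files are owned by
# 'etcd', so they wait for the package that (by assumption) creates that
# user and group, while still being in place before the service is managed.
file { $etcd_cert_pub:
    ensure  => present,
    source  => "file://${puppet_cert_pub}",
    owner   => 'etcd',
    group   => 'etcd',
    require => Package['etcd-server'],
    before  => Service['etcd'],
}

file { $etcd_cert_priv:
    ensure    => present,
    source    => "file://${puppet_cert_priv}",
    owner     => 'etcd',
    group     => 'etcd',
    mode      => '0640',
    show_diff => false,
    require   => Package['etcd-server'],
    before    => Service['etcd'],
}
```

One thing to test with an ordering like this: Debian packages usually start their service at install time, so etcd may briefly start before the certificates exist.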

Change #1016346 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] profile::wmcs::kubeadm::etcd: install etcd package before referencing uid

https://gerrit.wikimedia.org/r/1016346

Change #1016346 merged by Andrew Bogott:

[operations/puppet@production] profile::wmcs::kubeadm::etcd: install etcd package before referencing uid

https://gerrit.wikimedia.org/r/1016346

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-02T14:33:57Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-02T15:06:42Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-02T16:22:28Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-02T17:25:42Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-02T17:53:13Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-02T18:20:07Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:12:08Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:19:54Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:29:14Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:35:10Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:37:37Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:43:42Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:47:04Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:49:33Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T13:54:31Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T14:16:58Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T14:32:14Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-08T14:49:57Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-09T18:09:55Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-09T18:39:26Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-09T19:28:52Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-09T21:04:23Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-09T21:09:11Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Mentioned in SAL (#wikimedia-cloud-feed) [2024-04-09T22:01:06Z] <andrew@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_etcd_node (T349207)

Andrew claimed this task.

The etcd nodes are now all on Bullseye. Moving them to Bookworm will require more work: the Bookworm nodes seem to cluster properly, but they return 404s to normal API requests like 'member list'.

Hmm. Sounds like ETCDCTL_API=3 became the default in the etcd version shipped with Bookworm.