Toolforge: rebuild the new k8s toolsbeta deployment and write final docs
Closed, Resolved (Public)

Description

We now know how to do this. The puppet tree and the network design have changed enough that we should completely rebuild the k8s cluster in toolsbeta before doing the final build in the tools project.

While at it, write more "official" admin docs beyond what we have in https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/k8s-buster-migration-notes
The new admin docs will be at https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Deploying_k8s

Event Timeline

aborrero triaged this task as Medium priority. Oct 21 2019, 2:53 PM
aborrero created this task.
aborrero moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.

Mentioned in SAL (#wikimedia-cloud) [2019-10-21T14:58:57Z] <arturo> deleting all toolsbeta-test-* VMs (master, worker, etcd, lb) T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-21T15:07:07Z] <arturo> create 3 VMs toolsbeta-test-k8s-etcd-{1,2,3} T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-21T15:13:26Z] <arturo> refresh config in prefix puppet toolsbeta-test-k8s-etcd to account for new servers T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T11:54:16Z] <arturo> created puppet prefix toolsbeta-test-k8s-haproxy and delete toolsbeta-test-k8s-lb T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T11:57:27Z] <arturo> created 2 new VMS toolsbeta-test-k8s-haproxy-{1,2} T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T12:14:08Z] <arturo> delete FQDN toolsbeta-k8s-master.toolsbeta.wmflabs.org T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T12:15:03Z] <arturo> refresh IP addr of FQDN k8s.toolsbeta.eqiad1.wikimedia.cloud T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T12:26:08Z] <arturo> created 3 VMs toolsbeta-test-k8s-control-{1,2,3} T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T12:27:42Z] <arturo> refreshed puppet prefix toolsbeta-test-k8s-control with latest info T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T15:30:33Z] <arturo> created puppet prefix toolsbeta-test-k8s-control and delete toolsbeta-test-k8s-master T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T15:48:41Z] <arturo> point DNS record k8s.toolsbeta.eqiad1.wikimedia.cloud to the first controller node for the bootstrap T236074

Mentioned in SAL (#wikimedia-cloud) [2019-10-22T17:43:15Z] <arturo> re-create VM toolsbeta-test-k8s-control-1 T236074

Change 545532 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: haproxy: don't use haproxy base module

https://gerrit.wikimedia.org/r/545532

Mentioned in SAL (#wikimedia-cloud) [2019-10-23T11:10:21Z] <arturo> re-create VM toolsbeta-test-k8s-haproxy-2 to test https://gerrit.wikimedia.org/r/545532 (T236074)

Change 545532 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: haproxy: don't use haproxy base module

https://gerrit.wikimedia.org/r/545532

Mentioned in SAL (#wikimedia-cloud) [2019-10-23T11:19:59Z] <arturo> re-create VM toolsbeta-test-k8s-haproxy-1 to use new puppet profile (T236074)

Change 545541 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: haproxy: tell the service to load all config files

https://gerrit.wikimedia.org/r/545541

Change 545541 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: haproxy: tell the service to load all config files

https://gerrit.wikimedia.org/r/545541

Mentioned in SAL (#wikimedia-cloud) [2019-10-23T11:56:37Z] <arturo> point FQDN k8s.toolsbeta.eqiad1.wikimedia.cloud to toolsbeta-test-k8s-haproxy-1 (T236074)

Mentioned in SAL (#wikimedia-cloud) [2019-10-23T12:04:07Z] <arturo> created 2 new VMs toolsbeta-test-k8s-worker-[1,2] T236074

Change 545587 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: ingress: fix serviceaccount names

https://gerrit.wikimedia.org/r/545587

Change 545587 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: ingress: fix serviceaccount names

https://gerrit.wikimedia.org/r/545587

Change 546145 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: ingress: fix binding reference names

https://gerrit.wikimedia.org/r/546145

Change 546145 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: ingress: fix binding reference names

https://gerrit.wikimedia.org/r/546145

Change 546439 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: delete kubeadm keyword from things not related to kubeadm

https://gerrit.wikimedia.org/r/546439

Change 546439 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: delete kubeadm keyword from things not related to kubeadm

https://gerrit.wikimedia.org/r/546439

I consider this done now!