
recreate eqiad cluster state from code stored in deployment-charts with helmfile [MIGHT CAUSE DOWNTIME]
Open, High, Public

Description

If we want the cluster to be managed via code and helmfile, we will need to recreate its namespaces and deployments. The services and namespaces listed below will be affected.

The current plan is to do it all at once: delete the current namespaces, then apply helmfile cluster-wide. The checkboxes below are for double-checking each service after recreation.

  • blubberoid
  • citoid
  • cxserver
  • eventgate-analytics
  • eventgate-main
  • mathoid
  • sessionstore
  • termbox
  • zotero
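
As a rough illustration of the plan above, the following Python sketch only builds the command lines and the checklist; it does not run anything against a cluster. The helmfile path, the function names, and the choice of `kubectl get deployments` as the post-recreation check are assumptions made for this example, not the procedure actually used in deployment-charts.

```python
import shlex

# Namespaces taken from the checklist in this task.
SERVICES = [
    "blubberoid", "citoid", "cxserver", "eventgate-analytics",
    "eventgate-main", "mathoid", "sessionstore", "termbox", "zotero",
]


def recreation_plan(namespaces, helmfile_path="helmfile.yaml"):
    """Return the commands for the all-at-once plan, in order.

    helmfile_path is a hypothetical placeholder; the real location of the
    cluster-wide helmfile is not stated in this task.
    """
    # First delete every existing namespace...
    plan = [["kubectl", "delete", "namespace", ns] for ns in namespaces]
    # ...then apply the helmfile once for the whole cluster.
    plan.append(["helmfile", "--file", helmfile_path, "apply"])
    return plan


def verification_commands(namespaces):
    """Return one inspection command per namespace (assumed check)."""
    return [
        ["kubectl", "get", "deployments", "--namespace", ns]
        for ns in namespaces
    ]


def checklist(namespaces, checked=()):
    """Render the double-check list, marking already verified services."""
    done = set(checked)
    return [f"[{'x' if ns in done else ' '}] {ns}" for ns in namespaces]


if __name__ == "__main__":
    # Print the plan for review instead of executing it.
    for cmd in recreation_plan(SERVICES) + verification_commands(SERVICES):
        print(shlex.join(cmd))
    print("\n".join(checklist(SERVICES)))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere and lets the plan be reviewed before anything that might cause downtime is triggered.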

Event Timeline

fsero created this task. Wed, Jul 24, 8:21 AM
fsero updated the task description. Wed, Jul 24, 8:36 AM
fsero triaged this task as High priority. Thu, Jul 25, 9:25 AM

Change 529926 had a related patch set uploaded (by Fsero; owner: Fsero):
[operations/deployment-charts@master] Introducing podsecpolicies,calico and coredns in eqiad

https://gerrit.wikimedia.org/r/529926

Change 529926 merged by Fsero:
[operations/deployment-charts@master] Introducing podsecpolicies,calico and coredns in eqiad

https://gerrit.wikimedia.org/r/529926

Change 529927 had a related patch set uploaded (by Fsero; owner: Fsero):
[operations/puppet@production] caching,k8s: depool eqiad services exposed to cache for cluster recreation.

https://gerrit.wikimedia.org/r/529927

Mentioned in SAL (#wikimedia-operations) [2019-08-13T10:10:21Z] <fsero> creating tiller in kube-system for helmfile T228836

Mentioned in SAL (#wikimedia-operations) [2019-08-13T10:10:45Z] <fsero> initialize_cluster.sh kube-system kubemaster.svc.eqiad.wmnet 6443 - T228836

Change 529927 merged by Fsero:
[operations/puppet@production] caching,k8s: depool eqiad services exposed to cache for cluster recreation.

https://gerrit.wikimedia.org/r/529927

Mentioned in SAL (#wikimedia-operations) [2019-08-13T11:13:43Z] <fsero> recreating termbox namespace - T228836

Mentioned in SAL (#wikimedia-operations) [2019-08-13T11:21:48Z] <fsero> recreating citoid eventgate-analytics eventgate-main mathoid namespace - T228836

Mentioned in SAL (#wikimedia-operations) [2019-08-13T11:44:21Z] <fsero> recreating cxserver blubber and sessionstore namespace - T228836