
create codfw-equivalent of bromine, make webserver_misc_static active/active in misc varnish
Closed, Resolved · Public

Description

  • create a Ganeti VM in codfw that is equivalent to bromine.eqiad.wmnet (subtask with VM request and install here)
  • install stretch on it, apply the "misc_static" webserver Puppet role, and confirm the role works on stretch
  • temporarily switch the Varnish backend over to codfw and confirm things still work
  • reinstall bromine (currently jessie) with stretch
  • make the service active/active in Varnish so it is served from both DCs

This means the following sites will also be served from codfw and can stay up if we fail over from eqiad or if the eqiad VM breaks, and vice versa. The eqiad VM is no longer a single point of failure for:

  1. https://annual.wikimedia.org , https://15.wikipedia.org
  2. https://static-bugzilla.wikimedia.org
  3. https://transparency.wikimedia.org , https://transparency-private.wikimedia.org
  4. https://wikiba.se (planned)
  5. https://research.wikimedia.org (new)
  6. https://design.wikimedia.org (coming soon)
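
The active/active behaviour described above can be illustrated with a minimal Python sketch. It is not the actual Varnish or Puppet configuration, which this task does not show. The `BACKENDS` table, the `pick_backend` helper, and the "prefer the local DC, otherwise use any healthy backend" policy are assumptions made for illustration only.

```python
from typing import Dict, Optional

# Hypothetical backend table: one misc_static web server per data center.
# Host names come from this task; the mapping itself is an assumption.
BACKENDS: Dict[str, str] = {
    "eqiad": "bromine.eqiad.wmnet",
    "codfw": "vega.codfw.wmnet",
}


def pick_backend(local_dc: str, healthy: Dict[str, bool]) -> Optional[str]:
    """Return the backend a cache in `local_dc` would use.

    Assumed policy: prefer the backend in the cache's own DC and
    fall back to any other healthy backend if the local one is down.
    """
    if local_dc in BACKENDS and healthy.get(local_dc, False):
        return BACKENDS[local_dc]
    for dc, host in BACKENDS.items():
        if healthy.get(dc, False):
            return host
    return None  # no healthy backend left: the sites would be down
```

With both backends healthy, `pick_backend("codfw", {"eqiad": True, "codfw": True})` returns the codfw host. If codfw is marked unhealthy, the same call returns the eqiad host, which is the failover property the description is after.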

Event Timeline

Dzahn created this task. Feb 24 2018, 2:39 AM
Restricted Application added a subscriber: Aklapper. Feb 24 2018, 2:39 AM

Change 420082 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] introduce vega.codfw.wmnet (bromine equivalent)

https://gerrit.wikimedia.org/r/420082

Change 420082 merged by Dzahn:
[operations/dns@master] introduce vega.codfw.wmnet (bromine equivalent)

https://gerrit.wikimedia.org/r/420082

Dzahn renamed this task from create codfw-equivalent of bromine and make webserver_misc_static active/active in misc varnish to create codfw-equivalent of bromine, make webserver_misc_static active/active in misc varnish. Mar 16 2018, 10:59 PM
Dzahn triaged this task as Normal priority.
Dzahn updated the task description.
Dzahn updated the task description.

Change 420132 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] DHCP/netboot: add vega.codfw.wmnet

https://gerrit.wikimedia.org/r/420132

Change 420132 merged by Dzahn:
[operations/puppet@production] DHCP/netboot: add vega.codfw.wmnet

https://gerrit.wikimedia.org/r/420132

Mentioned in SAL (#wikimedia-operations) [2018-03-16T23:32:14Z] <mutante> signing puppet cert for vega.codfw.wmnet, initial puppet run after fresh stretch install (T188163)

Change 420134 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site/webserver_misc_static: add vega as codfw node

https://gerrit.wikimedia.org/r/420134

Change 420134 merged by Dzahn:
[operations/puppet@production] site/webserver_misc_static: add vega as codfw node

https://gerrit.wikimedia.org/r/420134

Dzahn updated the task description. Mar 16 2018, 11:49 PM
Dzahn updated the task description.

Change 420137 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] misc:varnish: rename bromine director, clean up unused design director

https://gerrit.wikimedia.org/r/420137

Change 420137 merged by Dzahn:
[operations/puppet@production] misc:varnish: rename bromine director, clean up unused design director

https://gerrit.wikimedia.org/r/420137

Mentioned in SAL (#wikimedia-operations) [2018-03-17T00:13:19Z] <mutante> running puppet on all cache::misc to rename director bromine to webserver_misc_static (T188163)

Krinkle updated the task description. Mar 17 2018, 12:35 AM
Krinkle added a project: Availability.
Krinkle moved this task from Backlog to Doing on the Availability board.

Change 420142 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] cache::misc: switch webserver_misc_static to codfw backend

https://gerrit.wikimedia.org/r/420142

Change 423059 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] add IPv6 records for vega.codfw.wmnet

https://gerrit.wikimedia.org/r/423059

Change 423059 merged by Dzahn:
[operations/dns@master] add IPv6 records for vega.codfw.wmnet

https://gerrit.wikimedia.org/r/423059

Change 423079 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/dns@master] add IPv6 records for bromine.eqiad.wmnet

https://gerrit.wikimedia.org/r/423079

Change 423079 merged by Dzahn:
[operations/dns@master] add IPv6 records for bromine.eqiad.wmnet

https://gerrit.wikimedia.org/r/423079

Change 423080 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] cache::misc: add codfw backend for webserver_misc_static

https://gerrit.wikimedia.org/r/423080

Change 423080 merged by Dzahn:
[operations/puppet@production] cache::misc: add codfw backend for webserver_misc_static

https://gerrit.wikimedia.org/r/423080

Dzahn updated the task description. Apr 2 2018, 10:39 PM
  • tested codfw backend with apache-fast-test
  • added codfw backend to make service active/active
  • have not reinstalled bromine yet, but to switch to codfw I would have had to add both backends first anyway
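
The backend test mentioned above can be approximated with a short Python sketch. It is not the apache-fast-test tool and does not reproduce its interface. It only shows the general idea of sending a request straight to one backend while presenting a public site name in the `Host` header. The helper names `backend_request` and `check_backend`, and the use of plain HTTP on the default port, are assumptions.

```python
import urllib.request


def backend_request(backend_host: str, site: str, path: str = "/") -> urllib.request.Request:
    """Build a request aimed at one backend, with the public site as Host.

    Assumption: the backend answers plain HTTP on the default port.
    """
    url = f"http://{backend_host}{path}"
    return urllib.request.Request(url, headers={"Host": site})


def check_backend(backend_host: str, site: str, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code.

    urllib raises HTTPError for 4xx/5xx responses, so a return value
    here means the backend answered without an HTTP error status.
    """
    req = backend_request(backend_host, site)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

For example, `check_backend("vega.codfw.wmnet", "annual.wikimedia.org")` would ask the codfw VM for a site listed in the description. This only works from a host that can reach the internal network.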

Change 424657 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] rsync bugzilla-static content from bromine to vega

https://gerrit.wikimedia.org/r/424657

Change 424657 merged by Dzahn:
[operations/puppet@production] rsync bugzilla-static content from bromine to vega

https://gerrit.wikimedia.org/r/424657

Change 424716 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] DHCP: upgrade bromine from jessie to stretch

https://gerrit.wikimedia.org/r/424716

Change 424716 merged by Dzahn:
[operations/puppet@production] DHCP: upgrade bromine from jessie to stretch

https://gerrit.wikimedia.org/r/424716

Mentioned in SAL (#wikimedia-operations) [2018-04-07T00:14:55Z] <mutante> bromine - scheduled downtime, reboot for reinstall, upgrade to stretch, misc_static_services switched to codfw (T188163)

Change 424727 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] static_bugzilla: reverse rsync direction after bromine reinstall

https://gerrit.wikimedia.org/r/424727

Change 424727 merged by Dzahn:
[operations/puppet@production] static_bugzilla: reverse rsync direction after bromine reinstall

https://gerrit.wikimedia.org/r/424727

Change 424731 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] Revert "misc_static_sites: temp disable bromine backend for reinstall"

https://gerrit.wikimedia.org/r/424731

Change 424731 merged by Dzahn:
[operations/puppet@production] Revert "misc_static_sites: temp disable bromine backend for reinstall"

https://gerrit.wikimedia.org/r/424731

Dzahn closed this task as Resolved. Apr 7 2018, 12:47 AM
Dzahn updated the task description.

All done!

  • we are active-active
  • both eqiad and codfw are on stretch

Change 420142 abandoned by Dzahn:
cache::misc: switch webserver_misc_static to codfw backend

Reason:
duplicate. already done elsewhere. ticket is resolved

https://gerrit.wikimedia.org/r/420142