
Deploy initial ATS test clusters in core DCs
Closed, Resolved · Public

Description

This is a Q1 goal:

  • Package and puppetize a deployable basic ATS config
  • Deploy 4-node test clusters running ATS in eqiad + codfw
  • Deploy code/configuration for basic application-layer request routing from these clusters (keeping in mind this will be a single shared backend cluster for all of the currently-separated Varnish clusters)
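As an illustration of what basic application-layer request routing from a single shared backend cluster could look like, here is a minimal Python sketch that picks an origin from the request's Host header. The hostnames, origin URLs, and the `route` function are hypothetical placeholders; they are not the ATS mapping rules deployed by the patches on this task.

```python
from typing import Optional

# Hypothetical routing table: Host header -> origin URL.
# These names are illustrative only, not the production mapping.
ROUTES = {
    "api.example.org": "https://api-backend.example.internal",
    "upload.example.org": "https://swift-backend.example.internal",
}

# Hypothetical catch-all origin for hosts without an explicit rule.
DEFAULT_ORIGIN = "https://appservers.example.internal"


def route(host_header: str, default: Optional[str] = DEFAULT_ORIGIN) -> Optional[str]:
    """Return the origin for a request, keyed on its Host header."""
    # Hostnames are case-insensitive and may carry a port; normalize both.
    # (Bracketed IPv6 literals are not handled in this sketch.)
    host = host_header.strip().lower().split(":", 1)[0]
    return ROUTES.get(host, default)
```

Keeping every mapping in one table mirrors the "single shared backend cluster" idea: one lookup serves hosts that are currently split across separate Varnish clusters.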

Not in scope:

  • Full request mangling matching current VCL
  • Solving scalability / design issues related to chashing and inter-node/cluster communication, etc.
  • Purging

Details

Project                             Branch      Lines +/-
operations/puppet                   production  +2 -2
operations/puppet                   production  +7 -6
operations/puppet                   production  +234 -3
operations/debs/trafficserver       master      +37 -0
operations/puppet                   production  +4 -0
operations/software/varnish/vhtcpd  debian      +6 -0
operations/software/varnish/vhtcpd  master      +7 -3
operations/puppet                   production  +2 -6
operations/puppet                   production  +2 -2
operations/puppet                   production  +31 -0
operations/puppet                   production  +49 -2
operations/puppet                   production  +1 -1
operations/puppet                   production  +19 -7
operations/puppet                   production  +4 -1
operations/puppet                   production  +203 -1
integration/config                  master      +1 -1
integration/config                  master      +7 -1
integration/config                  master      +8 -1
operations/puppet                   production  +5 -9
operations/puppet                   production  +34 -3
operations/puppet                   production  +2 -0
operations/puppet                   production  +0 -0
operations/puppet                   production  +3 -3
operations/puppet                   production  +7 -5
operations/puppet                   production  +14 -1

Event Timeline

Restricted Application added a subscriber: Aklapper.
herron triaged this task as Medium priority. Jul 18 2018, 6:39 PM

Change 447074 had a related patch set uploaded (by Ema; owner: Ema):
[operations/debs/trafficserver@master] Initial WMF packaging

https://gerrit.wikimedia.org/r/447074

Change 447077 had a related patch set uploaded (by Ema; owner: Ema):
[integration/config@master] Enable debian-glue for trafficserver

https://gerrit.wikimedia.org/r/447077

Change 451593 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] Deploy ATS backend on cp2003

https://gerrit.wikimedia.org/r/451593

Change 451593 merged by Ema:
[operations/puppet@production] Deploy ATS backend on cp2003

https://gerrit.wikimedia.org/r/451593

ema updated the task description.

Script wmf-auto-reimage was launched by ema on neodymium.eqiad.wmnet for hosts:

['cp2003.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201808090902_ema_26968.log.

Completed auto-reimage of hosts:

['cp2003.codfw.wmnet']

and were ALL successful.

Change 451620 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] Deploy ATS backends on remaining codfw test hosts

https://gerrit.wikimedia.org/r/451620

Change 451620 merged by Ema:
[operations/puppet@production] Deploy ATS backends on remaining codfw test hosts

https://gerrit.wikimedia.org/r/451620

Script wmf-auto-reimage was launched by ema on sarin.codfw.wmnet for hosts:

['cp2009.codfw.wmnet', 'cp2015.codfw.wmnet', 'cp2021.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201808091258_ema_5916.log.

Change 451623 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] Deploy ATS backends on eqiad test hosts

https://gerrit.wikimedia.org/r/451623

Change 451623 merged by Ema:
[operations/puppet@production] Deploy ATS backends on eqiad test hosts

https://gerrit.wikimedia.org/r/451623

Script wmf-auto-reimage was launched by ema on neodymium.eqiad.wmnet for hosts:

['cp1071.eqiad.wmnet', 'cp1072.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201808091308_ema_17037.log.

Change 451626 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] Move trafficserver::backend hiera settings from codfw to common

https://gerrit.wikimedia.org/r/451626

Change 451626 merged by Ema:
[operations/puppet@production] Move trafficserver::backend hiera settings from codfw to common

https://gerrit.wikimedia.org/r/451626

Completed auto-reimage of hosts:

['cp2015.codfw.wmnet', 'cp2009.codfw.wmnet']

Of which those FAILED:

['cp2015.codfw.wmnet', 'cp2009.codfw.wmnet']

Completed auto-reimage of hosts:

['cp1071.eqiad.wmnet', 'cp1072.eqiad.wmnet']

and were ALL successful.

Change 451628 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS clusters: set IPv6 addresses

https://gerrit.wikimedia.org/r/451628

Change 451628 merged by Ema:
[operations/puppet@production] ATS clusters: set IPv6 addresses

https://gerrit.wikimedia.org/r/451628

ema updated the task description.

Script wmf-auto-reimage was launched by ema on neodymium.eqiad.wmnet for hosts:

['cp1073.eqiad.wmnet', 'cp1074.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201808091412_ema_669.log.

Completed auto-reimage of hosts:

['cp1073.eqiad.wmnet', 'cp1074.eqiad.wmnet']

Of which those FAILED:

['cp1073.eqiad.wmnet', 'cp1074.eqiad.wmnet']

Change 451654 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: outbound TLS connections to the appservers

https://gerrit.wikimedia.org/r/451654

Change 451654 merged by Ema:
[operations/puppet@production] ATS: allow to specify outbound TLS connection settings

https://gerrit.wikimedia.org/r/451654

Change 451838 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: add Lua scripting support

https://gerrit.wikimedia.org/r/451838

Change 452351 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: move verify_config to ExecReload

https://gerrit.wikimedia.org/r/452351

Change 452351 merged by Ema:
[operations/puppet@production] ATS: move verify_config to ExecReload

https://gerrit.wikimedia.org/r/452351

Change 452612 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] contint: add support for testing ts-lua plugins

https://gerrit.wikimedia.org/r/452612

Change 452634 had a related patch set uploaded (by Ema; owner: Ema):
[integration/config@master] operations-puppet: install Busted for lua unit testing

https://gerrit.wikimedia.org/r/452634

Change 452634 merged by jenkins-bot:
[integration/config@master] operations-puppet: install Busted for lua unit testing

https://gerrit.wikimedia.org/r/452634

Change 452692 had a related patch set uploaded (by Ema; owner: Ema):
[integration/config@master] Use new operations-puppet image with Lua support

https://gerrit.wikimedia.org/r/452692

2018-08-14 16:01:08,650 [docker-pkg-build] INFO - Generated dockerfile for docker-registry.discovery.wmnet/releng/operations-puppet:0.3.3:
FROM docker-registry.discovery.wmnet/releng/ci-jessie:0.3.0

ENV LANG='en_US.UTF-8' LANGUAGE='en_US:en' LC_ALL='en_US.UTF-8'

ENV PUPPET_DIR='/srv/workspace/puppet'



RUN echo 'Acquire::http::Proxy "http://webproxy.eqiad.wmnet:8080";' > /etc/apt/apt.conf.d/80_proxy \
    && apt-get update  \
    && DEBIAN_FRONTEND=noninteractive \
    apt-get install  --yes build-essential bundler python-dev     python-pip rubygems-integration rake ruby ruby-dev ca-certificates libmysqlclient-dev     mtail isc-dhcp-server luarocks --no-install-recommends \
    && rm -f /etc/apt/apt.conf.d/80_proxy \
    && apt-get clean && rm -rf /var/lib/apt/lists/*  \
    && pip install pip==8.1.2 \
    && pip install tox==1.9.2 setuptools \
    && luarocks install busted \
    && install --owner=nobody --group=nogroup --directory /srv/workspace

USER nobody
RUN git clone https://gerrit.wikimedia.org/r/operations/puppet "${PUPPET_DIR}" \
    && cd "${PUPPET_DIR}" \
    && git tag -f 'docker-head' && git gc --prune=now \
    && TOX_TESTENV_PASSENV=PY_COLORS PY_COLORS=1 tox -v --notest \
    && bundle install --clean --path="${PUPPET_DIR}/.bundle"

WORKDIR /srv/workspace
ENTRYPOINT /bin/bash /run.sh

COPY bundle-config "${PUPPET_DIR}/.bundle/bundle-config"
COPY run.sh /run.sh (image.py:127)
2018-08-14 16:04:09,598 [docker-pkg-build] ERROR - Building image docker-registry.discovery.wmnet/releng/operations-puppet:0.3.3 failed - check your Dockerfile: The command '/bin/sh -c echo 'Acquire::http::Proxy "http://webproxy.eqiad.wmnet:8080";' > /etc/apt/apt.conf.d/80_proxy     && apt-get update      && DEBIAN_FRONTEND=noninteractive     apt-get install  --yes build-essential bundler python-dev     python-pip rubygems-integration rake ruby ruby-dev ca-certificates libmysqlclient-dev     mtail isc-dhcp-server luarocks --no-install-recommends     && rm -f /etc/apt/apt.conf.d/80_proxy     && apt-get clean && rm -rf /var/lib/apt/lists/*      && pip install pip==8.1.2     && pip install tox==1.9.2 setuptools     && luarocks install busted     && install --owner=nobody --group=nogroup --directory /srv/workspace' returned a non-zero code: 1 (image.py:228)
Traceback (most recent call last):
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker_pkg/image.py", line 224, in build
    super().build(self.build_path)
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker_pkg/image.py", line 141, in build
    buildargs=self.buildargs
  File "/srv/deployment/docker-pkg/venv/lib/python3.4/site-packages/docker/models/images.py", line 175, in build
    raise BuildError(chunk['error'])
docker.errors.BuildError: The command '/bin/sh -c echo 'Acquire::http::Proxy "http://webproxy.eqiad.wmnet:8080";' > /etc/apt/apt.conf.d/80_proxy     && apt-get update      && DEBIAN_FRONTEND=noninteractive     apt-get install  --yes build-essential bundler python-dev     python-pip rubygems-integration rake ruby ruby-dev ca-certificates libmysqlclient-dev     mtail isc-dhcp-server luarocks --no-install-recommends     && rm -f /etc/apt/apt.conf.d/80_proxy     && apt-get clean && rm -rf /var/lib/apt/lists/*      && pip install pip==8.1.2     && pip install tox==1.9.2 setuptools     && luarocks install busted     && install --owner=nobody --group=nogroup --directory /srv/workspace' returned a non-zero code: 1
2018-08-14 16:04:09,646 [docker-pkg-build] INFO - Removing build context /tmp/docker-pkg-operations-puppet9mhqhmvm (image.py:273)
reedy@contint1001:/tmp$

Thanks @Reedy! The luarocks part fails with:

Warning: Failed searching manifest: Failed extracting manifest file
Installing https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/busted-2.0.rc12-1.rockspec...
Using https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/busted-2.0.rc12-1.rockspec... switching to 'build' mode

Missing dependencies for busted:
penlight >= 1.3.2-2
lua-term >= 0.1-1
dkjson >= 2.1.0
lua_cliargs == 3.0-1
luasystem >= 0.2.0-0
say >= 1.3-0
luafilesystem >= 1.5.0
luassert >= 1.7.8-0
mediator_lua >= 1.1.1-0

Warning: Failed searching manifest: Failed extracting manifest file
Using https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/penlight-1.5.4-1.rockspec... switching to 'build' mode

Missing dependencies for penlight:
luafilesystem

Warning: Failed searching manifest: Failed extracting manifest file
Using https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/luafilesystem-1.7.0-2.src.rock... switching to 'build' mode

Error: Failed installing dependency: https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/penlight-1.5.4-1.rockspec - Failed installing dependency: https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/luafilesystem-1.7.0-2.src.rock - Failed unpacking rock file: /tmp/luarocks_luarocks-rock-luafilesystem-1.7.0-2-8485/luafilesystem-1.7.0-2.src.rock

A bit of research indicates this could be due to unzip not being installed on the system.

Yay, dependencies.

Feel free to bump the package again and add unzip, and I can try again.

Yay, dependencies.

Yeah. Note that the version of luarocks in stretch does depend on unzip; it's the jessie version that does not.

At least unzip isn't a heavyweight dependency :)
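The failure discussed above (luarocks cannot unpack a `.src.rock` when unzip is absent) could be caught before the build with a preflight check. The following is a minimal Python sketch; the `missing_tools` and `require_tools` helpers are hypothetical, and the tool names are only the ones mentioned in this thread.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def require_tools(tools: Iterable[str]) -> None:
    """Raise RuntimeError naming every tool that is not installed."""
    missing = missing_tools(tools)
    if missing:
        raise RuntimeError("missing required tools: " + ", ".join(missing))
```

Calling `require_tools(["luarocks", "unzip"])` before `luarocks install busted` would report the missing unzip directly instead of surfacing it as a "Failed unpacking rock file" error.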

Change 452714 had a related patch set uploaded (by Ema; owner: Ema):
[integration/config@master] operations-puppet: install unzip, required by luarocks

https://gerrit.wikimedia.org/r/452714

Change 452714 merged by jenkins-bot:
[integration/config@master] operations-puppet: install unzip, required by luarocks

https://gerrit.wikimedia.org/r/452714

Change 452692 merged by jenkins-bot:
[integration/config@master] Use new operations-puppet image with Lua support

https://gerrit.wikimedia.org/r/452692

Change 451838 merged by Ema:
[operations/puppet@production] ATS: add Lua scripting support

https://gerrit.wikimedia.org/r/451838

Change 452612 merged by Ema:
[operations/puppet@production] tox: add ts-lua tests for trafficserver

https://gerrit.wikimedia.org/r/452612

Change 452941 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: route traffic to api, RB, Swift, phab-aphlict

https://gerrit.wikimedia.org/r/452941

Change 452941 merged by Ema:
[operations/puppet@production] ATS: route traffic to api, RB, Swift, phab-aphlict

https://gerrit.wikimedia.org/r/452941

Change 453111 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: fix routing to Restbase

https://gerrit.wikimedia.org/r/453111

Change 453111 merged by Ema:
[operations/puppet@production] ATS: fix routing to Restbase

https://gerrit.wikimedia.org/r/453111

Change 453164 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: storage configuration

https://gerrit.wikimedia.org/r/453164

Change 453164 merged by Ema:
[operations/puppet@production] ATS: storage configuration

https://gerrit.wikimedia.org/r/453164

Change 453960 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: allow to specify caching rules

https://gerrit.wikimedia.org/r/453960

Change 453960 merged by Ema:
[operations/puppet@production] ATS: add caching rules support

https://gerrit.wikimedia.org/r/453960

Change 457500 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: fix people.w.o never-cache rule

https://gerrit.wikimedia.org/r/457500

Change 457500 merged by Ema:
[operations/puppet@production] ATS: fix people.w.o never-cache rule

https://gerrit.wikimedia.org/r/457500

Change 457913 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: pass hostname as an argument to default Lua scripts

https://gerrit.wikimedia.org/r/457913

Change 457913 merged by Ema:
[operations/puppet@production] ATS: pass hostname as an argument to default Lua scripts

https://gerrit.wikimedia.org/r/457913

Change 457927 had a related patch set uploaded (by Ema; owner: Ema):
[operations/software/varnish/vhtcpd@master] 0.1.2: Consider 404 responses as valid

https://gerrit.wikimedia.org/r/457927

Change 457927 merged by Ema:
[operations/software/varnish/vhtcpd@master] 0.1.2: Consider 404 responses as valid

https://gerrit.wikimedia.org/r/457927
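The change title says only that 404 responses are now considered valid. One plausible reading is that a purge answered with 404 (nothing cached for that URL) should not count as a failure. The Python sketch below illustrates that reading; it is not taken from the vhtcpd source, and the set of accepted codes other than 404 is an assumption.

```python
# Assumption: 200 means the object was purged; 404 is accepted per the
# change title above. Every other status is treated as a failure here.
VALID_PURGE_STATUSES = frozenset({200, 404})


def purge_succeeded(status_code: int) -> bool:
    """Return True if a purge response status should count as success."""
    return status_code in VALID_PURGE_STATUSES
```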

Change 457934 had a related patch set uploaded (by Ema; owner: Ema):
[operations/software/varnish/vhtcpd@debian] vhtcpd (0.1.2-1) stretch-wikimedia; urgency=medium

https://gerrit.wikimedia.org/r/457934

Change 457934 merged by Ema:
[operations/software/varnish/vhtcpd@debian] vhtcpd (0.1.2-1) stretch-wikimedia; urgency=medium

https://gerrit.wikimedia.org/r/457934

Mentioned in SAL (#wikimedia-operations) [2018-09-05T07:49:07Z] <ema> upload vhtcpd 0.1.2-1 to stretch-wikimedia T199720

Change 458168 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: status code 15 means successful exit

https://gerrit.wikimedia.org/r/458168

Change 458168 merged by Ema:
[operations/puppet@production] ATS: status code 15 means successful exit

https://gerrit.wikimedia.org/r/458168
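The change title states that exit status 15 should count as a successful exit; how the patch expresses that is not shown on this task. As a rough Python sketch of the same idea, with hypothetical helper names, a supervisor could accept an extra set of exit codes alongside 0:

```python
import subprocess
from typing import Iterable, Sequence


def exited_cleanly(returncode: int, extra_ok: Iterable[int] = (15,)) -> bool:
    """Return True for exit status 0 or any status listed in `extra_ok`."""
    # Note: Python's subprocess reports a process killed by SIGTERM as -15,
    # which is distinct from a process that exits with status 15.
    return returncode == 0 or returncode in set(extra_ok)


def run_and_check(argv: Sequence[str]) -> bool:
    """Run a command and report whether its exit status counts as success."""
    completed = subprocess.run(list(argv))
    return exited_cleanly(completed.returncode)
```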

Change 458195 had a related patch set uploaded (by Ema; owner: Ema):
[operations/debs/trafficserver@master] trafficserver (7.1.3+ds-4wm3) stretch-wikimedia; urgency=medium

https://gerrit.wikimedia.org/r/458195

Change 458195 merged by Ema:
[operations/debs/trafficserver@master] trafficserver (7.1.3+ds-4wm3) stretch-wikimedia; urgency=medium

https://gerrit.wikimedia.org/r/458195

Mentioned in SAL (#wikimedia-traffic) [2018-09-06T12:51:36Z] <ema> trafficserver 7.1.3+ds-4wm3 uploaded to stretch-wikimedia T199720

Change 458536 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: specify mapping rules for all text/upload backends

https://gerrit.wikimedia.org/r/458536

Change 458536 merged by Ema:
[operations/puppet@production] ATS: specify mapping rules for all text/upload backends

https://gerrit.wikimedia.org/r/458536

Change 458829 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: websockets support for etherpad/phab, kartotherian port

https://gerrit.wikimedia.org/r/458829

Change 458829 merged by Ema:
[operations/puppet@production] ATS: websockets support for etherpad/phab, kartotherian port

https://gerrit.wikimedia.org/r/458829

Change 459520 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS request routing: fix yarn, remove scolarships

https://gerrit.wikimedia.org/r/459520

Change 459520 merged by Ema:
[operations/puppet@production] ATS request routing: fix yarn and scholarships

https://gerrit.wikimedia.org/r/459520

ema updated the task description.

Request routing to all current applications added. Closing!