chasemp (Chase) Administrator
Lead Operations Engineer (Wikimedia Cloud Services)

Projects (27)

User Details

User Since
Sep 16 2014, 11:39 AM (136 w, 3 d)
Roles
Administrator
Availability
Available
IRC Nick
chasemp
LDAP User
Rush
MediaWiki User
CPettet (WMF)

Recent Activity

Yesterday

chasemp claimed T164123: tools-k8s-master-01 has two floating IPs.
Fri, Apr 28, 10:19 PM · Operations, Labs
chasemp created T164123: tools-k8s-master-01 has two floating IPs.
Fri, Apr 28, 10:19 PM · Operations, Labs
chasemp updated subscribers of T164103: Generate labsdb views for dtywiki, pawikisource, ptwikimedia, wbwikimedia.

Friendly ping for @Bawolff and @dpatrick. I'm not sure whether these are already approved somewhere else in this capacity, but I'm trying to err on the side of prudence since I have not been able to locate an approval.

Fri, Apr 28, 8:18 PM · Security, Labs
chasemp added a project to T164103: Generate labsdb views for dtywiki, pawikisource, ptwikimedia, wbwikimedia: Security.

Hey Security folks, can we get a sign-off on creating the normal views for these four wikis on the labs DB replicas?

Fri, Apr 28, 6:59 PM · Security, Labs
chasemp added a comment to T162945: The future of service groups and service users on Labs.

All the needs I have had for this feature are in relation to toolsbeta or working on the Tools environment directly fwiw. So +1 from me.

Fri, Apr 28, 6:31 PM · MediaWiki-extensions-OpenStackManager, Tool-Labs, Labs

Thu, Apr 27

chasemp closed T163390: Update documentation for Tools Proxy failover as "Resolved".

thanks, calling it good for now

Thu, Apr 27, 9:57 PM · Operations, Labs
chasemp reassigned T160611: Make "linter" table available on Labs from chasemp to Andrew.

thanks @Andrew

Thu, Apr 27, 9:56 PM · Patch-For-Review, DBA, Labs, MediaWiki-extensions-Linter
chasemp claimed T160611: Make "linter" table available on Labs.

Approved by security in https://phabricator.wikimedia.org/T148583#2854927

Thu, Apr 27, 5:38 PM · Patch-For-Review, DBA, Labs, MediaWiki-extensions-Linter

Wed, Apr 26

chasemp added a comment to T163823: During labservices1001 failover fqdn changed from foo.project.eqiad.wmflabs to foo.eqiad.wmflabs.

@Andrew, looking at this from another angle: let's say we don't know the conditions that caused all FQDNs to suddenly omit the project. Can we narrow it down to a potential metadata service issue and force that? Do we then see the same behavior, where files are populated with the truncated FQDN?

Wed, Apr 26, 6:38 PM · Operations, Labs

Tue, Apr 25

chasemp added a comment to T161327: bootstrap_vz: Move firstboot.sh out of the base image?.

I made two rough pitches for this during our convo:

Tue, Apr 25, 10:36 PM · Labs-Infrastructure, Labs
chasemp edited the description of T163823: During labservices1001 failover fqdn changed from foo.project.eqiad.wmflabs to foo.eqiad.wmflabs.
Tue, Apr 25, 8:49 PM · Operations, Labs
chasemp added a comment to T163823: During labservices1001 failover fqdn changed from foo.project.eqiad.wmflabs to foo.eqiad.wmflabs.

I see a few requested certs for the foo.eqiad.wmflabs pattern on the Tools puppet master:

Tue, Apr 25, 8:43 PM · Operations, Labs
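
A minimal sketch of how certificate names could be sorted into the full foo.project.eqiad.wmflabs form and the truncated foo.eqiad.wmflabs form described in T163823. The regular expressions, function names, and sample names are assumptions made for illustration; they are not taken from the Tools puppet master.

```python
import re

# Assumption: a healthy instance name looks like <host>.<project>.eqiad.wmflabs
# and a truncated one like <host>.eqiad.wmflabs. Sample names are hypothetical.
FULL = re.compile(r"^[a-z0-9-]+\.[a-z0-9-]+\.eqiad\.wmflabs$")
TRUNCATED = re.compile(r"^[a-z0-9-]+\.eqiad\.wmflabs$")


def classify(fqdn: str) -> str:
    """Return 'full', 'truncated', or 'other' for a certificate name."""
    name = fqdn.strip().lower()
    if FULL.match(name):
        return "full"
    if TRUNCATED.match(name):
        return "truncated"
    return "other"


def truncated_names(cert_names):
    """Keep only the names that are missing the project label."""
    return [name for name in cert_names if classify(name) == "truncated"]


if __name__ == "__main__":
    sample = ["foo.project.eqiad.wmflabs", "foo.eqiad.wmflabs"]
    print(truncated_names(sample))
```

Feeding it the list of requested certificate names, however that list is obtained, would leave only the ones that omit the project.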
chasemp created T163823: During labservices1001 failover fqdn changed from foo.project.eqiad.wmflabs to foo.eqiad.wmflabs.
Tue, Apr 25, 8:39 PM · Operations, Labs
chasemp created P5330 (An Untitled Masterwork).
Tue, Apr 25, 8:15 PM
chasemp triaged T163796: Audit disk usage on labvirts as "High" priority.
Tue, Apr 25, 3:55 PM · Labs-Infrastructure, Labs
chasemp added a comment to T161899: Investigate ceasing self-service new Trusty instance creation in Labs.

FWIW, the second is what I intended; a better title here would be "Investigate ceasing self-service new Trusty instance creation in Labs". That's on me, I thought that was clearer.

Tue, Apr 25, 2:44 PM · Operations, Labs
chasemp renamed T161899: Investigate ceasing self-service new Trusty instance creation in Labs from "Investigate ceasing new Trusty instance creation in Labs" to "Investigate ceasing self-service new Trusty instance creation in Labs".
Tue, Apr 25, 2:43 PM · Operations, Labs
chasemp added a comment to T161899: Investigate ceasing self-service new Trusty instance creation in Labs.

FWIW, the second is what I intended; a better title here would be "Investigate ceasing self-service new Trusty instance creation in Labs". That's on me, I thought that was clearer.

Tue, Apr 25, 2:06 PM · Operations, Labs
chasemp added a comment to T161899: Investigate ceasing self-service new Trusty instance creation in Labs.

As soon as we disable Trusty we'll also be violating 'cattle, not pets' for most of our users. It will mean that anytime they need to recreate an instance they will also have to learn how to configure a new OS and adapt their work to run there.

We can't even create an SGE node on Jessie yet, so this kind of move seems highly premature.

Tue, Apr 25, 1:52 PM · Operations, Labs

Mon, Apr 24

chasemp added a comment to T161473: Stop requiring two-factor authentication for horizon.wikimedia.org.

We have always required some level of administrators to use 2fa, and have defaulted to it in the Phabricator admin case and a few others. I'm not sure if @csteipp will have time to weigh in but my recollection of the sentiment is this: these are privileged level accounts and there is no undo or revert for the actions of project administrators. These types of accounts should require 2fa by default and be reasoned backwards as the exception.

Mon, Apr 24, 12:49 PM · Labs, Horizon

Fri, Apr 21

chasemp triaged T163611: Temporary Tool Labs projectadmin right for Tgr as "Normal" priority.
Fri, Apr 21, 10:35 PM · User-bd808, Labs, Tool-Labs
chasemp added a comment to T163611: Temporary Tool Labs projectadmin right for Tgr.

+1 thanks @Tgr for being a great citizen

Fri, Apr 21, 10:27 PM · User-bd808, Labs, Tool-Labs
chasemp updated subscribers of T161898: IO issues for Tools instances flapping with iowait and puppet failure.

I think we should create 10 tools-webgrid-lighttpd-14* instances to make up for the 20 lost precise ones and see how jobs and load shift. Making 2 tools-webgrid-generic instances seems wise as well.

Fri, Apr 21, 4:16 PM · Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

404 https://tools.wmflabs.org/chie-bot/

Fri, Apr 21, 4:06 PM · Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

Is this meant to be up https://tools.wmflabs.org/hazard-bot/ ?

Fri, Apr 21, 3:59 PM · Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

Puppet errors (previously Puppet run) alerts

Fri, Apr 21, 3:45 PM · Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

I think we should create 10 tools-webgrid-lighttpd-14* instances to make up for the 20 lost precise ones and see how jobs and load shift. Making 2 tools-webgrid-generic instances seems wise as well.

Fri, Apr 21, 3:37 PM · Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

iowait errors over the past few weeks:

Fri, Apr 21, 3:36 PM · Labs
chasemp added a comment to T163192: s51053 is running unnecessarily long running queries on revision.

Just a note for reference, we do have reconciliation logic that looks at ldap and creates users but it only looks at whether the account exists natively and does not try to compare password :)

Fri, Apr 21, 2:07 PM · Labs, Tool-Labs
chasemp updated subscribers of T163192: s51053 is running unnecessarily long running queries on revision.

@chasemp @bd808 I got no answer from the user in a week's time. Fearing that the account may be unmaintained, and given the state of labsdb1001 regarding memory usage (crashing very frequently), I am going to disable the user completely so I can increase the limits on queries for the rest of the users. The queries being executed are not finishing anyway because of how long they are, so I am not breaking anything that wasn't already broken, only keeping it from taking precious resources from the other users.

Fri, Apr 21, 2:02 PM · Labs, Tool-Labs

Thu, Apr 20

chasemp added a comment to T153099: Initial OpenStack Neutron PoC deployment in Labtest.

some key points I have taken ss of:

Thu, Apr 20, 7:05 PM · Labs, Operations
chasemp reassigned T163390: Update documentation for Tools Proxy failover from chasemp to Andrew.

https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource%3ATools%2FAdmin&type=revision&diff=1757036&oldid=1756658

Thu, Apr 20, 6:04 PM · Operations, Labs
chasemp added a comment to T163390: Update documentation for Tools Proxy failover.

first pass https://wikitech.wikimedia.org/w/index.php?title=Nova_Resource%3ATools%2FAdmin&type=revision&diff=1757014&oldid=1756658

Thu, Apr 20, 3:43 PM · Operations, Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

I see one maintainer for cobot and I can't find a Phab account for them https://wikitech.wikimedia.org/wiki/Shell_Request/MistrX

Thu, Apr 20, 1:27 PM · Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

!log tools.cobot set crons to schedule on tools-exec-1422 only for testing. This tool is launching jobs that write and then read hundreds of megs in a few minutes. I caught it tripping up puppet on tools-exec-1437

Thu, Apr 20, 1:12 PM · Labs
chasemp updated subscribers of T163439: wikitech logging constant errors from /MemcachedPeclBagOStuff.php.

@bd808 and @Andrew I think possibly you guys were doing something related here recently?

Thu, Apr 20, 12:40 PM · Operations, Labs
chasemp triaged T163439: wikitech logging constant errors from /MemcachedPeclBagOStuff.php as "Normal" priority.
Thu, Apr 20, 12:39 PM · Operations, Labs
chasemp created T163439: wikitech logging constant errors from /MemcachedPeclBagOStuff.php.
Thu, Apr 20, 12:38 PM · Operations, Labs

Wed, Apr 19

chasemp updated subscribers of T148506: Rack and setup new eqiad row D switch stack (EX4300/QFX5100).

FYI @Andrew labservices1001 will be caught up in this as it lives in D3. Previously we had some issues with that host being offline where labservices1002 was not standing in as expected IIRC. It seems like the outage is expected to be brief :) but I created T163402 to sort it out before the 26th because you never know.

Wed, Apr 19, 11:55 PM · Patch-For-Review, Operations, ops-eqiad, netops
chasemp edited the description of T163402: Ensure we can survive a loss of labservices1001.
Wed, Apr 19, 11:53 PM · Patch-For-Review, Operations, Labs
chasemp triaged T163402: Ensure we can survive a loss of labservices1001 as "High" priority.
Wed, Apr 19, 11:53 PM · Patch-For-Review, Operations, Labs
chasemp created T163402: Ensure we can survive a loss of labservices1001.
Wed, Apr 19, 11:53 PM · Patch-For-Review, Operations, Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

Are you sure you used the IO amount to get your report? I did a fix in phetools, but I didn't get why I was mentioned in this report; it's unclear if it was worth it or not.

Wed, Apr 19, 11:45 PM · Labs
chasemp edited the description of T163390: Update documentation for Tools Proxy failover.
Wed, Apr 19, 11:02 PM · Operations, Labs
chasemp triaged T163393: Determine appropriate proxy_read_timeout setting for Tools Proxy as "Normal" priority.
Wed, Apr 19, 10:22 PM · Operations, Labs
chasemp created T163393: Determine appropriate proxy_read_timeout setting for Tools Proxy.
Wed, Apr 19, 10:22 PM · Operations, Labs
chasemp edited the description of T163390: Update documentation for Tools Proxy failover.
Wed, Apr 19, 10:16 PM · Operations, Labs
chasemp edited the description of T163390: Update documentation for Tools Proxy failover.
Wed, Apr 19, 10:16 PM · Operations, Labs
chasemp edited the description of T163390: Update documentation for Tools Proxy failover.
Wed, Apr 19, 10:15 PM · Operations, Labs
chasemp triaged T163391: Ensure kubelet is stopped on Tools Proxy hosts as "High" priority.
Wed, Apr 19, 10:12 PM · Patch-For-Review, Operations, Labs
chasemp created T163391: Ensure kubelet is stopped on Tools Proxy hosts.
Wed, Apr 19, 10:05 PM · Patch-For-Review, Operations, Labs
chasemp triaged T163390: Update documentation for Tools Proxy failover as "Normal" priority.
Wed, Apr 19, 10:02 PM · Operations, Labs
chasemp created T163390: Update documentation for Tools Proxy failover.
Wed, Apr 19, 10:02 PM · Operations, Labs
chasemp added a comment to P5288 (An Untitled Masterwork).
Wed Apr 19 21:55:19 UTC 2017
root@tools-proxy-02:~# df -Th
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs   10M     0   10M   0% /dev
tmpfs          tmpfs     792M  104M  688M  14% /run
/dev/vda3      ext4       19G  3.5G   15G  20% /
tmpfs          tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs          tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
root@tools-proxy-02:~# cd /var/log
root@tools-proxy-02:/var/log# du -sh
587M	.
Wed, Apr 19, 9:55 PM
chasemp added a comment to P5288 (An Untitled Masterwork).

Wed Apr 19 19:19:10 UTC 2017
root@tools-proxy-02:/var/log# df -Th
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs   10M     0   10M   0% /dev
tmpfs          tmpfs     792M  104M  688M  14% /run
/dev/vda3      ext4       19G  3.4G   15G  19% /
tmpfs          tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs          tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
root@tools-proxy-02:/var/log# du -sh
474M	.

Wed, Apr 19, 7:19 PM
chasemp added a comment to P5288 (An Untitled Masterwork).
Wed Apr 19 16:43:35 UTC 2017
root@tools-proxy-02:/var/log# df -Th
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs   10M     0   10M   0% /dev
tmpfs          tmpfs     792M  104M  688M  14% /run
/dev/vda3      ext4       19G  3.3G   15G  19% /
tmpfs          tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs          tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
root@tools-proxy-02:/var/log# du -sh
355M	.
Wed, Apr 19, 4:43 PM
chasemp added a comment to T163208: wsexport tool writing output to $HOME/tool/temp puts load on Tool Labs NFS server.

chasemp_freenode_#wikimedia-labs_20170418.log

1 tools-exec-1430
1 tools-exec-1437
1 tools-exec-1439
1 tools-exec-1442
2 tools-exec-1435
3 tools-exec-1434
3 tools-exec-1441
4 tools-exec-1432
5 tools-exec-1436
6 tools-exec-1433

chasemp_freenode_#wikimedia-labs_20170419.log

1 tools-exec-1435
1 tools-exec-1437
1 tools-exec-1439
1 tools-exec-1442
2 tools-exec-1430
2 tools-exec-1441
3 tools-exec-1432
Wed, Apr 19, 4:40 PM · Tool-Labs-tools-Other, Labs
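
The per-host tallies above look like the result of counting how often each tools-exec host is named in a day's IRC log. A rough sketch of that kind of count follows, assuming each mention of a host name is counted once; the host-name pattern and the function name are illustrative assumptions, and the real log format is not shown in this feed.

```python
import re
from collections import Counter

# Assumption: hosts appear in the log as "tools-exec-" followed by digits.
HOST = re.compile(r"\btools-exec-\d+\b")


def count_hosts(lines):
    """Count host mentions and return (host, count) pairs, smallest count first."""
    counts = Counter()
    for line in lines:
        counts.update(HOST.findall(line))
    # Sort by count, then host name, similar to `sort | uniq -c | sort -n`.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical log lines, not copied from the real logs.
    demo = [
        "alert on tools-exec-1433",
        "alert on tools-exec-1433",
        "alert on tools-exec-1430",
    ]
    for host, count in count_hosts(demo):
        print(count, host)
```

Running it over each day's log separately would give one tally per file, as in the two lists above.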
chasemp added a comment to P5288 (An Untitled Masterwork).
Wed Apr 19 16:18:56 UTC 2017
root@tools-proxy-02:/var/log/nginx# df -Th
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs   10M     0   10M   0% /dev
tmpfs          tmpfs     792M  104M  688M  14% /run
/dev/vda3      ext4       19G  3.2G   15G  19% /
tmpfs          tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs          tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
root@tools-proxy-02:/var/log/nginx# cd ..
root@tools-proxy-02:/var/log# du -sh
336M	.
Wed, Apr 19, 4:19 PM
chasemp added a comment to P5288 (An Untitled Masterwork).
Wed Apr 19 16:07:35 UTC 2017
root@tools-proxy-02:/var/log# df -Th
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs   10M     0   10M   0% /dev
tmpfs          tmpfs     792M  104M  688M  14% /run
/dev/vda3      ext4       19G  3.2G   15G  19% /
tmpfs          tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs          tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
root@tools-proxy-02:/var/log# du -sh
328M	.
Wed, Apr 19, 4:08 PM
chasemp created P5289 (An Untitled Masterwork).
Wed, Apr 19, 3:49 PM
chasemp added a comment to P5288 (An Untitled Masterwork).
root@tools-proxy-02:/etc/nginx# date
Wed Apr 19 15:37:49 UTC 2017
root@tools-proxy-02:/etc/nginx# df -Th
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs   10M     0   10M   0% /dev
tmpfs          tmpfs     792M  104M  688M  14% /run
/dev/vda3      ext4       19G  3.2G   15G  18% /
tmpfs          tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs          tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
root@tools-proxy-02:/etc/nginx# du -sh
152K	.
Wed, Apr 19, 3:38 PM
chasemp added a comment to P5288 (An Untitled Masterwork).

Wed Apr 19 15:27:54 UTC 2017
root@tools-proxy-02:/var/log# df -Th
Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs   10M     0   10M   0% /dev
tmpfs          tmpfs     792M  106M  687M  14% /run
/dev/vda3      ext4       19G  3.2G   15G  18% /
tmpfs          tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs          tmpfs     5.0M     0  5.0M   0% /run/lock
tmpfs          tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
root@tools-proxy-02:/var/log# du -sh
286M	.

Wed, Apr 19, 3:28 PM
chasemp created P5288 (An Untitled Masterwork).
Wed, Apr 19, 3:19 PM
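
The P5288 snapshots in this feed record the size of /var/log on tools-proxy-02 at six points during Apr 19. A small sketch of how the average growth rate could be worked out from them is shown here. The timestamps and sizes are copied from the snapshots (the 15:37 one is left out because it measured /etc/nginx); the function name and the choice of a simple first-to-last average are assumptions.

```python
from datetime import datetime

# (UTC timestamp, size of /var/log in MiB as reported by `du -sh`),
# copied from the P5288 comments in this feed.
SNAPSHOTS = [
    ("2017-04-19 15:27:54", 286),
    ("2017-04-19 16:07:35", 328),
    ("2017-04-19 16:18:56", 336),
    ("2017-04-19 16:43:35", 355),
    ("2017-04-19 19:19:10", 474),
    ("2017-04-19 21:55:19", 587),
]


def growth_rate_mib_per_hour(snapshots):
    """Average growth between the earliest and latest snapshot, in MiB/hour."""
    fmt = "%Y-%m-%d %H:%M:%S"
    ordered = sorted((datetime.strptime(ts, fmt), size) for ts, size in snapshots)
    (start, first_size), (end, last_size) = ordered[0], ordered[-1]
    hours = (end - start).total_seconds() / 3600
    return (last_size - first_size) / hours


if __name__ == "__main__":
    print(f"{growth_rate_mib_per_hour(SNAPSHOTS):.1f} MiB/hour")
```

Over those roughly six and a half hours the directory grew by 301 MiB, which works out to a little under 47 MiB per hour.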
chasemp added a comment to P5287 (An Untitled Masterwork).
Wed, Apr 19, 3:06 PM
chasemp triaged T163336: kube-proxy pulls in docker and starts service even when it isnt needed as "Normal" priority.
Wed, Apr 19, 3:06 PM · Operations, Labs
chasemp created T163336: kube-proxy pulls in docker and starts service even when it isnt needed.
Wed, Apr 19, 3:05 PM · Operations, Labs
chasemp created P5287 (An Untitled Masterwork).
Wed, Apr 19, 2:57 PM

Tue, Apr 18

chasemp added a comment to T142166: Create a new labs flavor available to all project: largedisk.

I think I oppose this being available by default to all projects as we cannot quota disk space usage effectively. I have no problem with it being available as part of a quota increase request.

Tue, Apr 18, 7:24 PM · Labs, Labs-Infrastructure
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

I filed https://github.com/wsexport/tool/issues/127 with the wsexport tool.

Tue, Apr 18, 1:54 PM · Labs

Mon, Apr 17

chasemp added a comment to T162629: Admin request for user paladox and Luke081515 in the project shinken.

@chasemp Hi, any update on this please? I would like to start testing https://gerrit.wikimedia.org/r/#/c/347640/ :)

Mon, Apr 17, 8:37 PM · Shinken, Monitoring, Labs
chasemp added a comment to P5283 (An Untitled Masterwork).

root@tools-docker-builder-04:~# cd /srv/images/toollabs
root@tools-docker-builder-04:/srv/images/toollabs# ./build.py base
Building following images: ['base', 'nodejs/base', 'python2/base', 'php/base', 'tcl/base', 'static-web', 'golang/base', 'ruby/base', 'python/base', 'jdk8/base', 'nodejs/web', 'python2/web', 'php/web', 'tcl/web', 'golang/web', 'ruby/web', 'python/web', 'jdk8/web']
Sending build context to Docker daemon 6.144 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/wikimedia-jessie
---> 35690d23731d
Step 2 : ADD tools.list /etc/apt/sources.list.d/tools.list
---> Using cache
---> 9f4f2320b9c2
Step 3 : RUN apt-get update
---> Using cache
---> b184659e1c24
Step 4 : RUN DEBIAN_FRONTEND=noninteractive apt-get install --yes --no-install-recommends libnss-ldapd locales
---> Using cache
---> 8a6fb358170d
Step 5 : ADD nsswitch.conf /etc/nsswitch.conf
---> Using cache
---> 8af7a8625ca8
Step 6 : RUN sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen && dpkg-reconfigure --frontend=noninteractive locales && update-locale LANG=en_US.UTF-8
---> Using cache
---> 59325f551408
Step 7 : ENV LC_ALL en_US.UTF-8
---> Using cache
---> 1dd0142143a4
Step 8 : RUN apt-get install --yes --no-install-recommends git less nano vim emacs curl sed gawk jq
---> Using cache
---> 9426481891d9
Successfully built 9426481891d9
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes npm nodejs-legacy
---> Using cache
---> 47e2224c0b4e
Successfully built 47e2224c0b4e
Sending build context to Docker daemon 3.584 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes python python-dev build-essential python-virtualenv
---> Using cache
---> dae57dccccc1
Step 3 : RUN apt-get install --yes libxml2-dev libxslt-dev zlib1g-dev
---> Using cache
---> 642740396db4
Step 4 : RUN apt-get install --yes libmysqlclient-dev
---> Using cache
---> 6fe3f73f169a
Step 5 : RUN apt-get install --yes libenchant-dev
---> Using cache
---> e6d5c40efa89
Step 6 : RUN apt-get install --yes libicu-dev
---> Using cache
---> a8a4b4698674
Successfully built a8a4b4698674
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes php5-apcu php5-cli php5-curl php5-gd php5-imagick php5-intl php5-mcrypt php5-mysqlnd php5-pgsql php5-redis php5-sqlite php5-xsl
---> Using cache
---> d707d87d6e42
Successfully built d707d87d6e42
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes tcl mysqltcl tcl-tls tcl-trf tcllib tdom tclcurl tcl-thread
---> Using cache
---> df3cd5b21875
Successfully built df3cd5b21875
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes lighttpd
---> Using cache
---> 4224214d74d7
Step 3 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> c92678ff5110
Step 4 : RUN chmod 0777 /var/run/lighttpd/
---> Using cache
---> bbca02d2de6f
Successfully built bbca02d2de6f
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes -t jessie-backports golang-go
---> Using cache
---> c83656c3a91b
Successfully built c83656c3a91b
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes ruby ruby-dev build-essential
---> Using cache
---> 76e65f4edda5
Successfully built 76e65f4edda5
Sending build context to Docker daemon 3.584 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes python3 python3-dev build-essential python3-virtualenv python3-venv
---> Using cache
---> 87bea86aef6d
Step 3 : RUN apt-get install --yes libxml2-dev libxslt-dev zlib1g-dev
---> Using cache
---> 95d561644f02
Step 4 : RUN apt-get install --yes libenchant-dev
---> Using cache
---> 32e918e5c2d4
Step 5 : RUN apt-get install --yes libicu-dev
---> Using cache
---> aa8c9e78a65c
Successfully built aa8c9e78a65c
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-base
---> 9426481891d9
Step 2 : RUN apt-get install --yes -t jessie-backports openjdk-8-jdk maven
---> Using cache
---> 0a3e53585e37
Successfully built 0a3e53585e37
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-nodejs-base
---> 47e2224c0b4e
Step 2 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> 9974100f1858
Successfully built 9974100f1858
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-python2-base
---> a8a4b4698674
Step 2 : RUN apt-get install --yes uwsgi uwsgi-plugin-python
---> Using cache
---> 3b959426e04c
Step 3 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> 6a6bfdc044e3
Successfully built 6a6bfdc044e3
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-php-base
---> d707d87d6e42
Step 2 : RUN apt-get install --yes lighttpd php5-cgi fam
---> Using cache
---> afb5c8fd5ac3
Step 3 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> fb15bacb4e8d
Step 4 : RUN chmod 0777 /var/run/lighttpd/
---> Using cache
---> e8a4c6541e7b
Successfully built e8a4c6541e7b
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-tcl-base
---> df3cd5b21875
Step 2 : RUN apt-get install --yes lighttpd libfcgi-dev
---> Using cache
---> 7f105a7c655d
Step 3 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> 76d15f4805ed
Step 4 : RUN chmod 0777 /var/run/lighttpd/
---> Using cache
---> 9d5a021b2e35
Successfully built 9d5a021b2e35
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-golang-base
---> c83656c3a91b
Step 2 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> c5ae0f51bf6d
Successfully built c5ae0f51bf6d
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-ruby-base
---> 76e65f4edda5
Step 2 : RUN apt-get install --yes unicorn
---> Using cache
---> e62fb43c933c
Step 3 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> 547fce0b784a
Successfully built 547fce0b784a
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-python-base
---> aa8c9e78a65c
Step 2 : RUN apt-get install --yes uwsgi uwsgi-plugin-python3
---> Using cache
---> f0413c3f5a66
Step 3 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> 91814b957090
Successfully built 91814b957090
Sending build context to Docker daemon 3.072 kB
Step 1 : FROM docker-registry.tools.wmflabs.org/toollabs-jdk8-base
---> 0a3e53585e37
Step 2 : RUN apt-get install --yes toollabs-webservice
---> Using cache
---> a47ecbf39740
Successfully built a47ecbf39740
root@tools-docker-builder-04:/srv/images/toollabs#

Mon, Apr 17, 7:59 PM
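
In the build log above, every image is built only after the image named in its FROM line: base first, then the */base images, then the */web images. The sketch below shows one way such an order can be derived from the parent relationships. It is not the logic of build.py, which is not shown here; the parent map is a subset read off the FROM lines, and the use of graphlib (Python 3.9+) is an assumption made for illustration.

```python
from graphlib import TopologicalSorter

# image -> set of parent images, read off the FROM lines in the log
# (only a subset of the images is listed to keep the example short).
PARENTS = {
    "base": set(),
    "nodejs/base": {"base"},
    "python/base": {"base"},
    "static-web": {"base"},
    "nodejs/web": {"nodejs/base"},
    "python/web": {"python/base"},
}


def build_order(parents):
    """Return the images in an order where each parent precedes its children."""
    return list(TopologicalSorter(parents).static_order())


if __name__ == "__main__":
    for image in build_order(PARENTS):
        print(image)
```

The order among images that do not depend on each other is not fixed, so the output may differ from the order in the log while still respecting every FROM dependency.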
chasemp added a comment to T162534: Requesting more disk space a Wikiapiary project instance.

storage is not quota'd in the same fashion as RAM or CPU. I think what you guys want is described in https://wikitech.wikimedia.org/wiki/Help:Adding_Disk_Space considering this is already a large instance.

Mon, Apr 17, 12:41 PM · Labs-Infrastructure, WikiApiary, Labs

Thu, Apr 13

chasemp added a comment to T162955: rebuild tools-grid-master as a large instance.

If this was successful I wonder if we could easily bake in http://wiki.gridengine.info/wiki/index.php/RQS_Common_Uses#Max_user_jobs_in_a_particular_queue

Thu, Apr 13, 10:00 PM · Operations, Labs
chasemp created T162955: rebuild tools-grid-master as a large instance.
Thu, Apr 13, 9:51 PM · Operations, Labs
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

I noticed we saw puppet failures from tools-exec-1432 this morning so I decided to take a look.

Thu, Apr 13, 7:12 PM · Labs
chasemp renamed T161898: IO issues for Tools instances flapping with iowait and puppet failure from "iowait alerts for grid engine nodes" to "IO issues for Tools instances flapping with iowait and puppet failure".
Thu, Apr 13, 6:30 PM · Labs

Wed, Apr 12

chasemp closed T162772: Disable 2FA for Freddy2001 on Wikitech as "Resolved".

OK, I worked with @Freddy2001 to confirm they can access their Tools shell account and edit a file (/home/freddy2001/unlock-my-2fa) restricted only to that user in the home directory. I have now removed 2fa protection from this account so that it can be set up anew with a new device.

Wed, Apr 12, 3:53 PM · Labs, wikitech.wikimedia.org
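
The confirmation described above relies on a file that only the account owner can touch. A hedged sketch of how an admin might check that property is given below: the file must be a regular file, owned by the expected user, with no group or other permissions. The function name and the exact permission rule are assumptions; the comment does not say how the file was actually checked.

```python
import os
import stat


def is_private_to(path, uid):
    """True if path is a regular file owned by uid with no group/other access."""
    info = os.stat(path)
    return (
        stat.S_ISREG(info.st_mode)
        and info.st_uid == uid
        and stat.S_IMODE(info.st_mode) & 0o077 == 0
    )


if __name__ == "__main__":
    # Hypothetical usage: check this script against the current user.
    print(is_private_to(__file__, os.getuid()))
```

A check like this only shows that the file is private; confirming that the user really edited it, for example by agreeing on its contents over IRC, is a separate step.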
chasemp added a comment to T162134: Request creation of Discourse for Wiki Asian Month labs project.

It makes much more sense to me now. A short name discourse-wam would make it easier for future configuration. I edited the project name for clarity. Thank you for the advice.

Wed, Apr 12, 2:40 PM · Labs
chasemp triaged T162772: Disable 2FA for Freddy2001 on Wikitech as "Normal" priority.

Are you a member of the tools project? Our guidance on this is basically to sync up on irc with an admin and confirm you have your SSH key by editing a file in your home directory on an instance.

Wed, Apr 12, 2:32 PM · Labs, wikitech.wikimedia.org
chasemp added a comment to T162134: Request creation of Discourse for Wiki Asian Month labs project.

It makes a lot of sense to keep names abbreviated and concise and avoid alphanumeric complexity as much as possible. If I'm creating this for myself I would go for discourse-wam and there will be a project page to expand and explain what the project is for. This will indeed be part of the FQDN for hosts and have to fit in a number of places and dialogue boxes easily.

Wed, Apr 12, 2:29 PM · Labs
chasemp added a comment to T162462: Standalone puppet masters are broken (uninstallable packages).

I'm getting clean puppet runs on labs instances with role::puppetmaster::standalone now. So, the labs case for this looks resolved -- anything else to do here?

Wed, Apr 12, 2:23 PM · Patch-For-Review, Labs, Operations

Tue, Apr 11

chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

AFAICT this is the check that is alerting:

Tue, Apr 11, 5:33 PM · Labs
chasemp triaged T161951: tools.iabot is overloading the grid by running too many workers in parallel as "Normal" priority.

I wasn't aware that big brother can work on jobs other than the web service. Thank you for pointing that out. Now that you mention it, 20 workers was indeed overkill, and I didn't expect that 20 idling scripts doing almost nothing would cause problems. In that case since use case is not very high at the moment, I have no issues only using 1 worker until the job queue rises out of control.

I am sorry for the inconvenience.

Tue, Apr 11, 5:32 PM · Tool-Labs, Labs, InternetArchiveBot
chasemp added a comment to T161898: IO issues for Tools instances flapping with iowait and puppet failure.

We did reintroduce these and it seems to have had a positive effect. We are still seeing the alerts and it does seem like a systemic issue across. If I had to guess we have at least a few factors here but almost certainly we are starved for trusty exec nodes.

Tue, Apr 11, 5:30 PM · Labs
chasemp added a comment to T162090: Investigate alternative RAID strategies for labstore1001/2.

If performance allows it would be great to get RAID 50 esp since this is a 2 node HA cluster. We could finally do the beginnings of real (but limited) user backups.

Tue, Apr 11, 5:25 PM · Labs, Operations
chasemp added a comment to T162534: Requesting more disk space a Wikiapiary project instance.

storage is not quota'd in the same fashion as RAM or CPU. I think what you guys want is described in https://wikitech.wikimedia.org/wiki/Help:Adding_Disk_Space considering this is already a large instance.

Tue, Apr 11, 4:26 PM · Labs-Infrastructure, WikiApiary, Labs
chasemp renamed T162534: Requesting more disk space a Wikiapiary project instance from "Requesting more disk space" to "Requesting more disk space a Wikiapiary project instance".
Tue, Apr 11, 4:21 PM · Labs-Infrastructure, WikiApiary, Labs
chasemp added a comment to T162134: Request creation of Discourse for Wiki Asian Month labs project.

I am opposed to spaces and capitals in a project name in openstack :) but discourse-for-wiki-asian-month seems more than fine to create. +1

Tue, Apr 11, 4:17 PM · Labs
chasemp added a comment to T161834: Undo special tools-home and tools-project share definitions for NFS.

By refactoring so that the paths used in tools for the share links are common to the rest of the projects I meant this :)

Tue, Apr 11, 3:18 PM · Labs, Operations
chasemp triaged T162640: labvirt1002 ignoring some messages as "High" priority.
Tue, Apr 11, 1:36 PM · Patch-For-Review, Labs-Infrastructure, Labs
chasemp reassigned T162691: Deleting puppet-paladox instance is not deleting it from chasemp to Andrew.

I am pretty sure a restart of nova-compute will fix this but I'm going to let it be so you can take a look @Andrew

Tue, Apr 11, 1:36 PM · Labs
chasemp triaged T162691: Deleting puppet-paladox instance is not deleting it as "Normal" priority.
Tue, Apr 11, 1:36 PM · Labs
chasemp added subtasks for T162529: OpenStack instances stuck in deletion state: T162691: Deleting puppet-paladox instance is not deleting it, T162640: labvirt1002 ignoring some messages.
Tue, Apr 11, 1:35 PM · Patch-For-Review, Labs-Infrastructure, Labs, Continuous-Integration-Infrastructure
chasemp added a parent task for T162640: labvirt1002 ignoring some messages: T162529: OpenStack instances stuck in deletion state.
Tue, Apr 11, 1:35 PM · Patch-For-Review, Labs-Infrastructure, Labs
chasemp added a parent task for T162691: Deleting puppet-paladox instance is not deleting it: T162529: OpenStack instances stuck in deletion state.
Tue, Apr 11, 1:35 PM · Labs
chasemp reassigned T162529: OpenStack instances stuck in deletion state from chasemp to Andrew.

@Andrew was looking into this. Another instance was reported on IRC in T162691: Deleting puppet-paladox instance is not deleting it today and I have checked it is indeed | OS-EXT-SRV-ATTR:hypervisor_hostname | labvirt1002.eqiad.wmnet

Tue, Apr 11, 1:33 PM · Patch-For-Review, Labs-Infrastructure, Labs, Continuous-Integration-Infrastructure
chasemp added a comment to T162629: Admin request for user paladox and Luke081515 in the project shinken.

@Paladox wants to take a shot at managing a shared icinga2 instance and I suggested doing so in the existing shinken project would be fine.

Tue, Apr 11, 1:28 PM · Shinken, Monitoring, Labs

Mon, Apr 10

chasemp closed T162404: create a 'root' group with bdavis strictly for labs/cloud services infrastructure as "Resolved".

Let's do wmcs-roots. WMCS is our official short form written name for Wikimedia Cloud Services. Cookie licking cloud will make somebody mad at some point down the road.

Mon, Apr 10, 6:56 PM · Ops-Access-Requests, Labs, Operations
chasemp added a comment to T162404: create a 'root' group with bdavis strictly for labs/cloud services infrastructure.

Good point @fgiunchedi -- in purely searchable terms using cs is probably too diminutive and wmcs will be more searchable and unique across contexts.

Mon, Apr 10, 5:08 PM · Ops-Access-Requests, Labs, Operations
chasemp added a comment to T162404: create a 'root' group with bdavis strictly for labs/cloud services infrastructure.

I'm open to what seems best :)

Mon, Apr 10, 5:04 PM · Ops-Access-Requests, Labs, Operations