Gage (Jeff Gage)
Disabled

Projects (9)

User Details

User Since
Oct 24 2014, 11:27 PM (264 w, 1 d)
Roles
Disabled
LDAP User
Gage
MediaWiki User
Unknown

Recent Activity

Jun 11 2018

Gerrit Code Review <gerrit@wikimedia.org> committed rEQGV83c1cdf5e6f7: Update patch set 11 (authored by Gage).
Update patch set 11
Jun 11 2018, 8:47 PM
Gerrit Code Review <gerrit@wikimedia.org> committed rEQGV01e6b9ccc5b4: Update patch set 10 (authored by Gage).
Update patch set 10
Jun 11 2018, 8:47 PM
Gerrit Code Review <gerrit@wikimedia.org> committed rEQGV90de6d609e7a: Update patch set 9 (authored by Gage).
Update patch set 9
Jun 11 2018, 8:47 PM
Gerrit Code Review <gerrit@wikimedia.org> committed rEQGV479576894fba: Update patch set 8 (authored by Gage).
Update patch set 8
Jun 11 2018, 8:47 PM
Gerrit Code Review <gerrit@wikimedia.org> committed rEQGV55ec0e0187e3: Update patch set 7 (authored by Gage).
Update patch set 7
Jun 11 2018, 8:47 PM
Gerrit Code Review <gerrit@wikimedia.org> committed rEQGV5e10c333317b: Update patch set 6 (authored by Gage).
Update patch set 6
Jun 11 2018, 8:47 PM
Gerrit Code Review <gerrit@wikimedia.org> committed rEQGV47007af968a8: Update patch set 5 (authored by Gage).
Update patch set 5
Jun 11 2018, 8:47 PM
Gerrit Code Review <gerrit@wikimedia.org> committed rEQGV2a54d59c9683: Update patch set 4 (authored by Gage).
Update patch set 4
Jun 11 2018, 8:47 PM

Sep 23 2015

Restricted Application updated subscribers of T98481: check_puppetrun: print "agent disabled" reason.

The script source doesn't say so, but I've noticed that it's written by ripienaar. Latest upstream implements this feature:

Sep 23 2015, 7:43 AM · patch-welcome, Operations, Icinga, observability

Aug 3 2015

Nemo_bis awarded T92604: IPSec: roll-out plan a Yellow Medal token.
Aug 3 2015, 8:06 PM · Operations, Patch-For-Review, Interdatacenter-IPsec

Jul 22 2015

Gage closed T100478: Labs homedirs owned by root for new projects as Resolved.

Yeah I haven't seen recurrence of this so I'm closing the ticket. Thanks.

Jul 22 2015, 5:50 PM · Cloud-Services

Jul 21 2015

Gage added a member for Wikidata-Query-Service: Gage.
Jul 21 2015, 10:13 PM
Gage committed rOPUPa4e0947d0347: Symlink .wars for git-fat in Archiva too (authored by Ottomata).
Symlink .wars for git-fat in Archiva too
Jul 21 2015, 9:07 PM
Gage committed rOPUPed6350c29ec5: Fix package_dir dependency for wdqs (authored by Stanislav Malyshev <smalyshev@gmail.com>).
Fix package_dir dependency for wdqs
Jul 21 2015, 8:21 PM

Jul 16 2015

Gage added a comment to T105705: Evaluate traffic flow between the Jobrunners and the Cirrus cluster.

TLDR: the rough estimate is about 32Mbit/sec from jobrunners to elasticsearch nodes. Traffic is bursty so I advise planning for a 50-60Mbit ceiling.

Jul 16 2015, 8:59 PM · Operations

Jul 13 2015

Gage claimed T105705: Evaluate traffic flow between the Jobrunners and the Cirrus cluster.
Jul 13 2015, 4:42 PM · Operations

Jul 10 2015

Gage added a comment to T98173: install/setup/deploy server rhodium as puppetmaster (scaling out).

Regarding running puppetmaster on !Precise: when I tried with Trusty I got this:

Error: Could not retrieve catalog from remote server: Error 400 on SERVER: stack level too deep

It seems to be due to a bad interaction between Rails and ActiveRecord (deb: ruby-activerecord-3.2). This ticket proposes some workarounds:
https://projects.puppetlabs.com/issues/9290

Jul 10 2015, 7:58 PM · Puppet, Puppet-infrastructure-modernization, Operations, Patch-For-Review
Restricted Application updated subscribers of T98173: install/setup/deploy server rhodium as puppetmaster (scaling out).
Jul 10 2015, 7:34 PM · Puppet, Puppet-infrastructure-modernization, Operations, Patch-For-Review
Restricted Application updated subscribers of T98128: Scale up and out our puppetmaster infrastructure.
Jul 10 2015, 7:33 PM · codfw-rollout, codfw-rollout-Jan-Mar-2016, Operations

Jul 9 2015

Gage added a project to T102397: icinga log rotation wipes out portions of history: observability.
Jul 9 2015, 5:49 PM · Operations, observability
Gage added a project to T102394: Implement pybal pool state monitoring and alerting via icinga: observability.
Jul 9 2015, 5:48 PM · Operations, Patch-For-Review, Pybal, observability, Traffic
Restricted Application updated subscribers of T102394: Implement pybal pool state monitoring and alerting via icinga.
Jul 9 2015, 5:48 PM · Operations, Patch-For-Review, Pybal, observability, Traffic
Gage committed rOPUP02fb41780999: Add flag --all-projects to projectviews aggregator (authored by mforns).
Add flag --all-projects to projectviews aggregator
Jul 9 2015, 4:57 PM
Gage added a comment to T93776: remove ganglia(old), replace with ganglia_new.

logstash1001-1003: These hosts are older than 1004-1006, and run Precise instead of Jessie. Gmond wouldn't stop or start.

gage@logstash1002:~$ sudo /usr/sbin/gmond -f
[apache_status] Received the following parameters
{'url': 'http://127.0.0.1:80/server-status', 'collect_ssl': 'False', 'metric_group': 'apache'}
Fatal Python error: PyThreadState_Get: no current thread
Jul  9 00:06:37 logstash1002 kernel: [13984018.998725] init: ganglia-monitor main process (14196) killed by ABRT signal
Jul  9 00:06:37 logstash1002 kernel: [13984018.998751] init: ganglia-monitor main process ended, respawning
Jul  9 00:06:37 logstash1002 kernel: [13984019.086671] init: ganglia-monitor main process (14210) killed by ABRT signal
Jul  9 00:06:37 logstash1002 kernel: [13984019.086698] init: ganglia-monitor main process ended, respawning

The above error messages were unhelpful, so I used strace -f /usr/sbin/gmond -f, saw that gmond parses /etc/ganglia/conf.d/* before aborting, and then did a binary search to remove files in conf.d/ until the problem disappeared.
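
The bisection described above can be sketched in Python roughly as follows. This is only an illustration, not the procedure actually run on logstash1002: `bisect_bad_file` and the `starts_ok` callback are hypothetical names, the file names are placeholders, and the sketch assumes exactly one file in conf.d/ triggers the abort.

```python
from typing import Callable, List, Sequence


def bisect_bad_file(files: Sequence[str],
                    starts_ok: Callable[[List[str]], bool]) -> str:
    """Return the one file whose presence makes the daemon fail to start.

    Assumption: starts_ok(subset) is True exactly when the culprit is
    absent from `subset` (i.e. there is a single culprit).
    """
    candidates = list(files)
    if not candidates:
        raise ValueError("no files to bisect")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if starts_ok(half):
            # The daemon starts with only this half enabled, so the
            # culprit must be in the other half.
            candidates = candidates[len(half):]
        else:
            candidates = half
    return candidates[0]


if __name__ == "__main__":
    # Placeholder file names; in practice starts_ok would enable only the
    # given files and try to start gmond in the foreground.
    conf_d = ["a.pyconf", "b.pyconf", "c.pyconf", "d.pyconf", "e.pyconf"]
    print(bisect_bad_file(conf_d, lambda enabled: "d.pyconf" not in enabled))
```

With n files this needs about log2(n) start attempts instead of n.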

Jul 9 2015, 12:56 AM · Operations, observability, Patch-For-Review

Jul 1 2015

Gage committed rOPUP6bd23fd1a2bb: Revert "puppetmaster: fix puppet.conf for new CA cert" (authored by Gage).
Revert "puppetmaster: fix puppet.conf for new CA cert"
Jul 1 2015, 8:39 PM
Gage added a reverting change for rOPUP7c17d4e1eea2: puppetmaster: fix puppet.conf for new CA cert: rOPUP6bd23fd1a2bb: Revert "puppetmaster: fix puppet.conf for new CA cert".
Jul 1 2015, 8:39 PM
Gage committed rOPUP7c17d4e1eea2: puppetmaster: fix puppet.conf for new CA cert (authored by Gage).
puppetmaster: fix puppet.conf for new CA cert
Jul 1 2015, 6:14 PM

Jun 29 2015

Gage committed rLPRI2fccab96ae6d: Add Gage's key to labs root (authored by Gage).
Add Gage's key to labs root
Jun 29 2015, 1:08 PM
Gage added a comment to T104019: Nested ".d" dirs in /etc/apt/.

Because new images were not built, I tried to work around this myself like so:

sudo mv /etc/apt/apt.conf.d/apt.conf.d/* /etc/apt/apt.conf.d/
sudo mv /etc/apt/preferences.d/preferences.d/* /etc/apt/preferences.d/
sudo mv /etc/apt/sources.list.d/sources.list.d/* /etc/apt/sources.list.d/
sudo rmdir /etc/apt/apt.conf.d/apt.conf.d/
sudo rmdir /etc/apt/preferences.d/preferences.d/
sudo rmdir /etc/apt/sources.list.d/sources.list.d/

However that results in this error:

N: Ignoring file 'puppet_base_2.7' in directory '/etc/apt/preferences.d/' as it has an invalid filename extension

So I gave it the proper extension:

sudo mv /etc/apt/preferences.d/puppet_base_2.7{,.pref}

However that installs puppet 2.7.11-1ubuntu2 instead of 3.4.3-1~ubuntu12.04.1. Puppet 2.7's version of the 'exec' type doesn't support the 'umask' attribute, resulting in this cryptic error when I apply role::puppet::self:

err: Failed to apply catalog: Invalid parameter umask at /etc/puppet/modules/git/manifests/clone.pp:147

Solution:

sudo rm /etc/apt/preferences.d/puppet_base_2.7.pref
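
The "invalid filename extension" notice earlier in this comment follows from the naming rule documented in apt_preferences(5): files in preferences.d/ must have either no extension or the extension "pref", and may contain only alphanumeric, hyphen, underscore, and period characters. A name like puppet_base_2.7 is read as having the extension "7". The sketch below approximates that rule; `apt_accepts_pref_name` is a hypothetical helper, not part of apt, and apt's real implementation may differ in details.

```python
import re


def apt_accepts_pref_name(filename: str) -> bool:
    """Approximate the preferences.d naming rule from apt_preferences(5)."""
    # Only alphanumerics, hyphen, underscore and period are allowed.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", filename):
        return False
    # Everything after the last period counts as the extension.
    _stem, dot, ext = filename.rpartition(".")
    return dot == "" or ext == "pref"


if __name__ == "__main__":
    for name in ("puppet_base_2.7", "puppet_base_2.7.pref", "puppet_base"):
        print(name, apt_accepts_pref_name(name))
```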
Jun 29 2015, 6:54 AM · Patch-For-Review, Cloud-Services

Jun 26 2015

Gage added a comment to T104019: Nested ".d" dirs in /etc/apt/.

I'm surprised to report that this is still happening on instances created after the patch was merged. I tried twice.

Jun 26 2015, 10:39 PM · Patch-For-Review, Cloud-Services
Gage updated the task description for T104019: Nested ".d" dirs in /etc/apt/.
Jun 26 2015, 6:46 PM · Patch-For-Review, Cloud-Services
Gage created T104019: Nested ".d" dirs in /etc/apt/.
Jun 26 2015, 6:43 PM · Patch-For-Review, Cloud-Services

Jun 18 2015

Restricted Application added a project to T101199: Wikitech often loses track of internal openstack/nova session: Cloud-Services.
Jun 18 2015, 12:21 AM · MW-1.27-release (WMF-deploy-2016-02-09_(1.27.0-wmf.13)), Patch-For-Review, Cloud-Services, wikitech.wikimedia.org, MediaWiki-extensions-OpenStackManager

Jun 14 2015

Gage added a comment to T92618: Ganglia broken for labstore1001 (again).

This is broken again since June 8:

Jun 14 2015, 6:29 PM · Labs-Sprint-102, Cloud-Services

Jun 8 2015

Gage committed rOPUP301cccb4f5ca: strongswan: switch back to auto=start (authored by Gage).
strongswan: switch back to auto=start
Jun 8 2015, 11:24 PM
Gage committed rOPUP95c41f4d371c: ipsec role: regsubst syntax fixup (authored by BBlack).
ipsec role: regsubst syntax fixup
Jun 8 2015, 10:40 PM
Gage committed rOPUP5bf85bf1cd56: ipsec: deploy to cp3030 + cp1065 (text caches) (authored by Gage).
ipsec: deploy to cp3030 + cp1065 (text caches)
Jun 8 2015, 10:31 PM
Gage committed rOPUPd1518b57f232: strongswan: auto=route, reduce logging, rm deprecated vars (authored by Gage).
strongswan: auto=route, reduce logging, rm deprecated vars
Jun 8 2015, 9:25 PM
Gage committed rOPUP339a44600abe: ipsec: remove cp1008 (authored by Gage).
ipsec: remove cp1008
Jun 8 2015, 5:49 PM
Gage closed T96111: Strongswan: security association reauthentication failure as Resolved.

Strongswan 5.3.0-1+wmf2 is currently in our apt repo. I'll make a separate task for config values.

Jun 8 2015, 4:59 PM · Operations, Patch-For-Review, Interdatacenter-IPsec

Jun 5 2015

Gage committed rOPUP8f99c80ae984: puppet_certname: qualify var (authored by Matanya).
puppet_certname: qualify var
Jun 5 2015, 3:23 PM
Gage closed T92603: Monitor IPsec status as Resolved.
Jun 5 2015, 3:22 PM · Operations, Patch-For-Review, Interdatacenter-IPsec, observability

Jun 4 2015

Gage reopened T97380: analytics1013 crashed, investigate... as "Open".

This machine crashed again. All the errors are on socket 0, so we should probably replace that DIMM.

Jun 4 2015, 9:55 PM · Operations, Analytics

Jun 3 2015

Gage committed rOPUP3faba9b9b4ec: strongswan: qualify var (authored by Matanya).
strongswan: qualify var
Jun 3 2015, 9:28 PM
Gage committed rOPUPdcac665ef8c2: strongswan: fqdn is a fact, qualify (authored by Matanya).
strongswan: fqdn is a fact, qualify
Jun 3 2015, 4:35 PM

Jun 2 2015

Gage added a comment to T100959: graphite2001 bios config issue.

a) no grub prompt
b) yes, I see kernel output
c) yes, I see getty on the serial port.

Jun 2 2015, 11:39 PM · Operations
Gage added a comment to T78616: Fix syslog error "nslcd[29117]: error writing to client: Broken pipe".

There's also a Debian bug report discussing this: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=685504

Jun 2 2015, 10:35 PM · LDAP, cloud-services-team (Kanban), Cloud-VPS

Jun 1 2015

Gage updated subscribers of T100954: Wikitech: update Bacula article.
Jun 1 2015, 11:18 PM · Operations, Documentation
Gage committed rOPUPef82b2adecaa: strongswan module: don't install ipsec-tools (authored by Gage).
strongswan module: don't install ipsec-tools
Jun 1 2015, 5:38 PM

May 31 2015

Gage created T100959: graphite2001 bios config issue.
May 31 2015, 10:46 PM · Operations
Gage created T100954: Wikitech: update Bacula article.
May 31 2015, 9:01 PM · Operations, Documentation

May 27 2015

Gage committed rOPUPbaf6b74d21bd: puppetmaster::autosigner: fix doc generation for class (authored by Gage).
puppetmaster::autosigner: fix doc generation for class
May 27 2015, 5:27 PM
Gage added a comment to T99845: analytics1036 can't talk cross row?.

I discussed this problem with a friend in neteng at Twitter, who says he has seen similar behavior in Juniper switches before. He recommends, and I agree: let's reboot the switch (asw-d2-eqiad).

May 27 2015, 1:47 AM · Operations, ops-eqiad
Gage renamed T100478: Labs homedirs owned by root for new projects from Labs homedirs owned by root for new instances to Labs homedirs owned by root for new projects.
May 27 2015, 12:14 AM · Cloud-Services
Gage created T100478: Labs homedirs owned by root for new projects.
May 27 2015, 12:13 AM · Cloud-Services

May 26 2015

Gage added projects to T82576: Enable STARTTLS (both inbound and outbound) on lists: Mail, Wikimedia-Mailing-lists.
May 26 2015, 11:55 PM · Operations, Wikimedia-Mailing-lists, Mail

May 22 2015

Gage added a comment to T99845: analytics1036 can't talk cross row?.

Also let's try to ensure the NIC gets rebooted. My expectation is that racadm serveraction powercycle will do it. I've seen NICs get into weird states that persisted across warm reboots before.

May 22 2015, 7:51 PM · Operations, ops-eqiad
Gage added a comment to T99845: analytics1036 can't talk cross row?.

I recommend checking all bios settings against 1035 for accidental changes (I can't imagine what setting would cause this, but we might as well rule it out), which will also result in rebooting the box to confirm that this behavior is reproducible.

May 22 2015, 7:45 PM · Operations, ops-eqiad
Gage committed rOPZK3272c7cc8dbe: zookeeper-cleanup: don't generate cron email for normal operation (authored by Gage).
zookeeper-cleanup: don't generate cron email for normal operation
May 22 2015, 6:34 PM
Gage committed rOPUPbc0eafde98de: merge zookeeper submodule update I1fa0312aedf57c05dfd326b253b9f732abd4c20b (authored by Gage).
merge zookeeper submodule update I1fa0312aedf57c05dfd326b253b9f732abd4c20b
May 22 2015, 4:18 PM

May 20 2015

Gage added a comment to T99845: analytics1036 can't talk cross row?.

This is mysterious. I've compared the problem host, analytics1036 (ge-2/0/5), with healthy analytics1035 (ge-2/0/4), which sits right next to it in rack D-2.

  • Confirmed the problem:
    • Neon (row A), iron (row B), and analytics1029 (row C) can ping analytics1035 but not analytics1036.
    • Other analytics hosts in row D (analytics1035 in D-2, analytics1041 in D-4) can ping analytics1036.
    • None of the 12 other analytics hosts rebooted today have this problem
    • It's possible to reach the affected host from another host in the same rack. ssh -A through iron -> analytics1035 -> analytics1036 works.
  • Same running kernel
  • Both hosts rebooted today
  • ip addr show output looks the same: same netmasks etc.
  • netstat -rn looks the same
  • /etc/network/interfaces looks the same
  • lldpctl output looks the same
  • On asw-d-eqiad, show interfaces for ge-2/0/4 and ge-2/0/5 look the same and show vlans analytics1-d-eqiad shows both ports as members
    • The only thing that looks odd to me is this part of the running config which seems to have some redundancy and doesn't treat all ports consistently (2/0/3 and 2/0/7 not in 'member-range', 4/0/* not in 'member', but those are not the ports in question!):
interface-range vlan-analytics1-d-eqiad {
    member "ge-2/0/[0-3]";
    member "ge-2/0/[4-6]";
    member ge-2/0/7;
    member-range ge-2/0/0 to ge-2/0/2;
    member-range ge-2/0/4 to ge-2/0/6;
    member-range ge-4/0/1 to ge-4/0/4;
    unit 0 {
        family ethernet-switching {
            vlan {
                members analytics1-d-eqiad;
            }
        }
    }
}
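
The inconsistency noted above can be checked mechanically by expanding the `member` and `member-range` statements into sets of ports and comparing them. The following is a rough sketch that handles only the syntax forms appearing in the excerpt above; the helper names are hypothetical, and it makes no claim about how Junos itself combines the two statement types.

```python
import re

CONFIG = '''
interface-range vlan-analytics1-d-eqiad {
    member "ge-2/0/[0-3]";
    member "ge-2/0/[4-6]";
    member ge-2/0/7;
    member-range ge-2/0/0 to ge-2/0/2;
    member-range ge-2/0/4 to ge-2/0/6;
    member-range ge-4/0/1 to ge-4/0/4;
}
'''


def expand_member(pattern):
    """Expand 'ge-2/0/[0-3]' (or a plain port name) into a set of ports."""
    m = re.fullmatch(r"(.*/)\[(\d+)-(\d+)\]", pattern)
    if not m:
        return {pattern}
    prefix, lo, hi = m.group(1), int(m.group(2)), int(m.group(3))
    return {f"{prefix}{n}" for n in range(lo, hi + 1)}


def expand_range(first, last):
    """Expand 'ge-2/0/0 to ge-2/0/2'; both ends must share a prefix."""
    prefix_a, _, lo = first.rpartition("/")
    prefix_b, _, hi = last.rpartition("/")
    if prefix_a != prefix_b:
        raise ValueError("range spans different FPC/PIC prefixes")
    return {f"{prefix_a}/{n}" for n in range(int(lo), int(hi) + 1)}


def parse(config):
    """Return (ports named by 'member', ports named by 'member-range')."""
    members, ranges = set(), set()
    for raw in config.splitlines():
        line = raw.strip().rstrip(";")
        m = re.fullmatch(r"member-range (\S+) to (\S+)", line)
        if m:
            ranges |= expand_range(m.group(1), m.group(2))
            continue
        m = re.fullmatch(r'member "?([^"\s]+)"?', line)
        if m:
            members |= expand_member(m.group(1))
    return members, ranges


if __name__ == "__main__":
    members, ranges = parse(CONFIG)
    print("only in member:      ", sorted(members - ranges))
    print("only in member-range:", sorted(ranges - members))
```

For the excerpt above this reports ge-2/0/3 and ge-2/0/7 as present only in `member`, and ge-4/0/1 through ge-4/0/4 as present only in `member-range`.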
May 20 2015, 11:19 PM · Operations, ops-eqiad
Gage triaged T99833: Puppet function: ipresolve: throw an error if lookup fails, refactor into wmflib as Normal priority.
May 20 2015, 9:13 PM · Operations, Patch-For-Review, Interdatacenter-IPsec, Puppet
Gage created T99833: Puppet function: ipresolve: throw an error if lookup fails, refactor into wmflib.
May 20 2015, 9:08 PM · Operations, Patch-For-Review, Interdatacenter-IPsec, Puppet

May 19 2015

Gage committed rOPUPf805e5233480: varnishkafka/statsv: set instance name to 'frontend' on text (authored by ori).
varnishkafka/statsv: set instance name to 'frontend' on text
May 19 2015, 11:52 PM
Gerrit Code Review <gerrit@wikimedia.org> committed rOPUPc39d91284f59: Merge "varnishkafka: Drop trailing '\?' from the RxURL filter for statsv"… (authored by Gage).
Merge "varnishkafka: Drop trailing '\?' from the RxURL filter for statsv"…
May 19 2015, 11:32 PM
Gage updated subscribers of T98161: Build Kafka 0.8.1.1 package for Jessie and upgrade Brokers to Jessie..
May 19 2015, 3:05 AM · Operations, Analytics-Cluster
Gage added a comment to T98161: Build Kafka 0.8.1.1 package for Jessie and upgrade Brokers to Jessie..

I followed this: https://git.wikimedia.org/blob/operations%2Fpuppet.git/2cdd08f9686b040816bd0dd8e63e712f4b084a7a/modules%2Fpackage_builder%2FREADME.md

May 19 2015, 3:04 AM · Operations, Analytics-Cluster
Gage closed T88536: Implement a big IPsec off switch as Resolved.

Deployed, tested, documented: https://wikitech.wikimedia.org/wiki/IPsec#Emergency_shutdown

May 19 2015, 1:59 AM · Operations, Patch-For-Review, Interdatacenter-IPsec
Gage added a comment to T98620: Degraded RAID-1 arrays on new logstash hosts: [UU__].

Patch: https://gerrit.wikimedia.org/r/#/c/211931/

May 19 2015, 1:55 AM · Operations, Patch-For-Review
Gage lowered the priority of T98620: Degraded RAID-1 arrays on new logstash hosts: [UU__] from Unbreak Now! to Normal.
May 19 2015, 1:49 AM · Operations, Patch-For-Review

May 15 2015

Gage added a member for Traffic: Gage.
May 15 2015, 4:06 PM

May 14 2015

Gerrit Code Review <gerrit@wikimedia.org> committed rOPUPf4363e67d3b4: Merge "puppetmaster: Do not manage certmanager's home" into production (authored by Gage).
Merge "puppetmaster: Do not manage certmanager's home" into production
May 14 2015, 12:15 AM

May 13 2015

Gage committed rOPUPc7f75509482d: Fix logster tracking for CirrusSearch-slow.log (authored by bd808).
Fix logster tracking for CirrusSearch-slow.log
May 13 2015, 10:31 PM
Gage added a comment to T98620: Degraded RAID-1 arrays on new logstash hosts: [UU__].

Eventually I'd like to see the partman recipe fixed and tested by reinstalling one of these hosts, but I've fixed the running config so that the arrays no longer appear as degraded:

gage@logstash1006:~$ cat /proc/mdstat
Personalities : [raid1] [raid0]
md0 : active raid1 sda2[0] sdb2[1]
      249869312 blocks super 1.2 [4/2] [UU__]
      bitmap: 2/2 pages [8KB], 65536KB chunk
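
In /proc/mdstat, a status such as [4/2] [UU__] means the array is configured for four members but only two are active, which is why it is reported as degraded. A small Python sketch of detecting that condition follows; `degraded_arrays` is a hypothetical helper and the parsing covers only the line layout shown above.

```python
import re

MDSTAT = """Personalities : [raid1] [raid0]
md0 : active raid1 sda2[0] sdb2[1]
      249869312 blocks super 1.2 [4/2] [UU__]
      bitmap: 2/2 pages [8KB], 65536KB chunk
"""


def degraded_arrays(mdstat_text):
    """Map array name -> (configured, active) where active < configured."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        head = re.match(r"(md\d+) :", line)
        if head:
            current = head.group(1)
            continue
        # Status looks like "[4/2] [UU__]": configured/active members,
        # then one U (up) or _ (missing) per slot.
        status = re.search(r"\[(\d+)/(\d+)\] \[[U_]+\]", line)
        if status and current is not None:
            configured, active = int(status.group(1)), int(status.group(2))
            if active < configured:
                result[current] = (configured, active)
    return result


if __name__ == "__main__":
    print(degraded_arrays(MDSTAT))
```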
May 13 2015, 5:02 PM · Operations, Patch-For-Review

May 11 2015

Gage committed rOPUP0e03cf36b9f9: add deployer admin groups to codfw deploy server (authored by Dzahn).
add deployer admin groups to codfw deploy server
May 11 2015, 5:54 PM

May 8 2015

Gage triaged T84907: Kafka logging to Logstash as Low priority.
May 8 2015, 5:59 PM · observability, Analytics, Wikimedia-Logstash
Gage triaged T84908: Zookeeper logging to Logstash as Low priority.
May 8 2015, 5:58 PM · observability, Analytics-Engineering, Wikimedia-Logstash
Gage updated the task description for T98620: Degraded RAID-1 arrays on new logstash hosts: [UU__].
May 8 2015, 5:51 PM · Operations, Patch-For-Review
Gage created T98620: Degraded RAID-1 arrays on new logstash hosts: [UU__].
May 8 2015, 5:21 PM · Operations, Patch-For-Review

May 7 2015

Gage added a comment to T97411: Build a non-trunk 3.19 kernel for jessie.

This kernel is now installed on berkelium & curium.

  • IPsec ESNs work (fixed in 3.19.3)
  • Aesni security patch for CVE-2015-3331 is included (fixed in 3.19.3)
  • Aes256gcm does not work. (fixed in 4.0, but we don't care because we plan to use aes128gcm which works in 3.19.)
May 7 2015, 6:03 PM · Operations, Patch-For-Review, Traffic
Gage created T98488: rsyslog: use high precision timestamps or explain why not.
May 7 2015, 3:06 PM · Operations
Gage added a comment to T98481: check_puppetrun: print "agent disabled" reason.

Relatedly, I have learned that the reason must be quoted, or only the first word is stored:

May 7 2015, 2:46 PM · patch-welcome, Operations, Icinga, observability
Gage created T98481: check_puppetrun: print "agent disabled" reason.
May 7 2015, 2:41 PM · patch-welcome, Operations, Icinga, observability
Gage committed rOPUP5f7494f48c3c: IPsec: Icinga monitor for Strongswan connections (authored by Gage).
IPsec: Icinga monitor for Strongswan connections
May 7 2015, 1:24 AM

May 6 2015

Gage added a comment to T92601: Migrate host lists out of cache.pp to reference values in Hiera.

manifests/role/cache.pp has been refactored into modules/role/manifests/cache/* which reference hieradata/common/cache/*, hence the redundant data described in this task is eliminated.

May 6 2015, 6:37 PM · Traffic, Operations
Gage changed the status of T94320: Improve monitoring of https://git.wikimedia.org/ from Open to Stalled.
May 6 2015, 6:00 PM · Operations, Gitblit, observability
Gage changed the status of T87840: Retire Torrus from Open to Stalled.
May 6 2015, 6:00 PM · observability, Operations, Technical-Debt
Gage removed a subtask for T81543: Enable IPSec between datacenters: T85823: IPsec: add firewall rules.
May 6 2015, 5:54 PM · Operations, Traffic, Interdatacenter-IPsec
Gage removed a parent task for T85823: IPsec: add firewall rules: T81543: Enable IPSec between datacenters.
May 6 2015, 5:54 PM · Operations, Interdatacenter-IPsec
Gage closed Restricted Task, a subtask of T81543: Enable IPSec between datacenters, as Declined.
May 6 2015, 5:53 PM · Operations, Traffic, Interdatacenter-IPsec
Gage updated the task description for T82698: shutdown sodium after mailman has migrated to jessie VM.
May 6 2015, 5:49 PM · Operations

May 5 2015

Gage committed rOPUP7ea199c136d8: ipsec-global: fix bug in non-verbose mode, exit if not root (authored by Gage).
ipsec-global: fix bug in non-verbose mode, exit if not root
May 5 2015, 4:30 PM

May 4 2015

Gage added a comment to T81543: Enable IPSec between datacenters.

Thanks, Brandon. I'll reply in order:

May 4 2015, 5:52 PM · Operations, Traffic, Interdatacenter-IPsec
Gage added a comment to T96111: Strongswan: security association reauthentication failure.

To summarize remaining work:

  • Strongswan 5.3.0 is needed but is currently only in Experimental. It won't be coming to Jessie so it needs to be imported to WMF's apt repo.
  • Determine appropriate values for prod: lifetime, margin, auto.
May 4 2015, 5:52 PM · Operations, Patch-For-Review, Interdatacenter-IPsec
Gage updated the task description for T92604: IPSec: roll-out plan.
May 4 2015, 5:52 PM · Operations, Patch-For-Review, Interdatacenter-IPsec
Gage closed Restricted Task, a subtask of T81543: Enable IPSec between datacenters, as Declined.
May 4 2015, 5:43 PM · Operations, Traffic, Interdatacenter-IPsec
Gage committed rOPUPebaf2ab7cc42: logstash: Seed Elasticsearch cluster host (authored by bd808).
logstash: Seed Elasticsearch cluster host
May 4 2015, 4:40 PM

May 3 2015

Gage added a comment to T94417: Fix ipv6 autoconf issues.

The token-based solution (Proposal 1) sounds good to me; it seems like the only barrier to adoption is making a policy decision to go with a proposal which doesn't support Precise, correct?

May 3 2015, 9:23 PM · Operations, Patch-For-Review, Interdatacenter-IPsec

Apr 23 2015

Gage committed rOPUP93edfcf13308: logstash: Convert $::realm switches to hiera (authored by bd808).
logstash: Convert $::realm switches to hiera
Apr 23 2015, 10:10 PM
Gage committed rOPUPd4d4c4b50c45: logstash: Remove redis input (authored by bd808).
logstash: Remove redis input
Apr 23 2015, 9:45 PM