jbond (John Bond)
User

User Details

User Since: Jan 7 2019, 1:06 PM (22 w, 6 d)
Availability: Available
IRC Nick: jbond42
LDAP User: Jbond
MediaWiki User: JBond (WMF)

Recent Activity

Fri, Jun 7

jbond added a comment to T225278: Installation failing on late_command.sh .

Awesome, I'll keep the ticket open so people know I'm playing with the other two servers.

Fri, Jun 7, 10:07 AM · Operations
jbond added a comment to T225278: Installation failing on late_command.sh .

Sorry for this, I have just pushed https://gerrit.wikimedia.org/r/c/operations/puppet/+/515017 so this should only be broken for sarin and neodymium now. Sorry for the interruption, and ping me if you still see issues.

Fri, Jun 7, 9:41 AM · Operations

Thu, Jun 6

jbond added a comment to T220504: Decommission sarin.

going to re-image this server to stretch, testing changes to late_command.sh

Thu, Jun 6, 6:56 PM · Operations, decommission, ops-codfw
jbond added a comment to T220503: Decommission neodymium.

I'm going to reimage this server to test the following change:
https://gerrit.wikimedia.org/r/c/operations/puppet/+/514689

Thu, Jun 6, 12:43 PM · decommission, Operations, ops-eqiad
jbond created P8594 (An Untitled Masterwork).
Thu, Jun 6, 11:35 AM

Wed, Jun 5

jbond triaged T225108: Prometheus logs showing errors for routinator as Normal priority.
Wed, Jun 5, 3:33 PM · observability, netops, Operations

Tue, Jun 4

jbond closed T223938: facter 3: add timeout to custom facts external calls as Resolved.

A CR for this has been deployed so the spam should be gone. Please reopen if the spam persists.

Tue, Jun 4, 10:16 AM · Patch-For-Review, Puppet, Operations
jbond closed T223938: facter 3: add timeout to custom facts external calls, a subtask of T219803: upgrade facter and puppet across the fleet, as Resolved.
Tue, Jun 4, 10:16 AM · Patch-For-Review, Packaging, Puppet, Operations

Mon, Jun 3

jbond added a comment to T223938: facter 3: add timeout to custom facts external calls.

I looked at this a bit more today and my initial analysis was wrong. The facts do actually resolve; they just take longer when there are disk issues. Further, I was unable to find a command to test whether we have this issue that would be generic enough to apply across the fleet. As such, for now I have created a patch to increase the timeout values, but I'm definitely open to better suggestions.
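To illustrate the trade-off described above (an external call that does finish, just slowly), here is a minimal Python sketch of running a command under a timeout and measuring how long it took. It is an illustration only, not the Ruby code used by the custom facts or the patch itself; `run_with_timeout` is a hypothetical helper, and the stand-in commands are assumptions chosen so the sketch runs on any machine.

```python
import subprocess
import sys
import time


def run_with_timeout(argv, timeout_s):
    """Run argv and return (stdout or None, elapsed seconds).

    None means the command did not finish within timeout_s;
    subprocess.run kills the child process in that case.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
        output = result.stdout.strip()
    except subprocess.TimeoutExpired:
        output = None
    return output, time.monotonic() - start


if __name__ == "__main__":
    # Stand-in commands (assumptions): one fast, one deliberately slow.
    fast = [sys.executable, "-c", "print('vg0')"]
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(fast, timeout_s=10))
    print(run_with_timeout(slow, timeout_s=1))
```

A larger `timeout_s` lets a slow but healthy command complete, at the cost of a longer wait when the command is genuinely stuck.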

Mon, Jun 3, 3:54 PM · Patch-For-Review, Puppet, Operations

Thu, May 30

jbond created P8574 (An Untitled Masterwork).
Thu, May 30, 1:25 PM
jbond created P8573 (An Untitled Masterwork).
Thu, May 30, 12:31 PM

Tue, May 28

jbond triaged T224477: rhenium [spare] server still receiving flow data as Normal priority.
Tue, May 28, 2:04 PM · Traffic, Operations

Fri, May 24

jbond added a comment to T223835: Configure wikimedia.org to enable *:wikimedia.org Matrix user IDs.

ack, thanks for the clarification

Fri, May 24, 11:51 AM · Patch-For-Review, Traffic, DNS, Wikimedia-Apache-configuration, Operations, Matrix

Thu, May 23

jbond added a comment to T220669: RPKI Validation.

Just watching a RIPE presentation and thought this may be of interest: https://ripe78.ripe.net/archives/video/106

Thu, May 23, 3:42 PM · Operations, netops

Wed, May 22

jbond created P8556 (An Untitled Masterwork).
Wed, May 22, 5:22 PM
jbond added a comment to T223835: Configure wikimedia.org to enable *:wikimedia.org Matrix user IDs.

@Tgr while reviewing the change created by volans I noticed that wikimedia.modular.im. does not currently exist. We should ensure this exists and belongs to the Wikimedia Foundation before adding the authorization. Further, having followed the links, it looks like the hosted service from modular.im would actually be a riot.im box and would have the name wikimedia.riot.im

Wed, May 22, 12:48 PM · Patch-For-Review, Traffic, DNS, Wikimedia-Apache-configuration, Operations, Matrix

Tue, May 21

jbond added a comment to T223938: facter 3: add timeout to custom facts external calls.

I have done a bit of testing with lvm_vgs. Under the hood this command runs vgs -o name --noheadings; if I run this from the command line and time it I get the following

Tue, May 21, 3:38 PM · Patch-For-Review, Puppet, Operations

May 15 2019

jbond created P8532 nmap.
May 15 2019, 5:48 PM
jbond updated subscribers of T144169: Flake8 for python files without extension in puppet repo.

I have created changes for pcc and archive-project-volumes.

May 15 2019, 12:24 PM · cloud-services-team (Kanban), Patch-For-Review, Operations, Operations-Software-Development
jbond added a comment to T144169: Flake8 for python files without extension in puppet repo.

utils/pcc

Left this till last as I'm not sure how/where it's called.

May 15 2019, 10:12 AM · cloud-services-team (Kanban), Patch-For-Review, Operations, Operations-Software-Development

May 14 2019

jbond added a comment to T219803: upgrade facter and puppet across the fleet.

After speaking with Alex it seems that facter2 set ipaddress6 to undef if there were only link-local addresses, whereas in facter3 it has the value of the link-local address (however, more investigation is required).

$ grep -r ipaddress6 ./modules
./modules/base/lib/facter/interface_primary.rb:Facter.add('ipaddress6') do
./modules/base/lib/facter/interface_primary.rb:    # Do not rely on ipaddress6_#{interface_primary}, as its underlying
./modules/interface/manifests/add_ip6_mapped.pp:    $ipv6_address = inline_template("<%= require 'ipaddr'; (IPAddr.new(scope.lookupvar(\"::ipaddress6_${interface}\")).mask(64) | IPAddr.new(@v6_mapped_lower64)).to_s() %>")
./modules/profile/templates/exim/exim4.conf.mailman.erb:        interface = <; <%= @ipaddress %> ; <%= @ipaddress6 %>
./modules/profile/templates/exim/exim4.conf.mailman.erb:        interface = <; <%= @ipaddress %> ; <%= @ipaddress6 %>
./modules/profile/manifests/dnsrecursor.pp:            $facts['ipaddress6'],
./modules/profile/manifests/dnsrecursor.pp:    ::dnsrecursor::monitor { [ $facts['ipaddress'], $facts['ipaddress6'] ]: }
./modules/profile/manifests/pybal.pp:        'bgp-nexthop-ipv6'    => inline_template("<%= require 'ipaddr'; (IPAddr.new(@ipaddress6).mask(64) | IPAddr.new(\"::\" + @ipaddress.gsub('.', ':'))).to_s() %>"),
./modules/profile/manifests/openstack/base/pdns/auth/service.pp:        dns_auth_ipaddress6    => $facts['ipaddress6'],
./modules/ssh/manifests/server.pp:    if $::ipaddress6 == undef {
./modules/ssh/manifests/server.pp:        $aliases = [ $::hostname, $::ipaddress, $::ipaddress6 ]
./modules/standard/spec/default_module_facts.yml:ipaddress6: 2001:db8::42
./modules/calico/templates/initscripts/calico-node.systemd.erb:  -e IP6=<%= @ipaddress6 %> \
./modules/pdns_server/templates/pdns.conf.erb:<% if @dns_auth_ipaddress6 then %>local-ipv6=<%= @dns_auth_ipaddress6 %><% end %>
./modules/pdns_server/manifests/init.pp:# - $dns_auth_ipaddress6:IPv6 address PowerDNS will bind to and send packets from
./modules/pdns_server/manifests/init.pp:    $dns_

Now to check these code points.
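To make the suspected difference concrete, here is a small Python sketch of the facter2-style behaviour as described in the comment: return a value only when a non-link-local IPv6 address exists. It is an illustration built on that (still unconfirmed) description, not facter code; `facter2_style_ipaddress6` is a hypothetical name.

```python
import ipaddress


def facter2_style_ipaddress6(addresses):
    """Return the first non-link-local IPv6 address, or None.

    Models the behaviour described for facter 2 (undef when only
    link-local addresses are present). Illustration only.
    """
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        if ip.version == 6 and not ip.is_link_local:
            return str(ip)
    return None


if __name__ == "__main__":
    # Only a link-local address (fe80::/10) is present.
    print(facter2_style_ipaddress6(["fe80::1"]))
    # A link-local and a documentation-range global address.
    print(facter2_style_ipaddress6(["fe80::1", "2001:db8::42"]))
```

Code such as the `$::ipaddress6 == undef` test in modules/ssh/manifests/server.pp would take a different branch if the fact starts carrying a link-local address instead of being undefined, which is why those call sites are worth reviewing.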

May 14 2019, 8:59 PM · Patch-For-Review, Packaging, Puppet, Operations
jbond updated subscribers of T144169: Flake8 for python files without extension in puppet repo.

As I was on clinic duty last week I decided to have a crack at doing the manual work to rename everything in the repo so that the files can be checked by CI. Thanks to all the reviewers I have mostly got that complete, with only two changes still awaiting review. I have also created a change to add a CI check to ensure Python files have the correct extension. It would be good to get the latter committed ASAP to prevent more files from being added.
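A check of this kind could look roughly like the following Python sketch, which walks a tree and reports files that start with a Python shebang but lack a .py extension. This is an assumption about the general approach, not the actual CI change; `find_unsuffixed_python` is a hypothetical name and the shebang heuristic is deliberately simple.

```python
import os
import re

# A first line such as "#!/usr/bin/env python3" or "#!/usr/bin/python".
PYTHON_SHEBANG = re.compile(rb"^#!.*\bpython")


def find_unsuffixed_python(root):
    """Return sorted paths under root that look like Python but lack '.py'."""
    offenders = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git metadata in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    first_line = fh.readline(200)
            except OSError:
                continue  # unreadable file: ignore in this sketch
            if PYTHON_SHEBANG.match(first_line):
                offenders.append(path)
    return sorted(offenders)


if __name__ == "__main__":
    import sys

    bad = find_unsuffixed_python(sys.argv[1] if len(sys.argv) > 1 else ".")
    for path in bad:
        print(f"missing .py extension: {path}")
    sys.exit(1 if bad else 0)
```

Exiting non-zero when offenders exist is what lets a CI job fail the change.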

May 14 2019, 6:42 PM · cloud-services-team (Kanban), Patch-For-Review, Operations, Operations-Software-Development
jbond closed T223264: Degraded RAID on es2004 as Invalid.

Caused by https://gerrit.wikimedia.org/r/c/operations/puppet/+/508855 which has now been rolled back

May 14 2019, 6:25 PM · Operations, ops-codfw
jbond closed T223246: Degraded RAID on kafka1014 as Invalid.

Caused by https://gerrit.wikimedia.org/r/c/operations/puppet/+/508855 which has now been rolled back

May 14 2019, 6:25 PM · ops-eqiad, Operations
jbond closed T223249: Degraded RAID on backup2001 as Resolved.

Caused by https://gerrit.wikimedia.org/r/c/operations/puppet/+/508855 which has now been rolled back

May 14 2019, 6:24 PM · Operations, ops-codfw
jbond closed T223250: Degraded RAID on kafka1012 as Resolved.

Caused by https://gerrit.wikimedia.org/r/c/operations/puppet/+/508855 which has now been rolled back

May 14 2019, 6:24 PM · ops-eqiad, Operations
jbond closed T223251: Degraded RAID on ms-be1043 as Resolved.

Caused by https://gerrit.wikimedia.org/r/c/operations/puppet/+/508855 which has now been rolled back

May 14 2019, 6:24 PM · ops-eqiad, Operations
jbond closed T223253: Degraded RAID on tungsten as Resolved.
May 14 2019, 6:24 PM · ops-eqiad, Operations
jbond added a comment to T223253: Degraded RAID on tungsten.

Caused by https://gerrit.wikimedia.org/r/c/operations/puppet/+/508855 which has now been rolled back

May 14 2019, 6:23 PM · ops-eqiad, Operations
jbond closed T223257: Degraded RAID on es2002 as Resolved.

Caused by https://gerrit.wikimedia.org/r/c/operations/puppet/+/508855 which has now been rolled back

May 14 2019, 6:23 PM · Operations, ops-codfw
jbond added a comment to T223276: keyholder has just disarmed everywhere (train blocker).

Confirmed, that is responsible for the failure; the other re-arms were done by me.

May 14 2019, 1:58 PM · Operations
jbond updated subscribers of T218544: ms-be1043 sdk failed.

The change above failed as the physicalDrive regex does not take into account the number of drives per span. However, further investigation shows we have this discrepancy on other servers.

May 14 2019, 11:55 AM · observability, Operations-Software-Development, Operations, ops-eqiad
jbond closed T220987: Ferm: send ferm/iptables/ulogd logs to Kafaka/logstash/elasticsearch as Resolved.

Logs are now being sent to Kafka; however, we still need to roll out the profile::firewall::logging module to all infrastructure, which is being tracked in T116011.

May 14 2019, 9:14 AM · Patch-For-Review, Wikimedia-Logstash, Security, Operations
jbond closed T220987: Ferm: send ferm/iptables/ulogd logs to Kafaka/logstash/elasticsearch, a subtask of T116011: ferm: Log dropped packets, as Resolved.
May 14 2019, 9:13 AM · Patch-For-Review, Operations

May 10 2019

jbond closed T222864: Requesting access to deployment for Cormac Parle as Resolved.
May 10 2019, 4:21 PM · Patch-For-Review, SRE-Access-Requests, Operations
jbond added a comment to T222864: Requesting access to deployment for Cormac Parle.

This should be complete now. Please allow up to 30 minutes for puppet to roll out the change; if there are any issues after that, please re-open the ticket.

May 10 2019, 4:21 PM · Patch-For-Review, SRE-Access-Requests, Operations
jbond closed T222689: Upload zuul_2.5.1-wmf9 to apt.wikimedia.org as Resolved.

The latest package has been uploaded; re-open if there are further problems.

May 10 2019, 2:57 PM · Patch-For-Review, Operations, Wikimedia-Incident, Zuul, Continuous-Integration-Infrastructure
jbond closed T222689: Upload zuul_2.5.1-wmf9 to apt.wikimedia.org, a subtask of T105474: 'recheck' on a CR+2 patch should trigger gate-and-submit, not test, as Resolved.
May 10 2019, 2:57 PM · Release-Engineering-Team-TODO, Upstream, Zuul, Patch-For-Review, Continuous-Integration-Config
jbond triaged T222950: cloudvirt1006 - RAID battery failed as Normal priority.
May 10 2019, 2:10 PM · cloud-services-team, ops-eqiad, Operations
jbond created T222950: cloudvirt1006 - RAID battery failed.
May 10 2019, 12:58 PM · cloud-services-team, ops-eqiad, Operations
jbond added a comment to T183177: memory errors not showing in icinga.

The correctable errors check has been deployed and it is yielding some results already. Myself and @herron took a look at the list of hosts and there seem to be a few different "classes" or "states":

  1. high count of CEs and recent kernel messages
  2. low count of CEs and no recent kernel messages

The course of action is to file tasks for class #1 to diagnose memory, and to reset edac counters (i.e. reload the edac kernel modules) for class #2 to probe for recurrences.
May 10 2019, 12:23 PM · Traffic, Patch-For-Review, User-fgiunchedi, DC-Ops, Operations, observability
jbond created P8509 (An Untitled Masterwork).
May 10 2019, 11:12 AM
jbond updated subscribers of T222910: Requesting access to deployment and analytics-privatedata-users for jfishback.

@greg can you please approve jfishback's addition to the deployment group?

May 10 2019, 10:46 AM · User-greg, SRE-Access-Requests, Operations, Security-Team
jbond triaged T222900: Separate Wikitech cronjobs from production as Normal priority.
May 10 2019, 10:35 AM · Operations, serviceops

May 9 2019

jbond added a comment to T213769: Zero VCL removal.

Ok great thanks for the update

May 9 2019, 3:53 PM · Patch-For-Review, Zero, Traffic, Operations
jbond added a comment to T213769: Zero VCL removal.

I guess the removal needs merging?

May 9 2019, 3:25 PM · Patch-For-Review, Zero, Traffic, Operations
jbond added a comment to T213769: Zero VCL removal.

Is this ticket complete and can it be closed? If not, what further actions are required?

May 9 2019, 3:07 PM · Patch-For-Review, Zero, Traffic, Operations
jbond added a comment to T219854: Broken disk on ms-be2026.

So the dsa-check-hpssacli check is happily returning 0 exit code and this output:

OK: Slot 0: no logical drives --- Slot 0: no drives

Given that IIRC we add the HP raid check only on the hosts that have it, we might consider patching this imported script to fail in the case where there is a controller but it has no drives configured (both no logical and no physical?)

Agreed, seems like a sensible thing to do, also upstream would be interested I think.
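The proposed behaviour can be sketched in Python as follows. This is an illustration of the idea only, not a patch to dsa-check-hpssacli: `check_no_drives` is a hypothetical function, the parsing assumes the `---`-separated status line quoted above, and it follows the tentative "both no logical and no physical" condition from the comment.

```python
# Standard Nagios/Icinga plugin exit codes.
OK, CRITICAL = 0, 2


def check_no_drives(status_line):
    """Return (exit_code, message) for a check status line.

    Assumption: segments are separated by '---' and a controller with
    nothing configured reports both 'no logical drives' and 'no drives'.
    """
    segments = [s.strip() for s in status_line.split("---")]
    no_logical = any(s.endswith("no logical drives") for s in segments)
    no_physical = any(s.endswith("no drives") for s in segments)
    if no_logical and no_physical:
        return CRITICAL, "CRITICAL: controller present but no drives configured"
    return OK, status_line


if __name__ == "__main__":
    line = "OK: Slot 0: no logical drives --- Slot 0: no drives"
    code, message = check_no_drives(line)
    print(code, message)
```

Returning a CRITICAL exit code is what would make the monitoring system raise an alert instead of silently reporting OK.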

May 9 2019, 2:41 PM · Patch-For-Review, Operations, ops-codfw
jbond updated subscribers of T222864: Requesting access to deployment for Cormac Parle.

@greg this is essentially a request to add cparle to the deployment group. As the manager of Release Engineering, are you the relevant person to approve this access request? Thanks

May 9 2019, 2:08 PM · Patch-For-Review, SRE-Access-Requests, Operations
jbond closed T222794: Please create a private mailing list traffic-anomaly-report as Resolved.
May 9 2019, 2:05 PM · Operations, Wikimedia-Mailing-lists
jbond added a comment to T222794: Please create a private mailing list traffic-anomaly-report.

I have created the list and you should have received the admin password. I have set some of the privacy options but the majority of settings have been left at their default values, so please check all config options to ensure the list is configured as desired. Please also add a description and update the mailing list page. Let me know if there are any issues.

May 9 2019, 2:04 PM · Operations, Wikimedia-Mailing-lists
jbond triaged T222864: Requesting access to deployment for Cormac Parle as Normal priority.
May 9 2019, 1:37 PM · Patch-For-Review, SRE-Access-Requests, Operations
jbond triaged T222788: Request to be added to the ldap/wmde group as Normal priority.
May 9 2019, 1:37 PM · Patch-For-Review, WMF-Legal, LDAP-Access-Requests, Operations, WMF-NDA-Requests
jbond added a comment to T222788: Request to be added to the ldap/wmde group.

Hi @RStallman-legalteam can you confirm NDA status please

May 9 2019, 1:37 PM · Patch-For-Review, WMF-Legal, LDAP-Access-Requests, Operations, WMF-NDA-Requests
jbond added a comment to T219107: Disable list subscription via email also for listname-subscribe@.

Can this task be made public (via "Edit Task > Visible To")?

done

May 9 2019, 1:29 PM · Security, Patch-For-Review, Operations, Wikimedia-Mailing-lists
jbond removed a project from T219107: Disable list subscription via email also for listname-subscribe@: Security.
May 9 2019, 1:28 PM · Security, Patch-For-Review, Operations, Wikimedia-Mailing-lists
jbond triaged T222879: update puppetdb and puppet-master packages to be compatible with puppet5 as Normal priority.
May 9 2019, 12:37 PM · Packaging, Operations, Puppet

May 8 2019

jbond added a comment to T218544: ms-be1043 sdk failed.

@fgiunchedi agree that this is a new issue, and we need to fix two different scripts to have an automatic task created for this:

  1. The check_raid script: the current check_raid script is not alarming in this case, and without an alarm the raid handler will never be called:
May 8 2019, 5:49 PM · observability, Operations-Software-Development, Operations, ops-eqiad
jbond added a comment to T222356: facter3: Unable to parse routing table.

correct pull request https://github.com/puppetlabs/facter/pull/1775

May 8 2019, 4:38 PM · Packaging, Puppet, Operations
jbond added a comment to T222356: facter3: Unable to parse routing table.
May 8 2019, 4:37 PM · Packaging, Puppet, Operations
jbond closed T219107: Disable list subscription via email also for listname-subscribe@ as Resolved.

@Aklapper thanks for the explanation, I hadn't realised that restriction.

May 8 2019, 3:01 PM · Security, Patch-For-Review, Operations, Wikimedia-Mailing-lists
jbond added a comment to T219107: Disable list subscription via email also for listname-subscribe@.

I have created the following patch; not sure why it wasn't auto-tagged here:
https://gerrit.wikimedia.org/r/c/operations/puppet/+/508820

May 8 2019, 1:18 PM · Security, Patch-For-Review, Operations, Wikimedia-Mailing-lists
jbond added a comment to T219933: parsoid-vd on scandium randomly died.

Did this issue get fixed? I just checked scandium and parsoid-vd has been running for 3 weeks.

May 8 2019, 12:48 PM · Patch-For-Review, Operations
jbond added a comment to T221288: Phabricator SPF record contains internal addressing for phab[12]001.

Is there any further action for this ticket or can we close it?

May 8 2019, 12:30 PM · Patch-For-Review, Traffic, Operations, DNS, Mail
jbond added a comment to T221784: Puppet failing without Icinga alert in case of dependency cycle.

Are there any more actions required here or can we close this ticket?

May 8 2019, 12:14 PM · Puppet, Icinga, observability, Operations
jbond closed T222198: Gmail - Multiple destination domains per transaction is unsupported. Please try again. as Resolved.

Resolving as this looks fixed. I have checked the logs and the last error I see relating to this is from 2019-05-01 20:51:17, so I'm pretty sure it's fixed, but please reopen if I'm wrong.

May 8 2019, 11:51 AM · Patch-For-Review, Mail, Operations
jbond triaged T222755: #wikimedia-sre is missing stashbot as Normal priority.
May 8 2019, 9:21 AM · Stashbot, Operations

May 7 2019

Dzahn awarded T222326: cronspam from smart-data-dump due to facter bug a Like token.
May 7 2019, 5:12 PM · Operations
jbond added a comment to T221529: Frequent puppet failures .

In T221529#5144319, @crusnov wrote:
Has anyone checked if the 5xx errors happen to coincide with puppet-merge happening?

May 7 2019, 2:55 PM · Puppet, puppet-compiler, Operations
jbond added a comment to T222356: facter3: Unable to parse routing table.

I think we should patch facter once your PR is reviewed/merged upstream to address this for good. But I think it's fine to proceed with the facter rollout given that this is harmless log spam and only affects ~ 10% of our servers (mc and cp hosts).

May 7 2019, 11:10 AM · Packaging, Puppet, Operations
jbond closed T222326: cronspam from smart-data-dump due to facter bug as Resolved.

Resolving this and will track the root problem in https://phabricator.wikimedia.org/T222356

May 7 2019, 10:28 AM · Operations
jbond closed T222326: cronspam from smart-data-dump due to facter bug, a subtask of T219803: upgrade facter and puppet across the fleet, as Resolved.
May 7 2019, 10:28 AM · Patch-For-Review, Packaging, Puppet, Operations
jbond added a comment to T219803: upgrade facter and puppet across the fleet.

FYI the upgrade seems to be generating cronspam, in the form of facter warnings:

Subject: Cron <root@cp5001> /usr/local/sbin/smart-data-dump --syslog --outfile /var/lib/prometheus/node.d/device_smart.prom
2019-04-30 16:34:02.691553 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.64.0.131 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691678 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.64.0.133 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691716 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.64.16.23 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691750 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.64.16.25 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691784 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.64.32.68 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691822 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.64.32.70 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691857 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.64.48.102 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691890 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.64.48.104 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691929 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.0.123 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691963 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.0.126 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.691996 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.16.134 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692029 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.16.137 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692062 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.32.113 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692095 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.32.116 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692128 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.32.117 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692161 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.48.24 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692195 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.48.26 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692227 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.48.28 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692260 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.48.29 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.692293 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '10.192.48.30 via 10.132.0.1 dev enp5s0f0  mtu lock 1450'
2019-04-30 16:34:02.693333 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:101:10:192:0:123 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693388 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:101:10:192:0:126 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693431 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:102:10:192:16:134 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693469 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:102:10:192:16:137 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693505 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:103:10:192:32:113 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693541 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:103:10:192:32:116 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693576 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:103:10:192:32:117 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693612 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:104:10:192:48:24 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693648 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:104:10:192:48:26 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693684 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:104:10:192:48:28 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693720 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:104:10:192:48:29 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693756 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:860:104:10:192:48:30 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693791 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:861:101:10:64:0:131 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693827 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:861:101:10:64:0:133 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693863 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:861:102:10:64:16:23 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693899 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:861:102:10:64:16:25 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693941 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:861:103:10:64:32:68 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.693978 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:861:103:10:64:32:70 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.694014 WARN  puppetlabs.facter - Could not process routing table entry: Expected a destination followed by key/value pairs, got '2620:0:861:107:10:64:48:102 via fe80::66c3:d602:8bc:c7f1 dev enp5s0f0 metric 1024  mtu lock 1450 pref medium'
2019-04-30 16:34:02.694049 WARN  puppetlabs.facter - Could not process routing table entry: Expected a
May 7 2019, 10:26 AM · Patch-For-Review, Packaging, Puppet, Operations
jbond closed T222689: Upload zuul_2.5.1-wmf9 to apt.wikimedia.org, a subtask of T105474: 'recheck' on a CR+2 patch should trigger gate-and-submit, not test, as Resolved.
May 7 2019, 10:11 AM · Release-Engineering-Team-TODO, Upstream, Zuul, Patch-For-Review, Continuous-Integration-Config
jbond closed T222689: Upload zuul_2.5.1-wmf9 to apt.wikimedia.org as Resolved.
May 7 2019, 10:11 AM · Patch-For-Review, Operations, Wikimedia-Incident, Zuul, Continuous-Integration-Infrastructure
jbond added a comment to T222689: Upload zuul_2.5.1-wmf9 to apt.wikimedia.org.

This has been uploaded; let me know if there are any issues.

May 7 2019, 10:11 AM · Patch-For-Review, Operations, Wikimedia-Incident, Zuul, Continuous-Integration-Infrastructure

May 3 2019

jbond triaged T222443: cron-spam: /usr/local/sbin/check-cumin-aliases as Normal priority.
May 3 2019, 11:50 AM · Patch-For-Review, Operations
jbond closed T220380: Upload Zuul 2.5.1-wmf7 package to apt.wikimedia.org as Resolved.

I have updated the new package and removed the thirdparty package. Let me know if you see any issues.

May 3 2019, 10:44 AM · Continuous-Integration-Infrastructure, Operations

May 2 2019

jbond added a comment to T222356: facter3: Unable to parse routing table.

The lock is needed to work around a bug with the kernel/ipsec.
See https://gerrit.wikimedia.org/r/c/operations/puppet/+/437784 and https://phabricator.wikimedia.org/T195365

May 2 2019, 3:26 PM · Packaging, Puppet, Operations
jbond updated subscribers of T222356: facter3: Unable to parse routing table.

This is due to a bug in facter, fundamentally caused because there are an even number of words in the output of ip route show. I have proposed a fix upstream and I believe the warning is harmless (in fact my PR removes the warning altogether). If we want to work around this issue further we have a number of options:

  • patch our facter package (@MoritzMuehlenhoff opinion?)
  • update modules/interface/manifests/route.pp to remove the lock keyword. lock stops the kernel from updating the MTU via PMTU discovery. I'm unsure if that may cause issues; @ayounsi could likely add more information here?
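The "even number of words" point can be shown with a short Python sketch. It assumes, based only on the warning text ("Expected a destination followed by key/value pairs"), that the parser wants one destination followed by complete key/value pairs; it is not facter's actual code. `parse_route` and `parse_route_tolerant` are hypothetical names, and dropping `lock` as a bare flag is just one possible approach for illustration, not a description of the upstream PR.

```python
def parse_route(line):
    """Parse 'DEST key value key value ...' into (dest, attrs).

    Assumed strict behaviour: every token after the destination must
    belong to a key/value pair, so the total word count must be odd.
    """
    tokens = line.split()
    dest, rest = tokens[0], tokens[1:]
    if len(rest) % 2 != 0:
        raise ValueError(
            "Expected a destination followed by key/value pairs, "
            f"got '{line}'"
        )
    return dest, dict(zip(rest[0::2], rest[1::2]))


def parse_route_tolerant(line, flags=("lock",)):
    """Hypothetical variant: drop bare flag words before pairing."""
    tokens = [t for t in line.split() if t not in flags]
    return parse_route(" ".join(tokens))


if __name__ == "__main__":
    line = "10.64.0.131 via 10.132.0.1 dev enp5s0f0  mtu lock 1450"
    try:
        parse_route(line)
    except ValueError as err:
        print(f"WARN {err}")
    print(parse_route_tolerant(line))
```

With `lock` present the line has eight words, so one token after the destination is left without a partner; removing the bare flag restores the pairing.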
May 2 2019, 2:27 PM · Packaging, Puppet, Operations
jbond added a comment to T222326: cronspam from smart-data-dump due to facter bug.

I have added a plaster to smart-data-dump to stop the spam and will investigate the underlying issues further via T222326

May 2 2019, 10:25 AM · Operations
jbond triaged T222356: facter3: Unable to parse routing table as Normal priority.
May 2 2019, 10:25 AM · Packaging, Puppet, Operations
jbond added a subtask for T219803: upgrade facter and puppet across the fleet: T222326: cronspam from smart-data-dump due to facter bug.
May 2 2019, 10:19 AM · Patch-For-Review, Packaging, Puppet, Operations
jbond added a parent task for T222326: cronspam from smart-data-dump due to facter bug: T219803: upgrade facter and puppet across the fleet.
May 2 2019, 10:19 AM · Operations

Apr 30 2019

jbond edited P8459 deployment plan.
Apr 30 2019, 2:22 PM
jbond edited P8459 deployment plan.
Apr 30 2019, 2:22 PM
jbond edited P8459 deployment plan.
Apr 30 2019, 2:17 PM
jbond edited P8459 deployment plan.
Apr 30 2019, 2:13 PM
jbond edited P8459 deployment plan.
Apr 30 2019, 2:12 PM
jbond edited P8459 deployment plan.
Apr 30 2019, 2:11 PM
jbond created P8459 deployment plan.
Apr 30 2019, 2:01 PM
jbond triaged T222160: facter3: use structured facts as Normal priority.
Apr 30 2019, 11:21 AM · Puppet, Operations
jbond added a comment to T219803: upgrade facter and puppet across the fleet.

One thing that will need to be fixed is the detection of HP machines to install 'hp-health' in modules/base/manifests/standard_packages.pp:L141, unfortunately it seems 'dmi' isn't in 2.4 yet, so this will also need to be conditional on facter 2/3.

Apr 30 2019, 11:20 AM · Patch-For-Review, Packaging, Puppet, Operations
jbond created P8458 latest.
Apr 30 2019, 10:57 AM

Apr 29 2019

jbond added a comment to T221529: Frequent puppet failures .

I have not had time to look at this in depth yet; however, I did just notice an issue while applying a refactor change[1]

Apr 29 2019, 3:25 PM · Puppet, puppet-compiler, Operations
jbond added a comment to T130883: decom cp3011-22 (12 machines).

I intend to get to this in a second, but in case I forget, the following DNS entries need cleaning up

Apr 29 2019, 12:23 PM · Patch-For-Review, decommission, ops-esams, Operations

Apr 26 2019

jbond closed T222000: test please ignore as Invalid.
Apr 26 2019, 9:21 PM
jbond created T222000: test please ignore.
Apr 26 2019, 9:21 PM
jbond added a comment to T116011: ferm: Log dropped packets.

Looking at cumin1001 I noticed that the log prefix at the end of the input chain is "fw-out-drop" and the output chain is empty with an accept policy. Is "out" indeed the direction in this case? Or would dropped packets logged by the input chain be considered "in"?

Apr 26 2019, 10:25 AM · Patch-For-Review, Operations

Apr 23 2019

jbond created P8430 (An Untitled Masterwork).
Apr 23 2019, 5:06 PM