Vgutierrez (Valentín Gutiérrez)
Traffic Security Engineer

User Details

User Since: Feb 12 2018, 9:51 AM (70 w, 1 d)
Availability: Available
IRC Nick: vgutierrez
LDAP User: Vgutierrez
MediaWiki User: Unknown

Recent Activity

Yesterday

Vgutierrez added a comment to T224977: puppet-catalog-compiler: compilation result randomly places servers in the 'failed' section.

After checking the change error/warning logs at https://puppet-compiler.wmflabs.org/compiler1001/16855/ for hosts marked as "fail to compile when the change is applied", it looks like two warnings are being interpreted as errors:

Warning: Unknown variable: '::restricted_to'. at /srv/jenkins-workspace/puppet-compiler/16855/change/src/modules/profile/manifests/ldap/client/labs.pp:5:72
Warning: Unknown variable: '::restricted_from'. at /srv/jenkins-workspace/puppet-compiler/16855/change/src/modules/profile/manifests/ldap/client/labs.pp:6:76
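
One way to sanity-check how such lines ought to be bucketed is to group them by their leading severity word. This is a minimal sketch and not the puppet-compiler's actual logic: the helper name `split_by_severity` is hypothetical, and it assumes each relevant line starts with `Warning:` or `Error:`.

```python
from collections import defaultdict


def split_by_severity(log_lines):
    """Group log lines by their leading severity word (hypothetical helper)."""
    buckets = defaultdict(list)
    for line in log_lines:
        severity, sep, _ = line.partition(":")
        # Only accept the prefix as a severity when it is one of the expected words.
        if sep and severity in ("Warning", "Error"):
            buckets[severity].append(line)
        else:
            buckets["Other"].append(line)
    return dict(buckets)


# The two lines quoted above, used as sample input.
lines = [
    "Warning: Unknown variable: '::restricted_to'. at /srv/jenkins-workspace/"
    "puppet-compiler/16855/change/src/modules/profile/manifests/ldap/client/labs.pp:5:72",
    "Warning: Unknown variable: '::restricted_from'. at /srv/jenkins-workspace/"
    "puppet-compiler/16855/change/src/modules/profile/manifests/ldap/client/labs.pp:6:76",
]
result = split_by_severity(lines)
```

With this input both lines land in the `Warning` bucket and no `Error` bucket is created, which is the outcome a host would need in order not to be reported as failed.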
Tue, Jun 18, 11:58 AM · Operations, puppet-compiler
Vgutierrez committed rOSACb001cf023eaa: acme_chief: Enforce staging time validation (authored by Vgutierrez).
acme_chief: Enforce staging time validation
Tue, Jun 18, 10:20 AM
Vgutierrez committed rOSAC955e6d27dc40: acme_chief: Enforce staging time validation (authored by Vgutierrez).
acme_chief: Enforce staging time validation
Tue, Jun 18, 8:40 AM

Mon, Jun 17

Vgutierrez moved T225945: acme-chief staging time not working as expected from Triage to TLS on the Traffic board.
Mon, Jun 17, 3:46 PM · Patch-For-Review, Operations, Traffic, Acme-chief
Restricted Application added a project to T225945: acme-chief staging time not working as expected: Operations.
Mon, Jun 17, 3:45 PM · Patch-For-Review, Operations, Traffic, Acme-chief
Vgutierrez triaged T225945: acme-chief staging time not working as expected as High priority.
Mon, Jun 17, 3:44 PM · Patch-For-Review, Operations, Traffic, Acme-chief
Vgutierrez created T225945: acme-chief staging time not working as expected.
Mon, Jun 17, 3:44 PM · Patch-For-Review, Operations, Traffic, Acme-chief

Wed, Jun 12

Vgutierrez committed rOSACb041f762848d: x509: Expose the OCSP URI of a Certificate as a property (authored by Vgutierrez).
x509: Expose the OCSP URI of a Certificate as a property
Wed, Jun 12, 10:40 AM

Tue, Jun 11

Vgutierrez added a comment to T225484: cloudvirt servers: SSL certificate expiring.

I don't think we have any automation in place for internally issued certificates, and of course we cannot switch to LE for client certificates so acme-chief is not an option here.

Tue, Jun 11, 9:42 AM · cloud-services-team (Kanban)

Fri, Jun 7

Vgutierrez created P8599 (An Untitled Masterwork).
Fri, Jun 7, 1:17 PM
Vgutierrez archived P8598 (An Untitled Masterwork).
Fri, Jun 7, 1:17 PM
Vgutierrez created P8598 (An Untitled Masterwork).
Fri, Jun 7, 1:13 PM

Wed, Jun 5

Vgutierrez triaged T225096: Provide acme-chief/TLS SNI list support in compile_redirects() as Normal priority.
Wed, Jun 5, 3:14 PM · Patch-For-Review, HTTPS, Traffic, Operations
Vgutierrez created T225096: Provide acme-chief/TLS SNI list support in compile_redirects().
Wed, Jun 5, 1:56 PM · Patch-For-Review, HTTPS, Traffic, Operations
Vgutierrez closed T224428: ATS: traffic_layout currently forces to use its own copy of shared libraries as Resolved.
Wed, Jun 5, 8:30 AM · Traffic, Operations

Tue, Jun 4

Vgutierrez closed T220518: acme-chief: Validate that configured certificates can be actually issued, a subtask of T133548: Create a secure redirect service for large count of non-canonical / junk domains, as Resolved.
Tue, Jun 4, 12:39 PM · Goal, Patch-For-Review, HTTPS, Operations, Traffic
Vgutierrez closed T220518: acme-chief: Validate that configured certificates can be actually issued as Resolved.
Tue, Jun 4, 12:39 PM · Acme-chief, HTTPS, Traffic, Operations

Wed, May 29

Vgutierrez triaged T224539: Provide nginx support in compile_redirects() as Normal priority.
Wed, May 29, 7:33 AM · Patch-For-Review, Traffic, Operations
Vgutierrez created T224539: Provide nginx support in compile_redirects().
Wed, May 29, 7:33 AM · Patch-For-Review, Traffic, Operations

Tue, May 28

Vgutierrez committed rOSACaad632720b13: debian: Add release 0.17 to changelog (authored by Vgutierrez).
debian: Add release 0.17 to changelog
Tue, May 28, 11:09 AM
Vgutierrez created P8566 (An Untitled Masterwork).
Tue, May 28, 8:56 AM

Mon, May 27

Vgutierrez moved T224428: ATS: traffic_layout currently forces to use its own copy of shared libraries from Triage to Caching on the Traffic board.
Mon, May 27, 2:36 PM · Traffic, Operations
Vgutierrez triaged T224428: ATS: traffic_layout currently forces to use its own copy of shared libraries as Normal priority.
Mon, May 27, 2:36 PM · Traffic, Operations
Restricted Application added a project to T224428: ATS: traffic_layout currently forces to use its own copy of shared libraries: Operations.
Mon, May 27, 2:36 PM · Traffic, Operations
Vgutierrez created P8561 (An Untitled Masterwork).
Mon, May 27, 9:32 AM
Vgutierrez moved T224397: ATS: log mode cannot depend on log filters being configured from Triage to Caching on the Traffic board.
Mon, May 27, 7:50 AM · Traffic, Operations
Vgutierrez triaged T224397: ATS: log mode cannot depend on log filters being configured as Normal priority.
Mon, May 27, 7:49 AM · Traffic, Operations
Restricted Application added a project to T224397: ATS: log mode cannot depend on log filters being configured: Operations.
Mon, May 27, 7:48 AM · Traffic, Operations

Thu, May 23

Vgutierrez added a comment to T223902: cloudcontrol: decide on FQDN for service endpoints.

Right. That LDAP service certificate is being handled by acme-chief and, as Alex explained, the *.wikimedia.org limitation only affects services that need to use HTTPS caching.

Thu, May 23, 8:08 PM · Operations, Traffic, Cloud-VPS, cloud-services-team (Kanban)
Vgutierrez added a comment to T223902: cloudcontrol: decide on FQDN for service endpoints.

That's right. Also take into account that you can get as many certificates as you need from acme-chief, so maybe you don't need the wildcard one.

Thu, May 23, 5:13 PM · Operations, Traffic, Cloud-VPS, cloud-services-team (Kanban)

Wed, May 22

Vgutierrez moved T224119: ATS is currently adding its own server header from Triage to Caching on the Traffic board.
Wed, May 22, 1:36 PM · Operations, Traffic
Vgutierrez triaged T224119: ATS is currently adding its own server header as Normal priority.
Wed, May 22, 1:36 PM · Operations, Traffic
Vgutierrez created T224119: ATS is currently adding its own server header.
Wed, May 22, 1:36 PM · Operations, Traffic
Vgutierrez added a comment to T223902: cloudcontrol: decide on FQDN for service endpoints.

IMHO you should move away from *.wikimedia.org then and use another domain

Wed, May 22, 9:55 AM · Operations, Traffic, Cloud-VPS, cloud-services-team (Kanban)
Vgutierrez added a comment to T223902: cloudcontrol: decide on FQDN for service endpoints.

so, after a quick check you should consider several things:

  • wikimedia.org is a canonical domain for WMF, everything is expected to use secure TLS settings.
  • if you aim to use the production caching layer, the hostnames must match *.wikimedia.org
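
The second constraint can be checked mechanically. The sketch below assumes the usual TLS wildcard rule, in which `*` stands for exactly one leftmost label; the exact rule enforced by the caching layer is not stated in this comment. The function name and the sample hostnames are hypothetical.

```python
def matches_wildcard(hostname, pattern="*.wikimedia.org"):
    """Return True if hostname matches a single-label wildcard pattern."""
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().split(".")
    # A single-label wildcard never spans more than one label.
    if len(host_labels) != len(pattern_labels):
        return False
    # The pattern must start with "*" and the matched label must be non-empty.
    if pattern_labels[0] != "*" or not host_labels[0]:
        return False
    return host_labels[1:] == pattern_labels[1:]
```

Under that assumption `matches_wildcard("example.wikimedia.org")` is true, while `a.b.wikimedia.org` and the bare `wikimedia.org` are both rejected.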
Wed, May 22, 9:26 AM · Operations, Traffic, Cloud-VPS, cloud-services-team (Kanban)

May 18 2019

Vgutierrez added a comment to T184293: rack/setup/install lvs101[3-6].
host     nic       mac
lvs1013  enp4s0f0  F4:E9:D4:DB:0C:00
lvs1013  enp4s0f1  F4:E9:D4:DB:0C:02
lvs1013  enp5s0f0  F4:E9:D4:CF:40:D0
lvs1013  enp5s0f1  F4:E9:D4:CF:40:D2
lvs1014  enp4s0f0  F4:E9:D4:DB:27:40
lvs1014  enp4s0f1  F4:E9:D4:DB:27:42
lvs1014  enp5s0f0  F4:E9:D4:C8:88:F0
lvs1014  enp5s0f1  F4:E9:D4:C8:88:F2
May 18 2019, 9:29 PM · ops-eqiad, Operations, Traffic

May 16 2019

Vgutierrez added a comment to T223408: Page gets redirected randomly to former blackout page.

This issue can be reproduced by searching for "lliga de campions 2017" on Google using a mobile browser; the first result pointing to ca.wikipedia.org is https://ca.m.wikipedia.org/wiki/Viquip%C3%A8dia:Comunicat_24_de_mar%C3%A7

May 16 2019, 6:58 AM · Readers-Web-Backlog, Performance-Team (Radar), Wikimedia-Incident

May 14 2019

Dzahn awarded T131930: Set SPF (... -all) for toolserver.org a Yellow Medal token.
May 14 2019, 1:04 AM · cloud-services-team (Kanban), Traffic, Mail, Cloud-VPS, Patch-For-Review, Operations, DNS

May 13 2019

Vgutierrez closed T209707: tagged_interface sometimes exceeds IFNAMSIZ as Resolved.
May 13 2019, 3:31 PM · Traffic, Operations
Vgutierrez closed T209707: tagged_interface sometimes exceeds IFNAMSIZ, a subtask of T216724: relocate/reimage cloudvirt1024 with 10G interfaces, as Resolved.
May 13 2019, 3:31 PM · Patch-For-Review, Operations, cloud-services-team (Kanban)
Vgutierrez created P8519 (An Untitled Masterwork).
May 13 2019, 10:23 AM
Vgutierrez changed the status of T220786: Add SPF record for non-canonical domains that are not parked from Open to Stalled.
May 13 2019, 7:26 AM · Patch-For-Review, Operations, Traffic, DNS
Vgutierrez closed T131930: Set SPF (... -all) for toolserver.org as Resolved.
May 13 2019, 7:22 AM · cloud-services-team (Kanban), Traffic, Mail, Cloud-VPS, Patch-For-Review, Operations, DNS
Vgutierrez closed T131930: Set SPF (... -all) for toolserver.org, a subtask of T220786: Add SPF record for non-canonical domains that are not parked, as Resolved.
May 13 2019, 7:22 AM · Patch-For-Review, Operations, Traffic, DNS

May 7 2019

Vgutierrez added a comment to T209707: tagged_interface sometimes exceeds IFNAMSIZ.

so taking a deeper look into https://manpages.debian.org/jessie/vlan/vlan-interfaces.5.en.html:

vlan-raw-device devicename
Indicates the device to create the vlan on. This is ignored when the devicename is part of the vlan interface name.
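
To illustrate why this matters for the IFNAMSIZ problem: Linux interface names are limited to IFNAMSIZ (16) bytes including the trailing NUL, so at most 15 visible characters. The sketch below picks the dotted `device.id` form when it fits and otherwise falls back to a `vlanNNNN` name, which is the case where `vlan-raw-device` has to name the parent device. The function name and the VLAN ID 1234 are hypothetical and not taken from the task.

```python
IFNAMSIZ = 16  # Linux limit, including the trailing NUL byte
MAX_IFNAME_LEN = IFNAMSIZ - 1


def vlan_interface_name(raw_device, vlan_id):
    """Pick a VLAN interface name that fits within IFNAMSIZ (hypothetical helper)."""
    dotted = f"{raw_device}.{vlan_id}"
    if len(dotted) <= MAX_IFNAME_LEN:
        # The raw device is part of the name, so vlan-raw-device would be ignored.
        return dotted
    # Too long: use the short form and rely on vlan-raw-device for the parent.
    return f"vlan{vlan_id}"


short_name = vlan_interface_name("enp175s0f1d1", 1234)
```

Here `enp175s0f1d1.1234` would be 17 characters, so the helper returns `vlan1234` instead.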

May 7 2019, 6:27 PM · Traffic, Operations
Vgutierrez added a comment to T209707: tagged_interface sometimes exceeds IFNAMSIZ.

As discussed on IRC, using vlan-raw-device enp175s0f1d1 should be enough, as recommended in https://wiki.debian.org/NetworkConfiguration#Manual_config

May 7 2019, 5:23 PM · Traffic, Operations

May 6 2019

Vgutierrez triaged T222642: false positives in check_trafficserver_config_status as Normal priority.
May 6 2019, 5:28 PM · Operations, Traffic
Vgutierrez moved T222642: false positives in check_trafficserver_config_status from Triage to Caching on the Traffic board.
May 6 2019, 5:28 PM · Operations, Traffic
Vgutierrez created T222642: false positives in check_trafficserver_config_status.
May 6 2019, 5:27 PM · Operations, Traffic

May 3 2019

Vgutierrez created P8472 (An Untitled Masterwork).
May 3 2019, 3:30 PM
Vgutierrez updated the task description for T220383: Evaluate ATS TLS stack.
May 3 2019, 8:06 AM · Traffic, Operations
Vgutierrez updated the task description for T220383: Evaluate ATS TLS stack.
May 3 2019, 7:43 AM · Traffic, Operations

May 2 2019

Vgutierrez committed rOSACaeebce8fda81: Release 0.17 (authored by Vgutierrez).
Release 0.17
May 2 2019, 2:19 PM
Vgutierrez committed rOSAC2422525715e4: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
May 2 2019, 2:19 PM
Vgutierrez committed rOSAC6e2e5365d517: CI: Run tests with minimum and latest dependencies (authored by Vgutierrez).
CI: Run tests with minimum and latest dependencies
May 2 2019, 2:19 PM
Vgutierrez committed rOSAC3fe1747eebac: dns: Move DNS operations to its own module (authored by Vgutierrez).
dns: Move DNS operations to its own module
May 2 2019, 2:19 PM
Vgutierrez committed rOSAC809d846b4a66: config: Move ACMEChiefConfig to its own module (authored by Vgutierrez).
config: Move ACMEChiefConfig to its own module
May 2 2019, 2:19 PM

Apr 29 2019

Vgutierrez committed rOSACc7cedb2ad4be: Release 0.17 (authored by Vgutierrez).
Release 0.17
Apr 29 2019, 2:59 PM
Vgutierrez committed rOSAC307525ace887: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
Apr 29 2019, 2:58 PM
Vgutierrez added a comment to T222072: compiler1002.puppet-diffs.eqiad.wmflabs disk is full.

I've cleaned outputs older than 31 days, that gave us almost 5G:

root@compiler1002:/srv/jenkins-workspace/puppet-compiler/output# find ./ -type d -ctime +31 -maxdepth 1 -exec rm -rf {} +
root@compiler1002:/srv/jenkins-workspace/puppet-compiler/output# df -h
Filesystem                          Size  Used Avail Use% Mounted on
udev                                3.9G     0  3.9G   0% /dev
tmpfs                               799M   83M  717M  11% /run
/dev/vda3                            19G  3.2G   15G  19% /
tmpfs                               4.0G  4.0K  4.0G   1% /dev/shm
tmpfs                               5.0M     0  5.0M   0% /run/lock
tmpfs                               4.0G     0  4.0G   0% /sys/fs/cgroup
/dev/mapper/vd-second--local--disk   60G   52G  4.8G  92% /srv
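
A dry-run equivalent of the cleanup above can be sketched in Python to list what would be removed before deleting anything. This is an illustration under stated assumptions, not the command that was run: `old_output_dirs` is a hypothetical helper, and it mirrors `find -ctime +31`, which ignores fractional days and therefore only matches entries changed at least 32 days ago.

```python
import os
import time


def old_output_dirs(base, max_age_days=31, now=None):
    """List immediate subdirectories of base whose ctime is older than the cutoff."""
    now = time.time() if now is None else now
    # find's "+N" ignores fractional days, so "+31" means at least 32 full days old.
    cutoff = now - (max_age_days + 1) * 86400
    stale = []
    with os.scandir(base) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_ctime <= cutoff:
                stale.append(entry.path)
    return sorted(stale)
```

Calling `old_output_dirs("/srv/jenkins-workspace/puppet-compiler/output")` only returns paths; removing them is left as a separate, explicit step.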
Apr 29 2019, 2:34 PM · Patch-For-Review, Operations, puppet-compiler, Jenkins
Vgutierrez added a comment to T222041: cp3037 is currently unreachable.

Requested a power drain via remote hands:

Powered the server cp3037 for at least 15 seconds. After I plugged the power cable back in, I was not able to turn it on manually

Apr 29 2019, 10:20 AM · ops-esams, Operations, Traffic

Apr 28 2019

Vgutierrez triaged T222041: cp3037 is currently unreachable as Normal priority.
Apr 28 2019, 5:52 PM · ops-esams, Operations, Traffic
Vgutierrez moved T222041: cp3037 is currently unreachable from Triage to Hardware on the Traffic board.
Apr 28 2019, 5:52 PM · ops-esams, Operations, Traffic
Vgutierrez created T222041: cp3037 is currently unreachable.
Apr 28 2019, 5:52 PM · ops-esams, Operations, Traffic

Apr 26 2019

Vgutierrez created P8446 (An Untitled Masterwork).
Apr 26 2019, 12:34 PM

Apr 25 2019

Vgutierrez created P8439 (An Untitled Masterwork).
Apr 25 2019, 4:10 PM
Vgutierrez created P8438 (An Untitled Masterwork).
Apr 25 2019, 3:46 PM

Apr 24 2019

Vgutierrez added a comment to T220383: Evaluate ATS TLS stack.

so I guess it's affected but right now I'm working under the assumption that we will use stretch in the cp nodes, using our own ATS packaging.
@ema can confirm that :)

Apr 24 2019, 8:27 AM · Traffic, Operations
Vgutierrez added a comment to T220383: Evaluate ATS TLS stack.

We need to keep an eye on https://github.com/apache/trafficserver/issues/5084

Apr 24 2019, 7:04 AM · Traffic, Operations
Vgutierrez moved T221731: cp4021 - UNKNOWN: cannot run varnishstat from Triage to Caching on the Traffic board.
Apr 24 2019, 6:31 AM · Patch-For-Review, Operations, Traffic
Vgutierrez triaged T221731: cp4021 - UNKNOWN: cannot run varnishstat as Low priority.

that's expected, as @ema mentioned yesterday in -traffic:

<ema> so we've got cp4021 reimaged as Varnish/ATS and it seems to be looking kind-of OK
<ema> it is however still depooled as I haven't had the chance to look deeply at everything, and surely certain things are missing (like prometheus metrics not showing up in grafana yet). Also, we assume that all nodes in cache::upload::nodes need to both be listed as backends for varnish-fe and be involved in ipsec shenanigans; the latter isn't true anymore with ATS, so that needs to be fixed too
<ema> (I've just ack'ed the alerts for now)
<ema> but we're getting close! :)
Apr 24 2019, 6:31 AM · Patch-For-Review, Operations, Traffic

Apr 23 2019

Vgutierrez committed rOSAC3a882a904cd4: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
Apr 23 2019, 12:56 PM
Vgutierrez committed rOSACa06ab2ad5407: CI: Run tests with minimum and latest dependencies (authored by Vgutierrez).
CI: Run tests with minimum and latest dependencies
Apr 23 2019, 9:00 AM
Vgutierrez moved T221594: Puppetize ATS TLS configuration for incoming traffic from Triage to TLS on the Traffic board.
Apr 23 2019, 8:33 AM · Patch-For-Review, Traffic, Operations
Vgutierrez triaged T221594: Puppetize ATS TLS configuration for incoming traffic as Normal priority.
Apr 23 2019, 8:33 AM · Patch-For-Review, Traffic, Operations
Vgutierrez created T221594: Puppetize ATS TLS configuration for incoming traffic.
Apr 23 2019, 8:33 AM · Patch-For-Review, Traffic, Operations

Apr 18 2019

Vgutierrez committed rOSACb8a07272a161: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
Apr 18 2019, 1:27 PM
Vgutierrez committed rOSAC12eca669aef3: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
Apr 18 2019, 1:22 PM
Vgutierrez committed rOSACc9cdc62d3978: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
Apr 18 2019, 1:05 PM
Vgutierrez committed rOSAC2c2653891ac3: dns: Move DNS operations to its own module (authored by Vgutierrez).
dns: Move DNS operations to its own module
Apr 18 2019, 1:05 PM
Vgutierrez added a comment to T221343: puppet fails to run in cp1008 under certain conditions.

for the record, LC_CTYPE=UTF-8

Apr 18 2019, 9:02 AM · Packaging, Puppet, Operations
Vgutierrez renamed T221343: puppet fails to run in cp1008 under certain conditions from puppet fails to run in cp1008 to puppet fails to run in cp1008 under certain conditions.
Apr 18 2019, 9:02 AM · Packaging, Puppet, Operations
Vgutierrez added a comment to T221343: puppet fails to run in cp1008 under certain conditions.

so... this is caused by my locales:

vgutierrez@cp1008:~$ unset LC_CTYPE
vgutierrez@cp1008:~$ sudo -i puppet agent -t
Warning: Support for ruby version 2.1.5 is deprecated and will be removed in a future release. See https://puppet.com/docs/puppet/latest/system_requirements.html for a list of supported ruby versions.
   (location: /usr/lib/ruby/vendor_ruby/puppet.rb:130:in `<module:Puppet>')
Warning: Downgrading to PSON for future requests
Info: Using configured environment 'production'
Info: Retrieving pluginfacts

but this was working as expected before

Apr 18 2019, 8:55 AM · Packaging, Puppet, Operations
Vgutierrez created T221343: puppet fails to run in cp1008 under certain conditions.
Apr 18 2019, 8:53 AM · Packaging, Puppet, Operations

Apr 17 2019

Vgutierrez moved T221217: Allow running several ATS instances on the same server from Triage to Caching on the Traffic board.
Apr 17 2019, 10:30 AM · Patch-For-Review, Operations, Traffic
Vgutierrez added a parent task for T221217: Allow running several ATS instances on the same server: T220383: Evaluate ATS TLS stack.
Apr 17 2019, 10:25 AM · Patch-For-Review, Operations, Traffic
Vgutierrez added a subtask for T220383: Evaluate ATS TLS stack: T221217: Allow running several ATS instances on the same server.
Apr 17 2019, 10:25 AM · Traffic, Operations
Vgutierrez created T221217: Allow running several ATS instances on the same server.
Apr 17 2019, 10:24 AM · Patch-For-Review, Operations, Traffic
Vgutierrez committed rOSAC2dc2c0a28d21: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
Apr 17 2019, 8:45 AM
Vgutierrez committed rOSACbac82390f268: dns: Move DNS operations to its own module (authored by Vgutierrez).
dns: Move DNS operations to its own module
Apr 17 2019, 8:45 AM
Vgutierrez committed rOSAC11b7f1118228: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
Apr 17 2019, 8:45 AM
Vgutierrez committed rOSAC4148b5a493e2: config: Move ACMEChiefConfig to its own module (authored by Vgutierrez).
config: Move ACMEChiefConfig to its own module
Apr 17 2019, 8:45 AM
Vgutierrez committed rOSACd44e30ac5202: dns: Move DNS operations to its own module (authored by Vgutierrez).
dns: Move DNS operations to its own module
Apr 17 2019, 8:45 AM
Vgutierrez committed rOSACc7b80c5222eb: acme_chief: Prevalidate CN/SNI list (authored by Vgutierrez).
acme_chief: Prevalidate CN/SNI list
Apr 17 2019, 8:45 AM
Vgutierrez committed rOSAC2885c7ef047a: config: Move ACMEChiefConfig to its own module (authored by Vgutierrez).
config: Move ACMEChiefConfig to its own module
Apr 17 2019, 8:45 AM
Vgutierrez committed rOSAC2aea9a3aa250: dns: Move DNS operations to its own module (authored by Vgutierrez).
dns: Move DNS operations to its own module
Apr 17 2019, 8:45 AM

Apr 16 2019

Vgutierrez updated the task description for T220786: Add SPF record for non-canonical domains that are not parked.
Apr 16 2019, 2:13 PM · Patch-For-Review, Operations, Traffic, DNS

Apr 15 2019

Vgutierrez added a comment to T219414: acme-chief fails to issue certificates against LE staging environment.

it looks like gdnsd sets a minimum TTL of 60 seconds for dns-01 ACME challenges:

acme_challenge_ttl
           Integer seconds, range 60-3600, default 600.  For temporary ACME DNS-01 challenge data added via "gdnsdctl acme-dns-01 ...", this sets both the time until the TXT records auto-expire from the server and disappear, and also the TTL of the RRs themselves.  The TTL of static TXT records in zonefiles which happen to have "_acme-challenge" as their leading label are also forced to this TTL regardless of the zonefile-level TTL, to avoid cases of mixed TTLs when mixing static and dynamic records in server outputs.  See the gdnsdctl(8) documentation for more details.
Apr 15 2019, 9:08 AM · Patch-For-Review, Acme-chief

Apr 12 2019

Vgutierrez changed the status of T219414: acme-chief fails to issue certificates against LE staging environment from Stalled to Open.
Apr 12 2019, 2:08 PM · Patch-For-Review, Acme-chief
Vgutierrez updated subscribers of T219414: acme-chief fails to issue certificates against LE staging environment.

According to https://community.letsencrypt.org/t/unable-to-issue-ecdsa-rsa-in-acmev2-staging-environment/90835/9, LE caches dns-01 challenges for 60 seconds but they do respect lower TTLs:

Apr 12 2019, 1:55 PM · Patch-For-Review, Acme-chief