Vgutierrez (Valentín Gutiérrez)
Traffic Security Engineer

User Details

User Since
Feb 12 2018, 9:51 AM (166 w, 4 d)
Availability
Available
IRC Nick
vgutierrez
LDAP User
Vgutierrez
MediaWiki User
Unknown

Recent Activity

Mar 10 2021

Vgutierrez created P14720 (An Untitled Masterwork).
Mar 10 2021, 9:38 AM

Feb 12 2021

thcipriani awarded T274601: Requesting access to gerrit1001/gerrit1002 for brennen a Love token.
Feb 12 2021, 5:00 PM · User-brennen, SRE-Access-Requests, SRE
Vgutierrez closed T274601: Requesting access to gerrit1001/gerrit1002 for brennen as Resolved.
Feb 12 2021, 4:21 PM · User-brennen, SRE-Access-Requests, SRE
Vgutierrez triaged T274631: Requesting access to Analytic Cluster for Research Scientist (Paragon) as Medium priority.

patches ready, waiting for @leila's confirmation

Feb 12 2021, 4:08 PM · SRE, SRE-Access-Requests
Vgutierrez claimed T274304: Requesting access to Analytic Cluster for Research Intern (ChristineDeKock).

I'm unable to find a Wikitech account for the provided email address. @ChristineDeKock, let me know whether you want to use the existing account linked to your personal email or create a separate one with the provided email (https://wikitech.wikimedia.org/wiki/Help:Create_a_Wikimedia_developer_account)

Feb 12 2021, 3:58 PM · SRE, SRE-Access-Requests
Vgutierrez updated subscribers of T274592: Apple Business Manager: verify ownership of wikimedia.org.

I've created the patch that adds the TXT record (https://gerrit.wikimedia.org/r/c/operations/dns/+/663794), could you review it @BBlack?

Feb 12 2021, 10:16 AM · Patch-For-Review, Traffic, DNS, SRE
Vgutierrez triaged T274592: Apple Business Manager: verify ownership of wikimedia.org as Medium priority.
Feb 12 2021, 10:02 AM · Patch-For-Review, Traffic, DNS, SRE

Feb 11 2021

Vgutierrez closed T274318: Requesting access to analytics-privatedata-users for urbanecm as Resolved.

Done, you should have an email regarding your kerberos account password @Urbanecm

Feb 11 2021, 5:03 PM · SRE, SRE-Access-Requests
Vgutierrez triaged T274318: Requesting access to analytics-privatedata-users for urbanecm as Medium priority.
Feb 11 2021, 2:05 PM · SRE, SRE-Access-Requests

Feb 9 2021

Vgutierrez triaged T274228: Phabricator should cache tasks for a few minutes for logged-out users as Medium priority.
Feb 9 2021, 10:48 AM · SRE, Traffic, Phabricator

Feb 8 2021

Vgutierrez closed T274103: interface-rps crashes on lsv4007 as Resolved.

Thanks @Volans and @jijiki

Feb 8 2021, 11:08 AM · Traffic, SRE
Vgutierrez triaged T274103: interface-rps crashes on lsv4007 as Unbreak Now! priority.
Feb 8 2021, 9:50 AM · Traffic, SRE
Vgutierrez created T274103: interface-rps crashes on lsv4007.
Feb 8 2021, 9:49 AM · Traffic, SRE

Feb 6 2021

Vgutierrez added a comment to T273956: acme-chief sometimes doesn't refresh certificates because it ignores SIGHUP.

I think I've seen acme-chief not responding to SIGHUP as expected before in deployment-prep, I worry this could happen in prod too.

Feb 6 2021, 1:30 PM · Acme-chief, cloud-services-team (Kanban)

Feb 4 2021

Vgutierrez added a comment to T231025: LegacyHandler.php: PHP Warning: Host lookup failed [-10002]: Unknown error -10002.

What's the current DNS query retry policy on mediawiki?

Feb 4 2021, 11:20 AM · Traffic, SRE, Platform Team Workboards (Clinic Duty Team), MediaWiki-Debug-Logger, Wikimedia-production-error
Vgutierrez moved T231025: LegacyHandler.php: PHP Warning: Host lookup failed [-10002]: Unknown error -10002 from Triage to DNS Infra on the Traffic board.
Feb 4 2021, 10:30 AM · Traffic, SRE, Platform Team Workboards (Clinic Duty Team), MediaWiki-Debug-Logger, Wikimedia-production-error
Vgutierrez added a project to T231025: LegacyHandler.php: PHP Warning: Host lookup failed [-10002]: Unknown error -10002: Traffic.
Feb 4 2021, 10:30 AM · Traffic, SRE, Platform Team Workboards (Clinic Duty Team), MediaWiki-Debug-Logger, Wikimedia-production-error

Jan 28 2021

Vgutierrez closed T273153: cp1087 needed a powercycle as Resolved.

Everything is looking good. purged had some trouble going through the backlog of PURGE requests, especially with varnish-fe. Since ats-be eventually caught up with the backlog and the varnish-fe cache was empty after the server powercycle, I restarted purged and things went back to normal:

Jan 28 2021, 8:36 AM · SRE, Traffic

Jan 21 2021

Vgutierrez updated the task description for T271421: Test envoyproxy as a WMF's CDN TLS terminator with real traffic.
Jan 21 2021, 2:05 PM · SRE, Traffic

Jan 18 2021

Vgutierrez added a comment to T271407: Upgrade envoyproxy to 1.16.2.

envoy 1.16.2 requires python 3.6 or higher, and the build environment doesn't meet that requirement, so a small workaround is needed to build 1.16.2 on builder-envoy-03:

Jan 18 2021, 3:08 PM · SRE, serviceops, Traffic
Vgutierrez moved T272258: lvs1015 interface errors from Triage to LoadBalancer on the Traffic board.
Jan 18 2021, 11:46 AM · SRE, Traffic, ops-eqiad

Jan 12 2021

Vgutierrez added a comment to T271808: The certificate for upload.beta.wmflabs.org expired on January 12, 2021..
root@deployment-cache-upload06:/etc/acmecerts/unified/live# openssl x509 -dates -noout -in rsa-2048.crt
notBefore=Jan 12 01:23:09 2021 GMT
notAfter=Apr 12 01:23:09 2021 GMT
root@deployment-cache-upload06:/etc/acmecerts/unified/live# touch /srv/trafficserver/tls/etc/ssl_multicert.config
root@deployment-cache-upload06:/etc/acmecerts/unified/live# systemctl reload trafficserver-tls.service
Jan 12 2021, 1:32 PM · SRE, Traffic, HTTPS, Beta-Cluster-reproducible
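
The expiry check in the transcript above can also be scripted. The following is a minimal sketch, not part of the original comment: it parses the `notBefore=`/`notAfter=` lines that `openssl x509 -dates` prints and reports how long a certificate remains valid. The function names are hypothetical, and it assumes the English month abbreviations openssl emits and a C/English locale for `%b`.

```python
from datetime import datetime, timezone

# Date layout printed by `openssl x509 -dates`, e.g. "Jan 12 01:23:09 2021 GMT".
# Assumption: month names are English, as in the transcript above.
OPENSSL_DATE_FORMAT = "%b %d %H:%M:%S %Y GMT"


def parse_openssl_dates(output: str) -> dict:
    """Return {'notBefore': datetime, 'notAfter': datetime} from `-dates` output."""
    dates = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in ("notBefore", "notAfter"):
            parsed = datetime.strptime(value.strip(), OPENSSL_DATE_FORMAT)
            # openssl reports these timestamps in GMT, so attach UTC explicitly.
            dates[key] = parsed.replace(tzinfo=timezone.utc)
    return dates


def days_until_expiry(dates: dict, now: datetime) -> int:
    """Whole days left before notAfter (negative once the cert has expired)."""
    return (dates["notAfter"] - now).days


if __name__ == "__main__":
    sample = (
        "notBefore=Jan 12 01:23:09 2021 GMT\n"
        "notAfter=Apr 12 01:23:09 2021 GMT\n"
    )
    parsed = parse_openssl_dates(sample)
    print(days_until_expiry(parsed, datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the helper deterministic, which makes it easier to reuse in a monitoring check or a test.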

Jan 7 2021

Vgutierrez added a subtask for T271421: Test envoyproxy as a WMF's CDN TLS terminator with real traffic: T271407: Upgrade envoyproxy to 1.16.2.
Jan 7 2021, 1:53 PM · SRE, Traffic
Vgutierrez added a parent task for T271407: Upgrade envoyproxy to 1.16.2: T271421: Test envoyproxy as a WMF's CDN TLS terminator with real traffic.
Jan 7 2021, 1:53 PM · SRE, serviceops, Traffic
Vgutierrez triaged T271421: Test envoyproxy as a WMF's CDN TLS terminator with real traffic as Medium priority.
Jan 7 2021, 1:53 PM · SRE, Traffic
Vgutierrez created T271421: Test envoyproxy as a WMF's CDN TLS terminator with real traffic.
Jan 7 2021, 1:52 PM · SRE, Traffic
Vgutierrez updated subscribers of T271407: Upgrade envoyproxy to 1.16.2.
Jan 7 2021, 11:26 AM · SRE, serviceops, Traffic
Vgutierrez triaged T271407: Upgrade envoyproxy to 1.16.2 as Medium priority.
Jan 7 2021, 11:13 AM · SRE, serviceops, Traffic
Vgutierrez moved T271407: Upgrade envoyproxy to 1.16.2 from Triage to TLS on the Traffic board.
Jan 7 2021, 11:13 AM · SRE, serviceops, Traffic
Vgutierrez created T271407: Upgrade envoyproxy to 1.16.2.
Jan 7 2021, 11:12 AM · SRE, serviceops, Traffic

Jan 3 2021

Vgutierrez added a comment to T271063: acme-chief ldap certs required chained (with intermediate CA) versions suddenly.

acme-chief generated a valid certificate, the main difference between the current and the previous one is the intermediate CA that issued the cert:

root@acmechief1001:/var/lib/acme-chief/certs/ldap# openssl x509 -dates -issuer -noout -in cae12c858fa6417d8d999bfaef1c25ec/rsa-2048.crt
notBefore=Nov  4 13:00:48 2020 GMT
notAfter=Feb  2 13:00:48 2021 GMT
issuer=C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
root@acmechief1001:/var/lib/acme-chief/certs/ldap# openssl x509 -dates -issuer -noout -in live/rsa-2048.crt
notBefore=Jan  3 13:00:28 2021 GMT
notAfter=Apr  3 13:00:28 2021 GMT
issuer=C = US, O = Let's Encrypt, CN = R3
Jan 3 2021, 4:19 PM · cloud-services-team (Kanban), LDAP, Acme-chief
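
The issuer comparison shown above can be automated along these lines. This is an illustrative sketch with hypothetical helper names, not something acme-chief provides: it extracts the CN from the `issuer=` line printed by `openssl x509 -issuer` and reports whether the intermediate CA differs between two certificates. It assumes the comma-separated `KEY = value` layout seen in the transcript and would mis-split a value that itself contains a comma.

```python
def issuer_common_name(issuer_line: str) -> str:
    """Extract the CN from an `issuer=` line as printed by `openssl x509 -issuer`."""
    _, _, dn = issuer_line.partition("issuer=")
    # Naive split: assumes no attribute value contains a comma.
    for part in dn.split(","):
        key, sep, value = part.partition("=")
        if sep and key.strip() == "CN":
            return value.strip()
    raise ValueError("no CN found in issuer line")


def intermediate_changed(old_issuer: str, new_issuer: str) -> bool:
    """True when the two certificates were issued by different intermediates."""
    return issuer_common_name(old_issuer) != issuer_common_name(new_issuer)


if __name__ == "__main__":
    old = "issuer=C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3"
    new = "issuer=C = US, O = Let's Encrypt, CN = R3"
    print(issuer_common_name(old), "->", issuer_common_name(new))
    print("intermediate changed:", intermediate_changed(old, new))
```
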
Vgutierrez created P13640 (An Untitled Masterwork).
Jan 3 2021, 3:58 PM

Dec 17 2020

Vgutierrez added a comment to T270129: Puppet failing on tools-legacy-redirector.

However https://gerrit.wikimedia.org/r/c/operations/software/acme-chief/+/617680 shipped as part of acme-chief 0.28 (https://gerrit.wikimedia.org/r/c/operations/software/acme-chief/+/618081) should prevent it from happening again

Dec 17 2020, 9:47 AM · cloud-services-team (Kanban), Toolforge
Vgutierrez added a comment to T270129: Puppet failing on tools-legacy-redirector.

this is a common glitch triggered by acme-chief getting updated (and automatically restarted) while the API service (uwsgi-acme-chief) gets stuck on the old version. We can't restart it automatically because that could trigger puppet run issues on acme-chief clients

Dec 17 2020, 9:45 AM · cloud-services-team (Kanban), Toolforge

Dec 4 2020

Vgutierrez created P13528 (An Untitled Masterwork).
Dec 4 2020, 1:19 PM

Nov 26 2020

Vgutierrez added a comment to T257537: Configure HTTPS for pywikibot.org.

Indeed

willikins:~ vgutierrez$ curl -v https://pywikibot.org 2>&1| grep -i location:
< location: https://pywikibot.toolforge.org/
Nov 26 2020, 10:36 AM · User-Ladsgroup, HTTPS, SRE, Traffic, Pywikibot

Nov 25 2020

Vgutierrez created P13410 Varnish 6.0.6 VS h2spec 2.6.0.
Nov 25 2020, 11:54 AM
Vgutierrez created P13409 HTTP/2 vs HTTP/1.X on cp3050 (text) and cp3051 (upload).
Nov 25 2020, 11:53 AM

Nov 23 2020

Vgutierrez committed rLPRIca625fc36ece: secrets: Add dummy digicert-2020 keys (authored by Vgutierrez).
secrets: Add dummy digicert-2020 keys
Nov 23 2020, 10:32 AM

Nov 19 2020

Vgutierrez updated subscribers of T257536: http://pywikibot.org/ is displaying Wikimedia error page.

@Ladsgroup I've already merged your DNS and acme-chief patches. Due to the staging time that we enforce for the unified and ncredir certificates, this should be working by next week:

Nov 19 08:37:39 acmechief1001 acme-chief-backend[28500]: Staging_time will be enforced for non-canonical-redirect-4 / rsa-2048 till 2020-11-26 07:37:37
Nov 19 2020, 8:44 AM · User-Ladsgroup, Patch-For-Review, Traffic, cloud-services-team (Kanban), SRE, Pywikibot

Nov 16 2020

Vgutierrez closed T267561: Beta needs to be upgraded to Varnish 6 as Resolved.

manually ran apt upgrade and puppet afterwards.. everything seems ok on deployment-cache-upload06

Nov 16 2020, 5:04 PM · User-Ryasmeen, SRE, Beta-Cluster-Infrastructure, Traffic
Vgutierrez added a comment to T267006: Puppet failures on many hosts.

puppet seems to be happy on deployment-cache-upload06 after upgrading to varnish 6

Nov 16 2020, 5:01 PM · Beta-Cluster-Infrastructure
Vgutierrez reopened T267561: Beta needs to be upgraded to Varnish 6 as "Open".

Re-opening: as mentioned in https://phabricator.wikimedia.org/T267006#6624466, deployment-cache-upload06 was omitted and needs to be upgraded to varnish 6

Nov 16 2020, 4:29 PM · User-Ryasmeen, SRE, Beta-Cluster-Infrastructure, Traffic
Vgutierrez added a comment to T267006: Puppet failures on many hosts.

it looks like deployment-cache-upload06 has been omitted in T267561:

vgutierrez@deployment-cache-upload06:/etc/varnish$ dpkg -l |grep varnish
ii  libvarnishapi1                       5.1.3-1wm15                  amd64        shared libraries for Varnish
ii  libvarnishapi2:amd64                 6.1.1-1+deb10u1              amd64        shared libraries for Varnish
ii  prometheus-varnish-exporter          1.4.1-1                      amd64        Prometheus exporter for Varnish
ii  prometheus-varnishkafka-exporter     0.1-1                        all          A Prometheus exporter daemon that exports metrics from varnishkafka logs.
ii  varnish                              5.1.3-1wm15                  amd64        state of the art, high-performance web accelerator
ii  varnish-dbg                          5.1.3-1wm15                  amd64        debugging symbols for varnish
ii  varnish-modules                      0.12.1-1+wmf2                amd64        Varnish module collection
ii  varnishkafka                         1.0.14-1                     amd64        Varnish to Kafka log streamer
Nov 16 2020, 4:23 PM · Beta-Cluster-Infrastructure
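
A check like the one above (spotting packages left on the old Varnish major version) could be scripted roughly as follows. This is a hedged sketch with hypothetical function names: it reads `dpkg -l` style lines, keeps the installed (`ii`) entries, and lists the packages whose version starts with a given major number. It assumes versions carry no epoch prefix, which holds for the lines in the transcript.

```python
def installed_versions(dpkg_output: str) -> dict:
    """Map package name -> version for installed ('ii') rows of `dpkg -l` output."""
    versions = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ii":
            name = fields[1].split(":")[0]  # drop an arch suffix such as ":amd64"
            versions[name] = fields[2]
    return versions


def packages_on_major(versions: dict, major: str) -> list:
    """Sorted names of packages whose version begins with the given major number."""
    # Assumption: no epoch prefix (e.g. "1:5.1.3") in the version strings.
    return sorted(n for n, v in versions.items() if v.split(".")[0] == major)


if __name__ == "__main__":
    sample = """\
ii  libvarnishapi1        5.1.3-1wm15      amd64  shared libraries for Varnish
ii  libvarnishapi2:amd64  6.1.1-1+deb10u1  amd64  shared libraries for Varnish
ii  varnish               5.1.3-1wm15      amd64  state of the art, high-performance web accelerator
ii  varnish-modules       0.12.1-1+wmf2    amd64  Varnish module collection
"""
    print(packages_on_major(installed_versions(sample), "5"))
```
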
Vgutierrez closed T258405: Deprecate TLSv1.2 weak ciphersuites as Resolved.
Nov 16 2020, 3:49 PM · User-notice, Patch-For-Review, SRE, Traffic
Vgutierrez added a comment to T267858: The certificate for upload.beta.wmflabs.org expired on November 13, 2020..

@Vgutierrez FYI in case this could happen in prod too, I haven't been keeping track of changes lately. If we think it won't happen again or won't happen in prod (e.g. maybe it didn't restart because puppet is erroring somewhere in varnish code on this box?) then I guess we can close this

Nov 16 2020, 2:30 PM · SRE, Traffic, HTTPS, Beta-Cluster-Infrastructure
Vgutierrez moved T267867: purged is not resilient to kafka main nodes going down from Triage to Caching on the Traffic board.
Nov 16 2020, 9:36 AM · Traffic, SRE

Nov 15 2020

Vgutierrez added a comment to T267865: Switch on rack C7 in codfw is down.

switching over to lvs2010 as it will allow us to recover cp2035, only losing cp2037 on text and cp2038 on upload VS losing cp2035 and cp2037 on text with lvs2007

Nov 15 2020, 11:16 AM · ops-codfw, netops, SRE

Nov 9 2020

Vgutierrez closed T219414: acme-chief fails to issue certificates against LE staging environment as Resolved.

sorry @Aklapper, this can be closed. @BBlack's side has been fixed in https://github.com/gdnsd/gdnsd/commit/4818a260e62a74316e6cb8b6672107de81030630 as I mentioned in my previous comment on this task

Nov 9 2020, 9:19 AM · Acme-chief

Oct 29 2020

Vgutierrez closed T265911: ATS trying to set socket options SO_MARK / IP_TOS as Resolved.
vgutierrez@cumin1001:~$ sudo -i cumin 'A:cp' 'apt-cache policy trafficserver|grep Installed'
72 hosts will be targeted:
cp[2027-2042].codfw.wmnet,cp[1075-1090].eqiad.wmnet,cp[5001-5012].eqsin.wmnet,cp[3050-3065].esams.wmnet,cp[4021-4032].ulsfo.wmnet
Confirm to continue [y/n]? y
===== NODE GROUP =====
(72) cp[2027-2042].codfw.wmnet,cp[1075-1090].eqiad.wmnet,cp[5001-5012].eqsin.wmnet,cp[3050-3065].esams.wmnet,cp[4021-4032].ulsfo.wmnet
----- OUTPUT of 'apt-cache policy...r|grep Installed' -----
  Installed: 8.0.8-1wm3
================
Oct 29 2020, 3:47 PM · SRE, Traffic
Vgutierrez lowered the priority of T266746: TCP traffic increase for DNS over TLS breached a low limit for max open files on authdns1001/2001 from High to Medium.
Oct 29 2020, 3:12 PM · SRE, Traffic

Oct 19 2020

Ladsgroup awarded T133548: Create a secure redirect service for large count of non-canonical / junk domains a Like token.
Oct 19 2020, 8:38 AM · Goal, HTTPS, Traffic, SRE

Oct 16 2020

Vgutierrez changed the status of T184715: pybal's "can-depool" logic only takes downServers into account from Open to Stalled.

this hasn't been backported to the 1.15 branch so it's never been deployed in production, I'd keep the task open

Oct 16 2020, 10:42 AM · Pybal, Traffic, SRE
Vgutierrez closed T265584: Wipe digicert-2019a from the caching cluster as Resolved.
Oct 16 2020, 7:32 AM · SRE, Traffic

Oct 15 2020

Vgutierrez triaged T265584: Wipe digicert-2019a from the caching cluster as Medium priority.
Oct 15 2020, 10:31 AM · SRE, Traffic
Vgutierrez created T265584: Wipe digicert-2019a from the caching cluster.
Oct 15 2020, 10:30 AM · SRE, Traffic

Oct 7 2020

Vgutierrez added a comment to T264074: varnishkafka 1.1.0 CPU usage increase.

@elukey double checking https://gerrit.wikimedia.org/r/plugins/gitiles/operations/debs/varnish4/+/refs/heads/debian-wmf/lib/libvarnishapi/vut.c#424 it looks like ed1696efc92cb6a9aa96d2b8e586be8dbbb1736b made it to varnish 6.0.6 as well

Oct 7 2020, 9:32 AM · Patch-For-Review, Analytics-Clusters, Traffic, SRE
Vgutierrez added a comment to T264074: varnishkafka 1.1.0 CPU usage increase.

yeah.. I'll handle the backport :)

Oct 7 2020, 9:22 AM · Patch-For-Review, Analytics-Clusters, Traffic, SRE

Oct 6 2020

Vgutierrez added a comment to T262946: Bump Firefox version in basic support to 3.6 or newer.

@Izno You're right. But even if that's the case right now, there will be gaps in the future again.

While awaiting the final feedback from Traffic and @ema, a case in point with the necessary TLS 1.2 support (verified a second time that Wikipedia is not accessible by Safari 6.2 on OS X Mountain Lion) would be to broaden this task by bumping the Grade A browsers to the list already shared by @Esanders in T262946#6512792, if we agree to intertwine MediaWiki support with Wikipedia traffic/ops support.

Right now, keeping Firefox 3 in basic support instead of the proposed 3.6 is holding back current needs such as relying on rem. So I'm open to either option: limiting the task to the current description or, as I slightly prefer, broadening it to be more future-facing in our development limitations.

Oct 6 2020, 1:45 PM · TechCom-RFC (TechCom-RFC-Closed), Browser-Support-Firefox, Front-end-Standards-Group, MediaWiki-General

Sep 29 2020

Vgutierrez updated the task description for T258405: Deprecate TLSv1.2 weak ciphersuites.
Sep 29 2020, 1:32 PM · User-notice, Patch-For-Review, SRE, Traffic

Sep 17 2020

Vgutierrez lowered the priority of T263006: Let's Encrypt transitioning to ISRG's Root from High to Medium.

acme-chief has been updated to version 0.29 in our production environment. The unified cert should be renewed tomorrow; we will check the offered chains then

Sep 17 2020, 11:11 AM · Patch-For-Review, Traffic, SRE, Acme-chief
Vgutierrez committed rOSACf714f4750f31: debian: Add release 0.29 to changelog (authored by Vgutierrez).
debian: Add release 0.29 to changelog
Sep 17 2020, 11:02 AM

Sep 16 2020

Vgutierrez added a comment to T263006: Let's Encrypt transitioning to ISRG's Root.

So I've prepared a 0.29 release shipping https://gerrit.wikimedia.org/r/q/topic:%22T263006%22+(status:open%20OR%20status:merged)

Sep 16 2020, 4:04 PM · Patch-For-Review, Traffic, SRE, Acme-chief
Vgutierrez created T263006: Let's Encrypt transitioning to ISRG's Root.
Sep 16 2020, 8:38 AM · Patch-For-Review, Traffic, SRE, Acme-chief

Sep 10 2020

Vgutierrez added a comment to T261632: Package varnish 6.0.x.

@ema I've added to the task description the CRs required to get the packages of all the vmods and varnishkafka, I've seen that we have varnish-modules compiled on deneb but I haven't found a CR.

Sep 10 2020, 3:57 PM · Analytics-Radar, Patch-For-Review, Traffic, SRE
Vgutierrez updated the task description for T261632: Package varnish 6.0.x.
Sep 10 2020, 2:40 PM · Analytics-Radar, Patch-For-Review, Traffic, SRE

Sep 8 2020

Vgutierrez updated the task description for T261632: Package varnish 6.0.x.
Sep 8 2020, 1:58 PM · Analytics-Radar, Patch-For-Review, Traffic, SRE
Vgutierrez triaged T262251: acme-chief shouldn't try to perform OCSP stapling of expired certs as Medium priority.
Sep 8 2020, 12:08 PM · Traffic, Acme-chief, cloud-services-team (Kanban), Cloud-VPS, SRE
Vgutierrez created T262251: acme-chief shouldn't try to perform OCSP stapling of expired certs.
Sep 8 2020, 10:02 AM · Traffic, Acme-chief, cloud-services-team (Kanban), Cloud-VPS, SRE

Sep 7 2020

Vgutierrez updated the task description for T261632: Package varnish 6.0.x.
Sep 7 2020, 3:04 PM · Analytics-Radar, Patch-For-Review, Traffic, SRE

Sep 1 2020

Vgutierrez updated the task description for T261632: Package varnish 6.0.x.
Sep 1 2020, 1:15 PM · Analytics-Radar, Patch-For-Review, Traffic, SRE

Aug 28 2020

Vgutierrez added a comment to T261528: SSL cert renewal warnings for cloudelastic100[5-6].wikimedia.org.

So, as I can see on acmechief1001, the cert has been renewed as expected:

root@acmechief1001:~# openssl x509 -dates -noout -in /var/lib/acme-chief/certs/cloudelastic/live/ec-prime256v1.crt
notBefore=Aug  3 19:00:35 2020 GMT
notAfter=Nov  1 19:00:35 2020 GMT
root@acmechief1001:~# openssl x509 -dates -noout -in /var/lib/acme-chief/certs/cloudelastic/live/rsa-2048.crt
notBefore=Aug  3 19:00:43 2020 GMT
notAfter=Nov  1 19:00:43 2020 GMT
Aug 28 2020, 9:25 PM · Discovery-Search

Aug 24 2020

Vgutierrez updated the task description for T258405: Deprecate TLSv1.2 weak ciphersuites.
Aug 24 2020, 3:39 PM · User-notice, Patch-For-Review, SRE, Traffic

Aug 20 2020

Vgutierrez moved T260889: confd's watch functionality appears to be partially broken when interacting with etcd 3.x from Triage to LoadBalancer on the Traffic board.
Aug 20 2020, 9:58 AM · Traffic, conftool, serviceops, SRE

Aug 18 2020

Vgutierrez added a comment to T260702: Analyze custom varnish 5.1 patches considering the migration to varnish 6.
patch | backport/custom | available on varnish 6.0 | available on varnish 6.4 | can be removed?
0002-exp-thread-realtime.patch | custom | no | no | TBD (varnish-be specific)
0003-vsm-perms.patch | custom | no | no | no
0004-storage-file-off-t.patch | custom | no | no | TBD (varnish-be specific)
0005-stats-shortlived.patch | custom | no | no | no
0006-transaction-timeout.patch | custom | no | no | yes (adds a config parameter that is currently unused)
0007-varnishncsa-record-prefix.patch | backport | yes | yes | yes
0008-vsv00002-5.1.patch | backport | yes | yes | yes
0011-fix-discarding-labels | backport | yes | yes | yes
0012-oh-leak.patch | backport | yes | yes | yes
0013-issue-1799.patch | backport | yes | yes | yes
0014-n_lru_limited-counter.patch | backport | yes | yes | yes
0015-cache_hit_grace-counter.patch | backport | yes | yes | yes
0016-expired-objects-ignore-req.ttl.patch | backport | yes | yes | yes
0017-new-ttl-in-vcl-calculation.patch | backport | yes | yes | yes
0018-post-and-multiple-vcl.patch | backport | yes | yes | yes
0019-vary-stevedore-mem-leak.patch | backport | yes | yes | yes
0020-assert-error-http1_minimal_response.patch | backport | yes | yes | yes
0021-dont-test-gunzip-partial.patch | backport | yes | yes | yes
0022-deref-objcore-synth-err.patch | backport | yes | yes | yes
0023-pass-delivery-is-no-err.patch | backport | yes | yes | yes
0024-vbt-get-force-fresh.patch | backport | yes | yes | yes
0025-extrachance-one-retry.patch | backport | yes | yes | yes
0026-transient-full-cache_req_body-panic.patch | backport | yes | yes | yes
0027-assert-error-vca_make_session.patch | backport | yes | yes | yes
0028-panic-return-cond-fetch.patch | backport | yes | yes | yes
0029-ban-lurker-bo-backoff.patch | backport | yes | yes | yes
0030-startup-show-version.patch | backport | yes | yes | yes
0031-vbt-close-stolen.patch | backport | yes | yes | yes
0032-vbe_dir_finish-no-VBT_Wait.patch | backport | yes | yes | yes
0033-recycled-honor-first_byte_timeout.patch | backport | yes | yes | yes
0034-r02135.vtc-fixes.patch | backport | yes | yes | yes
0035-vbf_stp_condfetch_crash.patch | backport | yes | no | yes iff target version is 6.0
0036-VSV00004.patch | backport | yes | yes | yes
0037-force-discard.patch | custom | no | no | yes (failed experiment)
0038-vcl_active-lock.patch | backport | yes | yes | yes
Aug 18 2020, 4:20 PM · Patch-For-Review, SRE, Traffic
Vgutierrez triaged T260702: Analyze custom varnish 5.1 patches considering the migration to varnish 6 as Medium priority.
Aug 18 2020, 3:38 PM · Patch-For-Review, SRE, Traffic
Vgutierrez created T260702: Analyze custom varnish 5.1 patches considering the migration to varnish 6.
Aug 18 2020, 3:38 PM · Patch-For-Review, SRE, Traffic

Aug 17 2020

Vgutierrez closed T260279: Add DVrandecic to group nda as Declined.

as I mentioned in my previous comment, being part of the wmf LDAP group is enough. @DVrandecic could you point us to the onboarding documentation that you've been following so we can get it updated? Thanks

Aug 17 2020, 7:46 AM · SRE, LDAP-Access-Requests, WMF-NDA-Requests

Aug 14 2020

Vgutierrez added a comment to T260279: Add DVrandecic to group nda.

From https://wikitech.wikimedia.org/wiki/LDAP/Groups:

wmf - for WMF staff/contractors (documented below)
ops - for operations people (see ops group in puppet manifests/site.pp) (documented below)
nda - for others who have signed NDAs for access to confidential data (documented below)

and as https://ldap.toolforge.org/group/wmf indicates, @DVrandecic is already a member of the wmf ldap group. I'd say we can close this task

Aug 14 2020, 10:20 AM · SRE, LDAP-Access-Requests, WMF-NDA-Requests
Vgutierrez triaged T260279: Add DVrandecic to group nda as Medium priority.
Aug 14 2020, 8:51 AM · SRE, LDAP-Access-Requests, WMF-NDA-Requests

Aug 12 2020

Vgutierrez triaged T259979: Redirect wikimedia.org/research to research.wikimedia.org instead of some external closed survey as Low priority.
Aug 12 2020, 2:17 PM · Research, Wikimedia-Apache-configuration, SRE
Vgutierrez triaged T260240: UNIX group 'bird' missing on bird package installation as Medium priority.
Aug 12 2020, 1:50 PM · observability, Cloud-VPS, SRE

Aug 11 2020

Vgutierrez added a comment to T238593: Phabricator downtime due to aphlict and websockets (aphlict current disabled).

hmm that's interesting, please note that this is not the first time we use websockets. etherpad.wm.o is already using websockets successfully (even when HTTP/2 is used to perform the upgrade request)

Aug 11 2020, 1:06 PM · Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1)), Patch-For-Review, Phabricator, serviceops, SRE, Traffic

Aug 10 2020

Vgutierrez closed T259388: Requesting access to production shell for Denny Vrandecic as Resolved.

@DVrandecic will also need a kerberos password

Aug 10 2020, 4:55 PM · Analytics-Radar, SRE-Access-Requests, SRE
Vgutierrez claimed T259388: Requesting access to production shell for Denny Vrandecic.
Aug 10 2020, 4:27 PM · Analytics-Radar, SRE-Access-Requests, SRE
Vgutierrez added a comment to T259388: Requesting access to production shell for Denny Vrandecic.

@akosiaris is on vacation, I'll handle this ASAP

Aug 10 2020, 4:26 PM · Analytics-Radar, SRE-Access-Requests, SRE
Vgutierrez awarded Blog Post: RPKI Origin Validation a Love token.
Aug 10 2020, 1:39 PM · netops

Aug 4 2020

Vgutierrez added a comment to T257968: Certificate for *.beta.wmflabs.org has expired.

I think we have two bugs here:

  1. The API service must be restarted as well after an acme-chief upgrade.
  2. The API service shouldn't list non allowed files, I'd suspect that dropping a file on the cert directory would break puppet on the acme-chief clients right now.

I'm currently on vacation, I'll handle those issues next week

Aug 4 2020, 10:54 AM · Beta-Cluster-Infrastructure
Vgutierrez closed T259338: do not generate metadata for parts that aren't allowed as Resolved.
Aug 4 2020, 10:52 AM · Patch-For-Review, SRE, Traffic, Acme-chief
Vgutierrez closed T259338: do not generate metadata for parts that aren't allowed, a subtask of T257968: Certificate for *.beta.wmflabs.org has expired, as Resolved.
Aug 4 2020, 10:52 AM · Beta-Cluster-Infrastructure

Jul 31 2020

Vgutierrez triaged T259338: do not generate metadata for parts that aren't allowed as Medium priority.
Jul 31 2020, 11:03 AM · Patch-For-Review, SRE, Traffic, Acme-chief
Vgutierrez created T259338: do not generate metadata for parts that aren't allowed.
Jul 31 2020, 11:03 AM · Patch-For-Review, SRE, Traffic, Acme-chief

Jul 30 2020

Vgutierrez added a comment to T255249: acme-chief: support for generating a concatenated cert/key file.

@bd808 acme-chief 0.27 shipping your changes has been deployed in production. Please note that your change will be effective the next time acme-chief reissues your cert.

Jul 30 2020, 1:54 PM · Patch-For-Review, Acme-chief

Jul 28 2020

Vgutierrez closed T238724: ATS logs aren't being rotated as Resolved.
Jul 28 2020, 10:38 AM · SRE, Traffic
Vgutierrez closed T242620: ats-tls is having issues when varnish-fe goes away as Resolved.
Jul 28 2020, 10:37 AM · Patch-For-Review, SRE, Traffic
Vgutierrez closed T256632: cp3053 nvme0 issues as Resolved.

Thanks for pinging me @wiki_willy, we can close this task, everything seems good in cp3053 so far. I'll reopen the task if needed

Jul 28 2020, 8:54 AM · DC-Ops, ops-esams, SRE, Traffic

Jul 21 2020

Ladsgroup awarded T238625: Remove nginx puppetization for cache text/text_ats a Yellow Medal token.
Jul 21 2020, 1:44 PM · Patch-For-Review, Traffic, SRE

Jul 20 2020

Vgutierrez triaged T258405: Deprecate TLSv1.2 weak ciphersuites as Medium priority.
Jul 20 2020, 2:13 PM · User-notice, Patch-For-Review, SRE, Traffic
Vgutierrez moved T258405: Deprecate TLSv1.2 weak ciphersuites from Triage to TLS on the Traffic board.
Jul 20 2020, 2:05 PM · User-notice, Patch-For-Review, SRE, Traffic
Vgutierrez created T258405: Deprecate TLSv1.2 weak ciphersuites.
Jul 20 2020, 2:05 PM · User-notice, Patch-For-Review, SRE, Traffic