
Vgutierrez (Valentín Gutiérrez)
Staff Site Reliability Engineer, Traffic Team

User Details

User Since
Feb 12 2018, 9:51 AM (427 w, 2 d)
Availability
Available
IRC Nick
vgutierrez
LDAP User
Vgutierrez
MediaWiki User
VGutiérrez (WMF) [ Global Accounts ]

Recent Activity

Fri, Apr 10

Vgutierrez moved T422030: Surge in webrequest validation check from Backlog to Radar/Not for Service on the Traffic board.
Fri, Apr 10, 3:09 PM · Patch-For-Review, Data-Platform-SRE (2026-03-27 - 2026-04-17), Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic
Vgutierrez added a comment to T422926: Thumbor is using an unmantained HAProxy version.

meanwhile I've restored the component and uploaded haproxy 2.8.20 there

Fri, Apr 10, 10:27 AM · Patch-For-Review, Thumbor, ServiceOps-Services-Oids, Traffic, ServiceOps new
Vgutierrez triaged T422926: Thumbor is using an unmantained HAProxy version as High priority.
Fri, Apr 10, 10:09 AM · Patch-For-Review, Thumbor, ServiceOps-Services-Oids, Traffic, ServiceOps new
Vgutierrez created T422926: Thumbor is using an unmantained HAProxy version.
Fri, Apr 10, 10:09 AM · Patch-For-Review, Thumbor, ServiceOps-Services-Oids, Traffic, ServiceOps new
Vgutierrez closed T422030: Surge in webrequest validation check as Resolved.

Closing the task as this has been investigated

Fri, Apr 10, 8:08 AM · Patch-For-Review, Data-Platform-SRE (2026-03-27 - 2026-04-17), Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic

Wed, Apr 8

Vgutierrez added a comment to T422030: Surge in webrequest validation check.

I've reproduced an SSL handshake failure locally using haproxy with log-format-sd %{+E}o\ [haproxykafka@0\ %[capture.req.hdr(0),json(ascii)]|%HPO|%HQ|%rt]

Wed, Apr 8, 9:37 AM · Patch-For-Review, Data-Platform-SRE (2026-03-27 - 2026-04-17), Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic
Vgutierrez added a comment to T422030: Surge in webrequest validation check.

Yes, sequence numbers are generated by haproxy itself, even when the connection ends in an SSL handshake error. With haproxy 3.0 the sequence number doesn't reach haproxykafka in that case, because the log format is ignored for that kind of error.
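
A minimal sketch of how such gaps could surface on the consumer side. It assumes the validation check looks for holes in an otherwise contiguous run of sequence numbers, which the comment above does not spell out; the function name and the sample values are hypothetical.

```python
from typing import Iterable


def missing_sequence_numbers(observed: Iterable[int]) -> list[int]:
    """Return the sequence numbers absent between the lowest and highest observed value."""
    seen = set(observed)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Hypothetical sample: 103 was assigned to a connection that failed its
# handshake, so the corresponding log line never reached the consumer.
print(missing_sequence_numbers([101, 102, 104, 105]))  # prints [103]
```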

Wed, Apr 8, 8:26 AM · Patch-For-Review, Data-Platform-SRE (2026-03-27 - 2026-04-17), Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic

Tue, Apr 7

Vgutierrez added a comment to T422030: Surge in webrequest validation check.

It looks like the root cause is "MEDIUM: log/session: handle embryonic session log within sess_log()", a change introduced in HAProxy 3.1 as part of the work done to introduce the log profiles feature.

Tue, Apr 7, 4:12 PM · Patch-For-Review, Data-Platform-SRE (2026-03-27 - 2026-04-17), Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic

Mon, Apr 6

Vgutierrez moved T422040: Migrate clouddumps https/rsync interfaces behind LVS from Backlog to Radar/Not for Service on the Traffic board.
Mon, Apr 6, 12:28 PM · Traffic, Data-Services, tools-infrastructure-team, Datasets-General-or-Unknown
Vgutierrez added a project to T422040: Migrate clouddumps https/rsync interfaces behind LVS: Traffic.
Mon, Apr 6, 12:28 PM · Traffic, Data-Services, tools-infrastructure-team, Datasets-General-or-Unknown
Vgutierrez added a comment to T422030: Surge in webrequest validation check.

This is most probably due to the deprecation of some haproxy configuration directives (https://www.haproxy.com/blog/reviewing-every-new-feature-in-haproxy-3-1#deprecation), especially:

option     accept-invalid-http-request
option     accept-invalid-http-response
Mon, Apr 6, 8:16 AM · Patch-For-Review, Data-Platform-SRE (2026-03-27 - 2026-04-17), Data-Engineering (Q4 FS25/26 April 1st - June 30st), Traffic
Vgutierrez closed T417291: Upgrade the CDN to HAProxy 3.0, a subtask of T401832: Upgrade Traffic hosts to trixie, as Resolved.
Mon, Apr 6, 8:06 AM · Traffic
Vgutierrez closed T417291: Upgrade the CDN to HAProxy 3.0 as Resolved.
Mon, Apr 6, 8:06 AM · Traffic

Mon, Mar 30

Vgutierrez created P89966 (An Untitled Masterwork).
Mon, Mar 30, 11:14 AM

Thu, Mar 26

Vgutierrez triaged T421402: Upgrade HAProxy to version 3.2 as Medium priority.
Thu, Mar 26, 4:42 PM · Traffic
Vgutierrez created P89950 (An Untitled Masterwork).
Thu, Mar 26, 3:40 PM

Wed, Mar 25

Vgutierrez added a comment to T421246: haproxy_client_healthcheck_ttfb histogram seems to be missing on upload.

This is a side effect of moving healthchecks on upload from healthcheck.wikimedia.org to upload.wikimedia.org in https://gerrit.wikimedia.org/r/c/operations/puppet/+/1164466

Wed, Mar 25, 4:57 PM · Traffic
Vgutierrez added a comment to T420498: Factor in pooled status for SLO measurements.

For instance, a recent decommissioning of codfw cp nodes (T419753) left the legacy ats-be service unavailable and caused the (depooled!) nodes to increment varnish_sli_bad as there were no servers left in the site according to them

Wed, Mar 25, 4:48 PM · SRE-SLO, observability, Traffic
Vgutierrez triaged T421246: haproxy_client_healthcheck_ttfb histogram seems to be missing on upload as Medium priority.
Wed, Mar 25, 3:24 PM · Traffic
Vgutierrez created T421246: haproxy_client_healthcheck_ttfb histogram seems to be missing on upload.
Wed, Mar 25, 3:23 PM · Traffic
Vgutierrez added a comment to T421203: Bad ATS config led to large volume of 5xx from RESTBase.

There is no CI check for validity of lua scripts loaded in trafficserver; some have functional tests but a syntax error should have never passed CI, even in the absence of tests
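
A minimal sketch of what such a CI gate might look like, assuming a parse-only checker such as `luac -p` is available on the CI image. The function name is hypothetical, and the checker is a parameter so that a parser matching the Lua dialect loaded by trafficserver can be substituted.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def find_syntax_errors(
    scripts: Sequence[Path],
    checker: Sequence[str] = ("luac", "-p"),  # assumption: parse-only check
) -> list[Path]:
    """Return the scripts for which the checker exits with a non-zero status."""
    failed = []
    for script in scripts:
        # The checker only parses the file; it does not execute the script.
        result = subprocess.run([*checker, str(script)], capture_output=True)
        if result.returncode != 0:
            failed.append(script)
    return failed


# Hypothetical usage in a CI job (the directory name is an assumption):
#   broken = find_syntax_errors(sorted(Path("lua").rglob("*.lua")))
#   raise SystemExit(1 if broken else 0)
```

A check like this only catches syntax errors; it complements the functional tests rather than replacing them.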

Wed, Mar 25, 12:03 PM · Incident Severity 3, Traffic, Wikimedia-Incident

Mar 18 2026

Vgutierrez added a comment to T367973: Replace ping offload servers with eBPF.

this is a by-product of switching to Katran and not a feature that we can deploy independently at the moment, so in eqiad and codfw this is currently blocked till we can switch every service to IPIP encapsulation

Mar 18 2026, 11:56 AM · Liberica, Traffic

Mar 12 2026

Vgutierrez triaged T419873: Provide a cookbook that validates IPIP/IP6IP6 capabilities on a given realserver as Medium priority.
Mar 12 2026, 3:32 PM · Patch-For-Review, Liberica, Traffic
Vgutierrez created T419873: Provide a cookbook that validates IPIP/IP6IP6 capabilities on a given realserver.
Mar 12 2026, 3:32 PM · Patch-For-Review, Liberica, Traffic

Mar 11 2026

Vgutierrez closed T419352: acme-chief is unable to validate challenges against GTS staging environment as Resolved.
Mar 11 2026, 11:39 AM · Upstream, Traffic, Acme-chief

Mar 9 2026

Vgutierrez lowered the priority of T419352: acme-chief is unable to validate challenges against GTS staging environment from High to Medium.

with the patch applied acme-chief was able to issue the certificate a few hours after reporting the issue to GTS (no active answer from them though)

Mar 9 2026, 5:45 PM · Upstream, Traffic, Acme-chief
Vgutierrez triaged T419352: acme-chief is unable to validate challenges against GTS staging environment as High priority.
Mar 9 2026, 2:29 PM · Upstream, Traffic, Acme-chief
Vgutierrez added a project to T419352: acme-chief is unable to validate challenges against GTS staging environment: Upstream.

I've tried to patch our client to skip already validated challenges but I'm running into another issue, this is the request flow performed by acme-chief:

Mar 9 2026, 1:36 PM · Upstream, Traffic, Acme-chief

Mar 8 2026

Vgutierrez created T419352: acme-chief is unable to validate challenges against GTS staging environment.
Mar 8 2026, 8:27 PM · Upstream, Traffic, Acme-chief

Mar 6 2026

Vgutierrez closed T417253: Upgrade to HAProxy 3.0 on cache (bullseye) hosts, a subtask of T401832: Upgrade Traffic hosts to trixie, as Resolved.
Mar 6 2026, 4:36 PM · Traffic
Vgutierrez closed T417253: Upgrade to HAProxy 3.0 on cache (bullseye) hosts as Resolved.
Mar 6 2026, 4:36 PM · Traffic

Mar 5 2026

Vgutierrez closed T419149: HAProxy 3.0 fires http-after-response rules twice on 500s resulting on KC session state as Resolved.
Mar 5 2026, 6:12 PM · Traffic
Vgutierrez updated the task description for T419149: HAProxy 3.0 fires http-after-response rules twice on 500s resulting on KC session state.
Mar 5 2026, 5:12 PM · Traffic
Vgutierrez triaged T419149: HAProxy 3.0 fires http-after-response rules twice on 500s resulting on KC session state as High priority.
Mar 5 2026, 5:10 PM · Traffic
Vgutierrez created T419149: HAProxy 3.0 fires http-after-response rules twice on 500s resulting on KC session state.
Mar 5 2026, 5:10 PM · Traffic

Mar 4 2026

Vgutierrez updated the task description for T417253: Upgrade to HAProxy 3.0 on cache (bullseye) hosts.
Mar 4 2026, 11:23 AM · Traffic
Vgutierrez added a comment to T418991: JWT tokens issued with empty subject.

this seems to be tracked as T417278

Mar 4 2026, 10:52 AM · MediaWiki-Platform-Team, JWTAuth, Traffic
Vgutierrez triaged T418991: JWT tokens issued with empty subject as High priority.
Mar 4 2026, 10:49 AM · MediaWiki-Platform-Team, JWTAuth, Traffic
Vgutierrez created T418991: JWT tokens issued with empty subject.
Mar 4 2026, 10:49 AM · MediaWiki-Platform-Team, JWTAuth, Traffic
Vgutierrez updated the task description for T417253: Upgrade to HAProxy 3.0 on cache (bullseye) hosts.
Mar 4 2026, 10:10 AM · Traffic

Mar 2 2026

Vgutierrez closed T417825: CDN: include x-is-browser in all requests to the backend, a subtask of T417778: rest gateway: enforce rate limits (stage one), as Resolved.
Mar 2 2026, 1:59 PM · MediaWiki-Platform-Team (Radar), OKR-Work, MW-Interfaces-Team
Vgutierrez closed T417825: CDN: include x-is-browser in all requests to the backend as Resolved.

@Vgutierrez , this is part of the API rate limiting rollout prep, and we’re trying to confirm whether everything needed for the March 9 phase is on track.

Do you have a sense of current status and expected timeline for this change? Let me know if there’s anything we should be aware of or help unblock.

Mar 2 2026, 1:59 PM · Infrastructure-Foundations, MediaWiki-Platform-Team (Radar), OKR-Work, MW-Interfaces-Team
Vgutierrez updated the task description for T417253: Upgrade to HAProxy 3.0 on cache (bullseye) hosts.
Mar 2 2026, 10:55 AM · Traffic

Feb 27 2026

Vgutierrez added a comment to T417781: haproxy: strip x-wmf-* headers from responses.

could I suggest using a dedicated prefix for API/REST gateway headers? We already use internally X-WMF as a prefix on other layers that aren't related to API/REST gateways.

Feb 27 2026, 3:48 PM · User-Raine, Traffic, MediaWiki-Platform-Team (Radar), OKR-Work, MW-Interfaces-Team

Feb 26 2026

Vgutierrez updated the task description for T417253: Upgrade to HAProxy 3.0 on cache (bullseye) hosts.
Feb 26 2026, 2:54 PM · Traffic
Vgutierrez closed T418098: Requesting access to Superset for mikez as Resolved.

change has been merged, and it should be live by now

Feb 26 2026, 11:52 AM · Data-Engineering-Radar, Data-Engineering, SRE, SRE-Access-Requests
Vgutierrez updated the task description for T418098: Requesting access to Superset for mikez.
Feb 26 2026, 11:52 AM · Data-Engineering-Radar, Data-Engineering, SRE, SRE-Access-Requests

Feb 25 2026

Vgutierrez closed T418221: Requesting access to deployment for Eileen McFarland as Resolved.

change has been merged, please allow puppet to propagate the change, it could take up to 30 minutes

Feb 25 2026, 2:47 PM · SRE, SRE-Access-Requests
Vgutierrez updated the task description for T418221: Requesting access to deployment for Eileen McFarland.
Feb 25 2026, 2:16 PM · SRE, SRE-Access-Requests
Vgutierrez added a comment to T418221: Requesting access to deployment for Eileen McFarland.

SSH has been verified out-of-band

Feb 25 2026, 2:09 PM · SRE, SRE-Access-Requests
Vgutierrez updated subscribers of T418098: Requesting access to Superset for mikez.

got mcollins approval via Slack, we need Data-Engineering approval now (that's @Milimetric / @Ottomata)

Feb 25 2026, 1:16 PM · Data-Engineering-Radar, Data-Engineering, SRE, SRE-Access-Requests
Vgutierrez moved T418098: Requesting access to Superset for mikez from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.

waiting for mcollins approval, I've pinged them on Slack cause I've failed to find their phabricator user so far

Feb 25 2026, 9:30 AM · Data-Engineering-Radar, Data-Engineering, SRE, SRE-Access-Requests
Vgutierrez claimed T418098: Requesting access to Superset for mikez.
Feb 25 2026, 9:25 AM · Data-Engineering-Radar, Data-Engineering, SRE, SRE-Access-Requests

Feb 24 2026

Vgutierrez added a comment to T417864: haproxy: capture x-wmf-* headers in webrequest data set.

In the merged task, I was proposing not creating a new header, but instead adding to the existing X-Requestctl header using the same convention we use for haproxy applied rules, so something like X-Requestctl: rgw:${x-wmf-ratelimit-class}.

Feb 24 2026, 4:34 PM · Data-Engineering (Q3 FY25/26 January 1st - March 31th), Traffic, MediaWiki-Platform-Team (Radar), OKR-Work, MW-Interfaces-Team

Feb 19 2026

Vgutierrez claimed T417825: CDN: include x-is-browser in all requests to the backend.
Feb 19 2026, 12:24 PM · Infrastructure-Foundations, MediaWiki-Platform-Team (Radar), OKR-Work, MW-Interfaces-Team
Vgutierrez added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

The blast radius is big. I'm wondering whether k8s nodes have workloads not exposed to the Internet where a bigger MTU (thinking about jumbo frames here) could be beneficial or even required in performance terms.

Feb 19 2026, 10:05 AM · ServiceOps new, Patch-For-Review, Prod-Kubernetes, Kubernetes, Traffic

Feb 18 2026

Vgutierrez closed T417306: liberica-fp doesn't error/refuse to start if the detected MAC Address for the gateway is invalid as Resolved.

fixed by:

  • Retrying if we can't fetch the MAC address
  • Reporting the configured MAC address
  • Refusing to start if we can't fetch the MAC address after 10 attempts
  • Triggering an ARP resolution if the MAC address isn't on the kernel neighbors table
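
The startup behaviour listed above can be sketched as follows. This is an illustration and not liberica-fp's code: `fetch_mac` and `trigger_arp` are hypothetical callables standing in for the neighbor-table lookup and the ARP trigger, and the one-second delay is an assumption.

```python
import time
from typing import Callable, Optional


def resolve_gateway_mac(
    fetch_mac: Callable[[], Optional[str]],
    trigger_arp: Callable[[], None],
    attempts: int = 10,
    delay_seconds: float = 1.0,  # assumption: the source gives no delay
) -> str:
    """Return the gateway MAC address, or raise after `attempts` failed lookups."""
    for attempt in range(1, attempts + 1):
        mac = fetch_mac()
        if mac is not None:
            # Report the address that will actually be used.
            print(f"using gateway MAC address {mac}")
            return mac
        # Not in the neighbor table yet: ask for an ARP resolution, then retry.
        trigger_arp()
        if attempt < attempts:
            time.sleep(delay_seconds)
    # Refuse to start instead of running with an invalid MAC address.
    raise RuntimeError(f"gateway MAC address not found after {attempts} attempts")
```

Raising instead of returning a placeholder value makes the service fail to start, which is the behaviour the task asked for.
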
Feb 18 2026, 4:50 PM · Patch-For-Review, Traffic, Liberica
Vgutierrez updated the task description for T401832: Upgrade Traffic hosts to trixie.
Feb 18 2026, 2:54 PM · Traffic

Feb 17 2026

Vgutierrez added a comment to T417536: ATS causes git fetches from Gerrit to fail with 502 responses.

That is great thank you :-]

I am wondering though why reusing connections leads to errors. Apache has the default Debian configuration as far as I can tell:

# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On

# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100

# KeepAliveTimeout: Number of seconds to wait for the next request from the
# same client on the same connection.
#
KeepAliveTimeout 5

Can it be that ATS has a much longer timeout than the Apache 5 seconds timeout? If so, I imagine:

  • ATS picks up a connection that it considers not having timed out (idling less than whatever timeout it has) but is about to expire on the Apache side (it has been idle for almost 5 seconds).
  • ATS sends the headers and, at the same time, Apache terminates the connection because it has been idling for 5 seconds (TCP FIN?)
  • ATS gets the unexpected termination, gives up and emits a 502 response
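
The hypothesis above can be written down as a small predicate. This is only a model of the suspected race, not a description of ATS internals: the function name, the 60-second proxy timeout and the 0.5-second margin are assumptions, and only the 5-second Apache value comes from the configuration quoted above.

```python
def reuse_may_race(
    idle_seconds: float,
    proxy_idle_timeout: float,
    origin_keepalive_timeout: float,
    margin: float = 0.5,  # assumption: window in which the close can cross the request
) -> bool:
    """True when the proxy still reuses a connection the origin is about to close."""
    proxy_would_reuse = idle_seconds < proxy_idle_timeout
    origin_closing = idle_seconds >= origin_keepalive_timeout - margin
    return proxy_would_reuse and origin_closing


# Connection idle for 4.8 s, proxy keeps it for 60 s, Apache closes at 5 s.
print(reuse_may_race(4.8, 60.0, 5.0))  # prints True
# A freshly used connection is safe to reuse.
print(reuse_may_race(1.0, 60.0, 5.0))  # prints False
# A proxy timeout shorter than the origin's avoids the window entirely.
print(reuse_may_race(4.8, 4.0, 5.0))   # prints False
```

The last case illustrates the usual mitigation for this kind of race: keeping the proxy-side idle timeout below the origin's keep-alive timeout.
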
Feb 17 2026, 5:42 PM · Patch-For-Review, ci-test-error (WMF-deployed Build Failure), Traffic, Gerrit, collaboration-services

Feb 16 2026

Vgutierrez added a comment to T417536: ATS causes git fetches from Gerrit to fail with 502 responses.

I managed to trigger this while capturing the traffic between ATS and gerrit2003, in my run it failed fetching https://gerrit.wikimedia.org/r/mediawiki/extensions/RelatedArticles, this is the content of the offending request:

Feb 16 2026, 5:57 PM · Patch-For-Review, ci-test-error (WMF-deployed Build Failure), Traffic, Gerrit, collaboration-services
Vgutierrez updated the task description for T401832: Upgrade Traffic hosts to trixie.
Feb 16 2026, 3:20 PM · Traffic
Vgutierrez added a subtask for T401832: Upgrade Traffic hosts to trixie: T417291: Upgrade the CDN to HAProxy 3.0.
Feb 16 2026, 9:21 AM · Traffic
Vgutierrez added a parent task for T417291: Upgrade the CDN to HAProxy 3.0: T401832: Upgrade Traffic hosts to trixie.
Feb 16 2026, 9:21 AM · Traffic

Feb 12 2026

Vgutierrez triaged T417306: liberica-fp doesn't error/refuse to start if the detected MAC Address for the gateway is invalid as High priority.
Feb 12 2026, 6:08 PM · Patch-For-Review, Traffic, Liberica
Vgutierrez created T417306: liberica-fp doesn't error/refuse to start if the detected MAC Address for the gateway is invalid.
Feb 12 2026, 6:08 PM · Patch-For-Review, Traffic, Liberica
Vgutierrez added a comment to T417291: Upgrade the CDN to HAProxy 3.0.

HAProxy 3.0 bumps to Lua 5.4; as a consequence HAProxy fails to start because lua5.4-maxminddb isn't there:

Feb 12 15:51:33 cp4052 haproxy[2379768]: [NOTICE]   (2379768) : haproxy version is 3.0.15-1~bpo11+1
Feb 12 15:51:33 cp4052 haproxy[2379768]: [NOTICE]   (2379768) : path to executable is /usr/sbin/haproxy
Feb 12 15:51:33 cp4052 haproxy[2379768]: [ALERT]    (2379768) : config : parsing [/etc/haproxy/haproxy.cfg:18] : Lua runtime error: /etc/haproxy/lua/maxmind-lookup.lua:3: module 'maxminddb' not found:
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no field package.preload['maxminddb']
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/etc/haproxy/lua/private/maxminddb.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/local/share/lua/5.4/maxminddb.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/local/share/lua/5.4/maxminddb/init.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/local/lib/lua/5.4/maxminddb.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/local/lib/lua/5.4/maxminddb/init.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/share/lua/5.4/maxminddb.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/share/lua/5.4/maxminddb/init.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file './maxminddb.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file './maxminddb/init.lua'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/local/lib/lua/5.4/maxminddb.so'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/lib/x86_64-linux-gnu/lua/5.4/maxminddb.so'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/lib/lua/5.4/maxminddb.so'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file '/usr/local/lib/lua/5.4/loadall.so'
Feb 12 15:51:33 cp4052 haproxy[2379768]:         no file './maxminddb.so'
Feb 12 15:51:33 cp4052 haproxy[2379768]: [ALERT]    (2379768) : config : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
Feb 12 15:51:33 cp4052 haproxy[2379768]: [ALERT]    (2379768) : config : parsing [/etc/haproxy/conf.d/tls.cfg:180] : error detected in proxy 'tls' while parsing 'http-request set-var(req.provenance,ifnotexists,ifnotempty)' rule : unknown fetch method 'lua.fetch_isp'.
Feb 12 15:51:33 cp4052 haproxy[2379768]: [ALERT]    (2379768) : config : Error(s) found in configuration file : /etc/haproxy/conf.d/tls.cfg
Feb 12 15:51:33 cp4052 systemd[1]: haproxy.service: Control process exited, code=exited, status=1/FAILURE
Feb 12 15:51:33 cp4052 systemd[1]: Reload failed for HAProxy Load Balancer.
Feb 12 2026, 3:54 PM · Traffic
Vgutierrez moved T417291: Upgrade the CDN to HAProxy 3.0 from Backlog to Actively Servicing on the Traffic board.
Feb 12 2026, 3:26 PM · Traffic
Vgutierrez triaged T417291: Upgrade the CDN to HAProxy 3.0 as Medium priority.
Feb 12 2026, 3:26 PM · Traffic
Vgutierrez created T417291: Upgrade the CDN to HAProxy 3.0.
Feb 12 2026, 3:26 PM · Traffic

Feb 5 2026

Vgutierrez moved T416444: 2026 Junos upgrade from Backlog to Radar/Not for Service on the Traffic board.
Feb 5 2026, 11:57 AM · Traffic, Infrastructure-Foundations, netops
Vgutierrez added a project to T416444: 2026 Junos upgrade: Traffic.
Feb 5 2026, 11:57 AM · Traffic, Infrastructure-Foundations, netops

Feb 3 2026

Vgutierrez added a comment to T392054: Add feature to select browser UA based on their age.

You just need to expose caniuse.com browser data to hiddenparma.

Feb 3 2026, 6:45 AM · Hiddenparma

Jan 23 2026

Vgutierrez merged task T414879: Unresponsive management for cp5022.mgmt:22 into T414411: cp5022 is unreachable.
Jan 23 2026, 4:04 PM · SRE, ops-eqsin
Vgutierrez merged T414879: Unresponsive management for cp5022.mgmt:22 into T414411: cp5022 is unreachable.
Jan 23 2026, 4:04 PM · SRE, DC-Ops, ops-eqsin, Traffic

Jan 21 2026

Vgutierrez added a comment to T352956: Handling inbound IPIP traffic on low traffic LVS k8s based realservers.

It's not clear why the net.ipv4.conf.default.rp_filter would need to change to 0 for IPIP and we would like to not do that on k8s nodes to prevent IP spoofing from inside containers (although that would require CAP_NET_ADMIN). If we could keep net.ipv4.conf.default.rp_filter=1 the dynamically created calico interfaces would still inherit that setting while net.ipv4.conf.all.rp_filter=0 would allow the ipip interfaces to also have rp_filter=0 set.
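
A small worked example of why that combination would behave as described. The kernel's ip-sysctl documentation states that the maximum of conf/all and conf/<interface> is used for source validation on a given interface; the interface roles below are taken from the comment, and the helper name is hypothetical.

```python
def effective_rp_filter(all_value: int, interface_value: int) -> int:
    """Kernel rule: the max of conf/all and conf/<interface> applies to the interface."""
    return max(all_value, interface_value)


conf_all = 0
conf_default = 1

calico_iface = conf_default  # dynamically created interfaces inherit conf/default
ipip_iface = 0               # explicitly set to 0 on the ipip interface

print(effective_rp_filter(conf_all, calico_iface))  # prints 1
print(effective_rp_filter(conf_all, ipip_iface))    # prints 0

# With conf/all left at 1, the per-interface 0 would have no effect.
print(effective_rp_filter(1, ipip_iface))           # prints 1
```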

Jan 21 2026, 3:40 PM · ServiceOps new, Patch-For-Review, Prod-Kubernetes, Kubernetes, Traffic

Jan 19 2026

Vgutierrez added a comment to T414940: Handle httpd log surplus coming from Liberica.

Oh I see... you're using https://gerrit.wikimedia.org/ in the healthcheck URL instead of https://healthcheck.wikimedia.org/varnish-fe; this needs to be fixed.

Jan 19 2026, 10:22 AM · Gerrit, collaboration-services
Vgutierrez added a comment to T414940: Handle httpd log surplus coming from Liberica.

Hmm, why are these healthchecks from liberica hitting the backend server in eqiad instead of staying in the cp nodes?

Jan 19 2026, 10:19 AM · Gerrit, collaboration-services

Jan 15 2026

Vgutierrez closed T414318: upgrade to HAProxy 2.8.18 as Resolved.
Jan 15 2026, 11:35 AM · Traffic
Vgutierrez triaged T414666: Support nft enabled realservers using IPIP encapsulation as Medium priority.
Jan 15 2026, 11:17 AM · Traffic, Liberica
Vgutierrez created T414666: Support nft enabled realservers using IPIP encapsulation.
Jan 15 2026, 11:17 AM · Traffic, Liberica

Jan 14 2026

Vgutierrez updated the task description for T414318: upgrade to HAProxy 2.8.18.
Jan 14 2026, 4:18 PM · Traffic
Vgutierrez updated the task description for T414318: upgrade to HAProxy 2.8.18.
Jan 14 2026, 3:16 PM · Traffic
Vgutierrez added a comment to T412396: Pass through information about the client from the CDN to MediaWiki to Logstash.

the headers described on https://wikitech.wikimedia.org/wiki/CDN/Backend_api and x-ja3n/x-ja4h should be hitting MediaWiki already

Jan 14 2026, 2:32 PM · SRE, MediaWiki-Platform-Team (Q3 Kanban Board), Traffic, MediaWiki-Debug-Logger
Vgutierrez updated the task description for T414318: upgrade to HAProxy 2.8.18.
Jan 14 2026, 2:26 PM · Traffic
Vgutierrez updated the task description for T414318: upgrade to HAProxy 2.8.18.
Jan 14 2026, 1:29 PM · Traffic

Jan 13 2026

Vgutierrez updated the task description for T414318: upgrade to HAProxy 2.8.18.
Jan 13 2026, 4:34 PM · Traffic
Vgutierrez updated the task description for T414318: upgrade to HAProxy 2.8.18.
Jan 13 2026, 2:43 PM · Traffic
Vgutierrez closed T400155: Reduce the chances of false positives on MSS clamping alerts as Resolved.
Jan 13 2026, 10:56 AM · Liberica, Traffic
Vgutierrez triaged T414411: cp5022 is unreachable as Medium priority.
Jan 13 2026, 9:14 AM · SRE, DC-Ops, ops-eqsin, Traffic
Vgutierrez created T414411: cp5022 is unreachable.
Jan 13 2026, 9:14 AM · SRE, DC-Ops, ops-eqsin, Traffic

Jan 12 2026

Vgutierrez updated the task description for T414318: upgrade to HAProxy 2.8.18.
Jan 12 2026, 10:35 AM · Traffic
Vgutierrez triaged T414318: upgrade to HAProxy 2.8.18 as Medium priority.
Jan 12 2026, 10:17 AM · Traffic
Vgutierrez created T414318: upgrade to HAProxy 2.8.18.
Jan 12 2026, 10:17 AM · Traffic

Dec 4 2025

Vgutierrez added a comment to T411781: lvs1018: remove cross-rack links to rows A, C and D.

the assessment is OK and the link can be removed safely

Dec 4 2025, 2:03 PM · DC-Ops, ops-eqiad, Infrastructure-Foundations, netops, SRE

Dec 3 2025

Vgutierrez triaged T411584: Refresh trafficserver_backend_requests_seconds histogram as Medium priority.
Dec 3 2025, 9:34 AM · Traffic
Vgutierrez assigned T411584: Refresh trafficserver_backend_requests_seconds histogram to CDobbins.
Dec 3 2025, 9:30 AM · Traffic
Vgutierrez created T411584: Refresh trafficserver_backend_requests_seconds histogram.
Dec 3 2025, 9:29 AM · Traffic

Dec 2 2025

Vgutierrez updated the task description for T411467: Let's Encrypt Decreasing Certificate Lifetimes to 45 Days.
Dec 2 2025, 10:18 AM · Acme-chief, Traffic
Vgutierrez triaged T411467: Let's Encrypt Decreasing Certificate Lifetimes to 45 Days as Medium priority.
Dec 2 2025, 9:58 AM · Acme-chief, Traffic
Vgutierrez created T411467: Let's Encrypt Decreasing Certificate Lifetimes to 45 Days.
Dec 2 2025, 9:58 AM · Acme-chief, Traffic