
ema (Emanuele Rocca)
Senior Site Reliability Engineer, Traffic Team

User Details

User Since
Sep 29 2015, 8:49 PM (190 w, 2 d)
Availability
Available
IRC Nick
ema
LDAP User
Ema
MediaWiki User
Unknown

Recent Activity

Wed, May 15

ema added a comment to T223336: [Regression] fatal-errors.php action=segfault results in a 503 error under php7-fpm.

Please provide the full responses, including headers, returned by the HHVM and PHP7 origin servers.

Wed, May 15, 10:43 AM · Performance-Team (Radar), serviceops, User-jijiki, observability, Operations, PHP 7.2 support
ema added a comment to T162035: Some PNG thumbnails and JPEG originals delivered as [text/html] content-type and hence not rendered in browser.

Actually no, we did fix the issue at the Swift layer (T162348), hence we removed the workaround from Varnish: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/348699/. That means that there is nothing wrong with ATS.

Wed, May 15, 10:16 AM · Patch-For-Review, Traffic, Operations, media-storage
ema added a comment to T162035: Some PNG thumbnails and JPEG originals delivered as [text/html] content-type and hence not rendered in browser.

The issue is indeed reproducible again, affecting ATS hosts.

Wed, May 15, 10:06 AM · Patch-For-Review, Traffic, Operations, media-storage
ema reopened T162035: Some PNG thumbnails and JPEG originals delivered as [text/html] content-type and hence not rendered in browser as "Open".
Wed, May 15, 10:04 AM · Patch-For-Review, Traffic, Operations, media-storage

Mon, May 13

ema created P8522 traffic_ctl config reload changes permissions of records.config.
Mon, May 13, 3:29 PM

Fri, May 10

ema created P8505 ensure_max_age swift-proxy filter testing.
Fri, May 10, 6:07 AM
ema moved T222705: Improve Pybal's url checks from Triage to LoadBalancer on the Traffic board.
Fri, May 10, 6:02 AM · Patch-For-Review, User-jijiki, PHP 7.2 support, Operations, serviceops, Traffic
ema moved T222937: Replace Varnish backends with ATS on cache upload nodes in esams from Triage to Caching on the Traffic board.
Fri, May 10, 6:02 AM · Patch-For-Review, Operations, Traffic
ema triaged T222937: Replace Varnish backends with ATS on cache upload nodes in esams as Normal priority.
Fri, May 10, 5:37 AM · Patch-For-Review, Operations, Traffic
ema created T222937: Replace Varnish backends with ATS on cache upload nodes in esams.
Fri, May 10, 5:37 AM · Patch-For-Review, Operations, Traffic

Thu, May 9

ema created P8500 (An Untitled Masterwork).
Thu, May 9, 1:58 PM
ema closed T221977: Package libvmod-uuid for Debian as Resolved.

@ema since the pkg has been uploaded, are we now good here? Ok to resolve the task or is there something else that needs to be done?

Thu, May 9, 5:41 AM · Patch-For-Review, Services (watching), Core Platform Team Backlog (Watching / External), Traffic, Operations
ema closed T221977: Package libvmod-uuid for Debian, a subtask of T221976: Have Varnish set the `X-Request-Id` header for incoming external requests, as Resolved.
Thu, May 9, 5:41 AM · Operations, Core Platform Team Backlog (Next), Services (next), Traffic

Tue, May 7

ema closed T222459: cp2009 down and mgmt console not reachable as Resolved.

IPMI seems to be working remotely:

Tue, May 7, 1:11 PM · Traffic, ops-codfw, Operations
ema committed rOSVU20dca87b759a: Initial packaging (authored by ema).
Initial packaging
Tue, May 7, 10:32 AM
ema created P8481 prometheus-trafficserver-exporter.deb.diff.
Tue, May 7, 10:13 AM
Gerrit Code Review <gerrit@wikimedia.org> committed rOSVU9816c9921169: Modify access rules (authored by ema).
Modify access rules
Tue, May 7, 9:52 AM
Gerrit Code Review <gerrit@wikimedia.org> committed rOSVU735a06d4b1fc: Modified project settings (authored by ema).
Modified project settings
Tue, May 7, 9:52 AM
ema committed rOSVUac34ac100954: Initial packaging (authored by ema).
Initial packaging
Tue, May 7, 9:52 AM
ema closed T219967: Replace Varnish backends with ATS on cache upload nodes in ulsfo as Resolved.

All Varnish backends in ulsfo upload replaced with ATS.

Tue, May 7, 9:49 AM · Patch-For-Review, Operations, Goal, Traffic
ema added a comment to T222642: false positives in check_trafficserver_config_status.

I've ack'ed the warnings in Icinga for the time being.

Tue, May 7, 8:53 AM · Operations, Traffic
ema moved T222459: cp2009 down and mgmt console not reachable from Triage to Hardware on the Traffic board.
Tue, May 7, 8:49 AM · Traffic, ops-codfw, Operations
ema added a comment to T222620: cp1083 crashed.

Interestingly, there was a memory usage spike right before the host crashed.

Tue, May 7, 8:42 AM · Operations, ops-eqiad, Traffic
ema updated the task description for T222620: cp1083 crashed.
Tue, May 7, 8:32 AM · Operations, ops-eqiad, Traffic
ema updated the task description for T222620: cp1083 crashed.
Tue, May 7, 8:31 AM · Operations, ops-eqiad, Traffic
ema moved T222620: cp1083 crashed from Triage to Hardware on the Traffic board.
Tue, May 7, 8:21 AM · Operations, ops-eqiad, Traffic

Mon, May 6

ema triaged T222620: cp1083 crashed as Normal priority.
Mon, May 6, 2:48 PM · Operations, ops-eqiad, Traffic
ema created T222620: cp1083 crashed.
Mon, May 6, 2:47 PM · Operations, ops-eqiad, Traffic
ema created P8477 (An Untitled Masterwork).
Mon, May 6, 10:43 AM

Fri, May 3

ema created P8470 htc-status-1.log.
Fri, May 3, 6:33 AM
ema created P8469 varnish-http-format-error.log.
Fri, May 3, 6:13 AM

Thu, May 2

ema added a comment to T196066: Add prometheus metrics for varnishkafka instances running on caching hosts.

Hm, sorry for this probably too late idea... but would it be worth building a C-based Prometheus plugin for librdkafka and/or varnishkafka instead of parsing this JSON file?

Thu, May 2, 2:51 PM · Patch-For-Review, Analytics-Kanban, Traffic, Analytics, Operations
ema closed T222071: SwiftMedia URL rewrite returns some 404s with wrong Content-Length as Resolved.

This is now fixed, CL matches the actual body length:

Thu, May 2, 12:27 PM · Patch-For-Review, Performance-Team, Operations, Traffic, Thumbor

Tue, Apr 30

ema moved T222097: scan external ranges with current Nessus rulesets from Triage to Watching on the Traffic board.
Tue, Apr 30, 11:47 AM · Operations, Traffic, Security-Team
ema updated the task description for T221784: Puppet failing without Icinga alert in case of dependency cycle.
Tue, Apr 30, 11:18 AM · Puppet, Icinga, observability, Operations

Mon, Apr 29

ema moved T220567: Wikitech page views sometimes default to MobileFrontend from Triage to Caching on the Traffic board.
Mon, Apr 29, 3:53 PM · Traffic, wikitech.wikimedia.org, Operations
ema moved T221290: wiki-mail DKIM failing from Triage to DNS Names on the Traffic board.
Mon, Apr 29, 3:53 PM · Patch-For-Review, Traffic, Operations, DNS, Mail
ema moved T221288: Phabricator SPF record contains internal addressing for phab[12]001 from Triage to DNS Names on the Traffic board.
Mon, Apr 29, 3:53 PM · Patch-For-Review, Traffic, Operations, DNS, Mail
ema moved T221268: Remove old letsencrypt puppet module from Triage to TLS on the Traffic board.
Mon, Apr 29, 3:53 PM · Puppet, Patch-For-Review, Operations, Traffic
ema moved T221976: Have Varnish set the `X-Request-Id` header for incoming external requests from Triage to Caching on the Traffic board.
Mon, Apr 29, 3:52 PM · Operations, Core Platform Team Backlog (Next), Services (next), Traffic
ema moved T221977: Package libvmod-uuid for Debian from Triage to Caching on the Traffic board.
Mon, Apr 29, 3:52 PM · Patch-For-Review, Services (watching), Core Platform Team Backlog (Watching / External), Traffic, Operations
ema moved T222071: SwiftMedia URL rewrite returns some 404s with wrong Content-Length from Triage to Watching on the Traffic board.
Mon, Apr 29, 3:52 PM · Patch-For-Review, Performance-Team, Operations, Traffic, Thumbor
ema triaged T221976: Have Varnish set the `X-Request-Id` header for incoming external requests as Normal priority.
Mon, Apr 29, 3:52 PM · Operations, Core Platform Team Backlog (Next), Services (next), Traffic
ema triaged T222071: SwiftMedia URL rewrite returns some 404s with wrong Content-Length as Normal priority.
Mon, Apr 29, 1:50 PM · Patch-For-Review, Performance-Team, Operations, Traffic, Thumbor
ema created T222071: SwiftMedia URL rewrite returns some 404s with wrong Content-Length.
Mon, Apr 29, 1:50 PM · Patch-For-Review, Performance-Team, Operations, Traffic, Thumbor
ema added a comment to T128188: Make CI run Varnish VCL tests.

VTC tests can now be run from dev workstations against PCC:

Mon, Apr 29, 11:15 AM · Varnish, Patch-For-Review, Operations, Continuous-Integration-Infrastructure, Traffic

Sat, Apr 27

ema created P8449 (An Untitled Masterwork).
Sat, Apr 27, 12:53 PM
ema created P8448 pccvtc.py.
Sat, Apr 27, 12:12 PM

Fri, Apr 26

ema created P8447 (An Untitled Masterwork).
Fri, Apr 26, 2:02 PM
ema created P8443 libvmod-uuid-packaging.diff.
Fri, Apr 26, 8:47 AM

Wed, Apr 24

ema triaged T221784: Puppet failing without Icinga alert in case of dependency cycle as Normal priority.
Wed, Apr 24, 3:43 PM · Puppet, Icinga, observability, Operations
ema created T221784: Puppet failing without Icinga alert in case of dependency cycle.
Wed, Apr 24, 3:43 PM · Puppet, Icinga, observability, Operations
ema closed T221731: cp4021 - UNKNOWN: cannot run varnishstat as Resolved.
Wed, Apr 24, 10:03 AM · Patch-For-Review, Operations, Traffic
ema added a comment to T221731: cp4021 - UNKNOWN: cannot run varnishstat.

Indeed our Varnish mailbox lag Icinga check only applies to Varnish backends, given that backends are those affected by T145661 and similar issues. During the Puppet refactoring splitting frontend/backend puppetization (T219967) I forgot to move the check from the Varnish module, where it shouldn't have been in the first place, to the backend profile. Doing this will ensure that the check is only added to cache hosts using Varnish as the cache backend software, not those using ATS such as cp4021.

Wed, Apr 24, 8:57 AM · Patch-For-Review, Operations, Traffic

Apr 23 2019

ema closed T221454: Puppet broken on two VMs in the 'traffic' project as Resolved.

Fixed the former, deleted the latter. Thanks for the reminder!

Apr 23 2019, 1:36 PM · Operations, Traffic
ema triaged T221454: Puppet broken on two VMs in the 'traffic' project as Normal priority.
Apr 23 2019, 1:34 PM · Operations, Traffic

Apr 18 2019

ema created P8417 (An Untitled Masterwork).
Apr 18 2019, 8:47 AM
ema closed T220510: Removal of If-Cached VCL support as Resolved.
Apr 18 2019, 6:54 AM · Patch-For-Review, Traffic, Operations

Apr 17 2019

ema moved T220190: Make UrlShortener 404s cacheable from Triage to Caching on the Traffic board.
Apr 17 2019, 1:45 PM · MW-1.33-notes (1.33.0-wmf.25; 2019-04-09), Traffic, User-Ladsgroup, Operations, MediaWiki-extensions-UrlShortener
ema moved T220510: Removal of If-Cached VCL support from Triage to Caching on the Traffic board.
Apr 17 2019, 1:45 PM · Patch-For-Review, Traffic, Operations
ema renamed T221217: Allow running several ATS instances on the same server from Allow running several ATS instances in the same server to Allow running several ATS instances on the same server.
Apr 17 2019, 1:41 PM · Patch-For-Review, Operations, Traffic

Apr 16 2019

ema closed T220591: cergen: exceptions trying to add alt_name as Resolved.

@Ottomata thanks! The new error message is helpful, and the proposed solution works.

Apr 16 2019, 2:28 PM · Analytics-Kanban, Patch-For-Review, Analytics, Operations
ema closed T213263: Partial cache_upload traffic switchover to ATS and switchback to Varnish as Resolved.

Switchback completed, closing.

Apr 16 2019, 9:44 AM · Patch-For-Review, Operations, Traffic

Apr 11 2019

ema created P8390 (An Untitled Masterwork).
Apr 11 2019, 5:19 PM
ema created P8389 traffic-upload-stretch.traffic.eqiad.wmflabs.
Apr 11 2019, 5:08 PM

Apr 10 2019

ema triaged T220591: cergen: exceptions trying to add alt_name as Normal priority.
Apr 10 2019, 10:34 AM · Analytics-Kanban, Patch-For-Review, Analytics, Operations
ema created T220591: cergen: exceptions trying to add alt_name.
Apr 10 2019, 10:34 AM · Analytics-Kanban, Patch-For-Review, Analytics, Operations
ema created P8383 (An Untitled Masterwork).
Apr 10 2019, 10:03 AM
ema created P8382 (An Untitled Masterwork).
Apr 10 2019, 10:00 AM
ema created P8381 (An Untitled Masterwork).
Apr 10 2019, 9:50 AM
ema created P8380 (An Untitled Masterwork).
Apr 10 2019, 9:39 AM

Apr 9 2019

ema triaged T220510: Removal of If-Cached VCL support as Normal priority.
Apr 9 2019, 2:28 PM · Patch-For-Review, Traffic, Operations
ema created T220510: Removal of If-Cached VCL support.
Apr 9 2019, 2:28 PM · Patch-For-Review, Traffic, Operations
ema added a comment to T209707: tagged_interface sometimes exceeds IFNAMSIZ.

So with newer systemd I think there is a good chance that enp59s0f0 will be named ens2f0 and enp175s0f0 will be named ens3f1.

Apr 9 2019, 11:06 AM · Traffic, Operations

Apr 8 2019

chasemp awarded Blog Post: Switching production traffic to Apache Traffic Server an Orange Medal token.
Apr 8 2019, 7:00 PM · Traffic
ema created P8368 (An Untitled Masterwork).
Apr 8 2019, 1:14 PM
ema created P8367 (An Untitled Masterwork).
Apr 8 2019, 1:04 PM
ema triaged T219986: Shortened URLs won't redirect when there's data as Normal priority.
Apr 8 2019, 11:36 AM · User-Ladsgroup, Patch-For-Review, Traffic, Operations, MediaWiki-extensions-UrlShortener
ema triaged T220022: Some load.php requests failing due to "ERR_SPDY_PROTOCOL_ERROR 200" as Normal priority.
Apr 8 2019, 11:36 AM · Performance-Team (Radar), Traffic, Operations
ema triaged T220190: Make UrlShortener 404s cacheable as Normal priority.
Apr 8 2019, 11:34 AM · MW-1.33-notes (1.33.0-wmf.25; 2019-04-09), Traffic, User-Ladsgroup, Operations, MediaWiki-extensions-UrlShortener

Apr 5 2019

ema edited P8351 (An Untitled Masterwork).
Apr 5 2019, 2:28 PM
Krinkle awarded Blog Post: Switching production traffic to Apache Traffic Server a Love token.
Apr 5 2019, 1:02 PM · Traffic
Ladsgroup awarded Blog Post: Switching production traffic to Apache Traffic Server a Love token.
Apr 5 2019, 11:32 AM · Traffic
ema edited P8351 (An Untitled Masterwork).
Apr 5 2019, 8:40 AM
ema edited P8351 (An Untitled Masterwork).
Apr 5 2019, 8:11 AM
ema created P8351 (An Untitled Masterwork).
Apr 5 2019, 7:46 AM

Apr 4 2019

ema added a comment to T213263: Partial cache_upload traffic switchover to ATS and switchback to Varnish.

Here is what our custom ATS errors look like.

Apr 4 2019, 12:32 PM · Patch-For-Review, Operations, Traffic

Apr 3 2019

ema created P8338 (An Untitled Masterwork).
Apr 3 2019, 2:58 PM
ema updated the task description for T219978: Make phame cacheable.
Apr 3 2019, 2:40 PM · Operations, Traffic, Phabricator
ema moved T219867: contact Wikivoyage e. V. and figure out status of wikivoyage-old.org / fix or park broken domain from Triage to DNS Names on the Traffic board.
Apr 3 2019, 12:52 PM · Patch-For-Review, Operations, serviceops, Domains, Traffic
ema moved T219129: Allow directing a percentage of API traffic to PHP7 from Triage to Caching on the Traffic board.
Apr 3 2019, 12:50 PM · User-jijiki, Traffic, Operations, serviceops
ema moved T216681: Enable nginx prometheus metrics for all elastic nodes from Triage to General on the Traffic board.
Apr 3 2019, 12:24 PM · Discovery-Search, Traffic, Patch-For-Review, Elasticsearch, Operations
ema moved T219967: Replace Varnish backends with ATS on cache upload nodes in ulsfo from Triage to Caching on the Traffic board.
Apr 3 2019, 12:21 PM · Patch-For-Review, Operations, Goal, Traffic
ema moved T219978: Make phame cacheable from Triage to Caching on the Traffic board.
Apr 3 2019, 12:21 PM · Operations, Traffic, Phabricator
ema updated the task description for T219977: Add support for temporary chroots to boron.
Apr 3 2019, 11:00 AM · Operations
ema triaged T219978: Make phame cacheable as Normal priority.
Apr 3 2019, 10:53 AM · Operations, Traffic, Phabricator
ema created T219978: Make phame cacheable.
Apr 3 2019, 10:53 AM · Operations, Traffic, Phabricator
ema triaged T219977: Add support for temporary chroots to boron as Normal priority.
Apr 3 2019, 10:37 AM · Operations
ema created T219977: Add support for temporary chroots to boron.
Apr 3 2019, 10:36 AM · Operations
ema triaged T219967: Replace Varnish backends with ATS on cache upload nodes in ulsfo as Normal priority.
Apr 3 2019, 8:15 AM · Patch-For-Review, Operations, Goal, Traffic