
aborrero (arturo)
SRE at Wikimedia Cloud Services Team

Projects (7)

User Details

User Since
Oct 23 2017, 12:19 PM (175 w, 3 d)
Availability
Available
IRC Nick
arturo
LDAP User
Arturo Borrero Gonzalez
MediaWiki User
ABorrero (WMF)

I'm Arturo Borrero Gonzalez from Seville, Spain. I'm a Site Reliability Engineer (SRE) on the Wikimedia Cloud Services team and a Wikimedia Foundation staff member.

You may find me in some FLOSS projects, like Netfilter and Debian.

Recent Activity

Yesterday

aborrero closed T276449: openstack: failures with VM live migration as Invalid.

Closing as invalid for now, see T276449#6883228

Thu, Mar 4, 4:12 PM · cloud-services-team (Kanban)
aborrero closed T276453: openstack: failed neutron port binding during live migrations, a subtask of T276449: openstack: failures with VM live migration, as Invalid.
Thu, Mar 4, 4:10 PM · cloud-services-team (Kanban)
aborrero closed T276453: openstack: failed neutron port binding during live migrations as Invalid.

Closing as invalid for now, see T276449#6883228

Thu, Mar 4, 4:10 PM · cloud-services-team (Kanban)
aborrero closed T276451: openstack: live migration issue: nova conductor key exception, a subtask of T276449: openstack: failures with VM live migration, as Invalid.
Thu, Mar 4, 4:10 PM · cloud-services-team (Kanban)
aborrero closed T276451: openstack: live migration issue: nova conductor key exception as Invalid.

Closing as invalid for now, see T276449#6883228

Thu, Mar 4, 4:10 PM · cloud-services-team (Kanban)
aborrero updated subscribers of T276449: openstack: failures with VM live migration.
NOTE: after leaving a VM live migrate for a good 25 minutes, it completed (mwoffliner5). I'm therefore suspecting that all the problems are mostly caused by me not being patient enough, cancelling the script, halting the migration process and leaving the VMs in an inconsistent MIGRATING state (which then triggers the issues detailed in the child tasks).
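
The note above suggests that waiting long enough, rather than cancelling, avoids leaving VMs stuck in MIGRATING. A minimal sketch of that idea follows. It assumes a hypothetical `get_status` callable that returns the VM state as a string; it is not taken from the actual drain script.

```python
import time


def wait_for_migration(get_status, timeout=1800.0, interval=10.0, sleep=time.sleep):
    """Poll get_status() until it stops returning "MIGRATING".

    get_status is a hypothetical callable supplied by the caller.
    Returns the first non-MIGRATING status, or raises TimeoutError
    if the VM is still migrating once the timeout has passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status != "MIGRATING":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still MIGRATING after {timeout} seconds")
        sleep(interval)
```

The default timeout of 30 minutes is an assumption chosen to exceed the 25 minutes observed for mwoffliner5.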
Thu, Mar 4, 3:53 PM · cloud-services-team (Kanban)
aborrero created T276453: openstack: failed neutron port binding during live migrations.
Thu, Mar 4, 12:45 PM · cloud-services-team (Kanban)
aborrero added a parent task for T276208: cloud: libvirt doesn't support live migration when using nested KVM: T276449: openstack: failures with VM live migration.
Thu, Mar 4, 12:41 PM · cloud-services-team (Kanban), Cloud-VPS
aborrero added a subtask for T276449: openstack: failures with VM live migration: T276208: cloud: libvirt doesn't support live migration when using nested KVM.
Thu, Mar 4, 12:40 PM · cloud-services-team (Kanban)
aborrero created T276451: openstack: live migration issue: nova conductor key exception.
Thu, Mar 4, 12:40 PM · cloud-services-team (Kanban)
aborrero created T276449: openstack: failures with VM live migration.
Thu, Mar 4, 12:38 PM · cloud-services-team (Kanban)

Wed, Mar 3

aborrero added a comment to T276327: cloud: puppetmasters: adopt cinder volumes to store certs and git repos.

Does this imply multiattach support? I've investigated that a bit but haven't seen it work yet.

Wed, Mar 3, 3:42 PM · cloud-services-team (Kanban)
aborrero triaged T276327: cloud: puppetmasters: adopt cinder volumes to store certs and git repos as Medium priority.
Wed, Mar 3, 1:38 PM · cloud-services-team (Kanban)
aborrero created T276327: cloud: puppetmasters: adopt cinder volumes to store certs and git repos.
Wed, Mar 3, 1:37 PM · cloud-services-team (Kanban)
aborrero added a comment to T268393: UDP traffic throughput to instances in the "meet" Cloud VPS project not meeting expectations.

[..]
I put the full error in P14517 (I redacted as much as possible, let me know if you want the original data). There's no error or warning beforehand or afterwards, it's just healthchecks (and all okay) but suddenly it starts dropping packets.

Do we have some sort of ratelimit in the cloud infra that might have affected this? Or is it behind CloudFlare and it's being aggressive?

Wed, Mar 3, 11:43 AM · cloud-services-team (Kanban), Cloud-VPS, Wikimedia Meet
aborrero closed T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem as Resolved.

ok, updating the firmware-bnx2x + upgrading the kernel did the trick apparently.

Wed, Mar 3, 10:07 AM · cloud-services-team (Hardware), SRE, ops-eqiad
aborrero added a comment to T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem.

Rebooting cloudnet1003 into the new kernel failed to bring interfaces up:

Wed, Mar 3, 9:54 AM · cloud-services-team (Hardware), SRE, ops-eqiad

Tue, Mar 2

aborrero added a comment to T276208: cloud: libvirt doesn't support live migration when using nested KVM.

From my experience with the wmcs-drain-hypervisor.py script today, I think we can improve our workflows a bit:

Tue, Mar 2, 3:54 PM · cloud-services-team (Kanban), Cloud-VPS
Bstorm awarded T276208: cloud: libvirt doesn't support live migration when using nested KVM a Heartbreak token.
Tue, Mar 2, 2:57 PM · cloud-services-team (Kanban), Cloud-VPS
aborrero added a comment to T276208: cloud: libvirt doesn't support live migration when using nested KVM.

VMs with the VMX cpu flag, barring some unreachable VMs:

Tue, Mar 2, 11:54 AM · cloud-services-team (Kanban), Cloud-VPS
aborrero updated the task description for T276208: cloud: libvirt doesn't support live migration when using nested KVM.
Tue, Mar 2, 11:01 AM · cloud-services-team (Kanban), Cloud-VPS
aborrero triaged T276208: cloud: libvirt doesn't support live migration when using nested KVM as High priority.
Tue, Mar 2, 10:35 AM · cloud-services-team (Kanban), Cloud-VPS
aborrero placed T276208: cloud: libvirt doesn't support live migration when using nested KVM up for grabs.
Tue, Mar 2, 10:35 AM · cloud-services-team (Kanban), Cloud-VPS
aborrero created T276208: cloud: libvirt doesn't support live migration when using nested KVM.
Tue, Mar 2, 10:34 AM · cloud-services-team (Kanban), Cloud-VPS
aborrero closed T276040: Puppet failure on toolsbeta-bastion-05.toolsbeta.eqiad1.wikimedia.cloud, a subtask of T275865: Toolforge: migrate bastions to Debian Buster, as Invalid.
Tue, Mar 2, 9:53 AM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)
aborrero closed T276040: Puppet failure on toolsbeta-bastion-05.toolsbeta.eqiad1.wikimedia.cloud as Invalid.

We can safely ignore these errors, the VM is short lived, for testing purposes.

Tue, Mar 2, 9:53 AM · cloud-services-team (Kanban)

Fri, Feb 26

aborrero triaged T275865: Toolforge: migrate bastions to Debian Buster as Medium priority.
Fri, Feb 26, 11:37 AM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)
aborrero created T275865: Toolforge: migrate bastions to Debian Buster.
Fri, Feb 26, 11:37 AM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)
aborrero triaged T275864: Toolforge: migrate to Debian Buster as Medium priority.
Fri, Feb 26, 11:36 AM · Toolforge, cloud-services-team (Kanban), Epic
aborrero created T275864: Toolforge: migrate to Debian Buster.
Fri, Feb 26, 11:36 AM · Toolforge, cloud-services-team (Kanban), Epic
aborrero reassigned T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem from Jclark-ctr to MoritzMuehlenhoff.

I don't see errors anymore on cloudnet1004 after the kernel upgrade. I think we should do the same on the other server (cloudnet1003).

Fri, Feb 26, 9:30 AM · cloud-services-team (Hardware), SRE, ops-eqiad

Thu, Feb 25

aborrero closed T272963: cloudgw: develop HA setup, a subtask of T270704: cloud: introduce new edge network architecture for eqiad1 and codfw1dev, as Resolved.
Thu, Feb 25, 3:34 PM · cloud-services-team (Kanban)
aborrero closed T272963: cloudgw: develop HA setup as Resolved.
Thu, Feb 25, 3:34 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero closed T275483: neutron: introduce a mechanism for setting arbitrary sysctl on netns creating, a subtask of T268335: cloud: neutron l3 agent: improve failover handling, as Resolved.
Thu, Feb 25, 2:57 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero closed T275483: neutron: introduce a mechanism for setting arbitrary sysctl on netns creating as Resolved.
Thu, Feb 25, 2:57 PM · cloud-services-team (Kanban)

Wed, Feb 24

aborrero updated the task description for T275605: cloudmetrics1002: mysterious issue.
Wed, Feb 24, 11:39 AM · cloud-services-team (Kanban)
aborrero added a parent task for T275605: cloudmetrics1002: mysterious issue: T165784: rack/setup/install labmon1002.
Wed, Feb 24, 11:39 AM · cloud-services-team (Kanban)
aborrero added a subtask for T165784: rack/setup/install labmon1002: T275605: cloudmetrics1002: mysterious issue.
Wed, Feb 24, 11:39 AM · cloud-services-team (Kanban), Cloud-Services, SRE
aborrero created T275605: cloudmetrics1002: mysterious issue.
Wed, Feb 24, 11:37 AM · cloud-services-team (Kanban)
aborrero created T275599: debmonitor: returns proxy error when user is in too many groups.
Wed, Feb 24, 9:28 AM · SRE-tools
aborrero added a comment to T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem.

Looking good so far. I'll wait another couple days before drawing more conclusions.

Wed, Feb 24, 9:25 AM · cloud-services-team (Hardware), SRE, ops-eqiad

Tue, Feb 23

aborrero closed T272397: cloud: drop NAT exception for dumps NFS as Resolved.

This is done, we merged the puppet change, plus:

  • we don't reload neutron-l3-agent on puppet changes (on purpose, to avoid failover noise)
  • on the cloudnet servers, we manually deleted the iptables rules that previously implemented the NAT exception
Tue, Feb 23, 5:00 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero closed T272397: cloud: drop NAT exception for dumps NFS, a subtask of T272395: Cloud: reduce NAT exceptions from cloud to production, as Resolved.
Tue, Feb 23, 4:59 PM · cloud-services-team (Kanban), Epic
aborrero created T275483: neutron: introduce a mechanism for setting arbitrary sysctl on netns creating.
Tue, Feb 23, 11:05 AM · cloud-services-team (Kanban)
aborrero added a comment to T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem.

Ok, now running the new kernel, will leave it running at least a couple of days and see what happens:

Tue, Feb 23, 10:56 AM · cloud-services-team (Hardware), SRE, ops-eqiad
aborrero triaged T274801: cookbook: tracebacks instead of showing proper error message when running without proper perms as Lowest priority.

BTW, this is just a cosmetic thing, so I'm changing priority to 'lowest' :-)

Tue, Feb 23, 10:53 AM · SRE-tools
aborrero closed T274139: Decide TLS auth proxy method for the new toolforge jobs framework as Resolved.

This is solved by now:

Tue, Feb 23, 10:46 AM · cloud-services-team (Kanban), Toolforge
aborrero triaged T275478: toolsbeta: ingress admission controller doesn't accept valid FQDN patterns as Medium priority.
Tue, Feb 23, 10:46 AM · cloud-services-team (Kanban), Toolforge
aborrero closed T274139: Decide TLS auth proxy method for the new toolforge jobs framework, a subtask of T251917: Design the Jobs service in k8s, as Resolved.
Tue, Feb 23, 10:45 AM · cloud-services-team (Kanban), Toolforge
aborrero renamed T275478: toolsbeta: ingress admission controller doesn't accept valid FQDN patterns from toolsbeta: ingress admission controller doesn't accept valid domains to toolsbeta: ingress admission controller doesn't accept valid FQDN patterns.
Tue, Feb 23, 10:43 AM · cloud-services-team (Kanban), Toolforge
aborrero created T275478: toolsbeta: ingress admission controller doesn't accept valid FQDN patterns.
Tue, Feb 23, 10:43 AM · cloud-services-team (Kanban), Toolforge

Fri, Feb 19

aborrero added a comment to T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem.

Unfortunately, the server still shows the same problems, and even self-rebooted overnight.

Fri, Feb 19, 10:13 AM · cloud-services-team (Hardware), SRE, ops-eqiad

Thu, Feb 18

aborrero added a comment to T274801: cookbook: tracebacks instead of showing proper error message when running without proper perms.

My expectation would be something like:

Thu, Feb 18, 4:01 PM · SRE-tools
aborrero added a comment to T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem.

Rebooted the server for a clean start. I see the driver and firmware being loaded. I'll leave the output here for reference:

aborrero@cloudnet1004:~ $ sudo dmesg -T | grep bnx2x
[Thu Feb 18 14:52:35 2021] bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 (2014/02/10)
[Thu Feb 18 14:52:35 2021] bnx2x 0000:04:00.0: msix capability found
[Thu Feb 18 14:52:35 2021] bnx2x 0000:04:00.0: part number 394D4342-31383735-31543030-47303030
[Thu Feb 18 14:52:36 2021] bnx2x 0000:04:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
[Thu Feb 18 14:52:36 2021] bnx2x 0000:04:00.1: msix capability found
[Thu Feb 18 14:52:36 2021] bnx2x 0000:04:00.1: part number 394D4342-31383735-31543030-47303030
[Thu Feb 18 14:52:37 2021] bnx2x 0000:04:00.1: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
[Thu Feb 18 14:52:37 2021] bnx2x 0000:04:00.0 eno49: renamed from eth0
[Thu Feb 18 14:52:38 2021] bnx2x 0000:04:00.1 eno50: renamed from eth1
[Thu Feb 18 14:52:49 2021] bnx2x 0000:04:00.0: firmware: direct-loading firmware bnx2x/bnx2x-e2-7.13.1.0.fw
[Thu Feb 18 14:52:50 2021] bnx2x 0000:04:00.0 eno49: using MSI-X  IRQs: sp 53  fp[0] 55 ... fp[7] 62
[Thu Feb 18 14:52:50 2021] bnx2x 0000:04:00.0 eno49: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
[Thu Feb 18 14:52:57 2021] bnx2x 0000:04:00.1 eno50: using MSI-X  IRQs: sp 64  fp[0] 66 ... fp[7] 75
[Thu Feb 18 14:52:58 2021] bnx2x 0000:04:00.1 eno50: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit
Thu, Feb 18, 2:57 PM · cloud-services-team (Hardware), SRE, ops-eqiad
aborrero added a comment to T274801: cookbook: tracebacks instead of showing proper error message when running without proper perms.

Again, I'm not familiar with the codebase, but what about a simple try: when calling the cookbook backend, catching that particular error and printing something more friendly?
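
A rough sketch of that suggestion, under stated assumptions: `run_backend` is a hypothetical stand-in for the real cookbook backend call, and the built-in `PermissionError` is only a guess at what the backend raises when run without proper permissions.

```python
import sys


def run_backend(args):
    """Hypothetical stand-in for the real cookbook backend call."""
    raise PermissionError("cannot open log file")


def main(args, backend=run_backend):
    """Call the backend and turn a permission failure into a short message.

    Returns a process exit code instead of letting the traceback escape.
    """
    try:
        backend(args)
    except PermissionError as error:
        # Assumption: the failure surfaces as the built-in PermissionError.
        print(f"error: {error} (are you running with the right permissions?)",
              file=sys.stderr)
        return 1
    return 0
```

Catching only the one known exception keeps unexpected failures visible with their full traceback.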

Thu, Feb 18, 1:36 PM · SRE-tools
aborrero moved T275129: cloudgw: linux kernel >= 5.6 highly convenient from Inbox to Watching on the cloud-services-team (Kanban) board.
Thu, Feb 18, 12:42 PM · SRE, cloud-services-team (Kanban)
aborrero created T275129: cloudgw: linux kernel >= 5.6 highly convenient.
Thu, Feb 18, 12:34 PM · SRE, cloud-services-team (Kanban)

Wed, Feb 17

aborrero added a comment to T272397: cloud: drop NAT exception for dumps NFS.

We decided to:

  • send a brief heads up message to community mailing lists about this change
  • schedule an operation window next week (probably 2021-02-23)
Wed, Feb 17, 4:52 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero added a comment to T274801: cookbook: tracebacks instead of showing proper error message when running without proper perms.

I don't know the codebase, but perhaps showing the help message before configuring the log is an "easy" fix for this particular case.

Wed, Feb 17, 4:29 PM · SRE-tools
aborrero added a comment to T268335: cloud: neutron l3 agent: improve failover handling.

HINT: activate conntrack_tcp_be_liberal in the neutron netns.
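
For illustration, the hint could be applied with something like the sketch below. It assumes the setting meant is the Linux sysctl `net.netfilter.nf_conntrack_tcp_be_liberal`, set from inside the namespace with `ip netns exec`. The namespace name is a placeholder, and the helper only builds the command, it does not run it.

```python
import shlex


def liberal_conntrack_command(netns):
    """Build the command that enables liberal TCP tracking inside a netns."""
    return [
        "ip", "netns", "exec", netns,
        "sysctl", "-w", "net.netfilter.nf_conntrack_tcp_be_liberal=1",
    ]


# "qrouter-example" is a placeholder namespace name, not a real router ID.
print(shlex.join(liberal_conntrack_command("qrouter-example")))
```

Running the resulting command requires root privileges on the host that owns the namespace.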

Wed, Feb 17, 12:45 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero added a comment to T272963: cloudgw: develop HA setup.

This is in very good shape. I tested several failover scenarios:

  • manually stop keepalived in the primary VRRP node
  • reboot of the primary VRRP node
  • flapping (backup -> primary -> backup -> primary)
Wed, Feb 17, 12:37 PM · Patch-For-Review, cloud-services-team (Kanban)

Tue, Feb 16

aborrero updated the task description for T274871: cluebotng: IRC freenode activity causes flood warnings, but there is a way to stop that.
Tue, Feb 16, 11:22 AM · cloud-services-team (Kanban), Tools
aborrero updated the task description for T274871: cluebotng: IRC freenode activity causes flood warnings, but there is a way to stop that.
Tue, Feb 16, 11:07 AM · cloud-services-team (Kanban), Tools
aborrero updated subscribers of T274871: cluebotng: IRC freenode activity causes flood warnings, but there is a way to stop that.
Tue, Feb 16, 11:04 AM · cloud-services-team (Kanban), Tools
aborrero triaged T274871: cluebotng: IRC freenode activity causes flood warnings, but there is a way to stop that as High priority.
Tue, Feb 16, 10:59 AM · cloud-services-team (Kanban), Tools
aborrero updated the task description for T274871: cluebotng: IRC freenode activity causes flood warnings, but there is a way to stop that.
Tue, Feb 16, 10:58 AM · cloud-services-team (Kanban), Tools
aborrero created T274871: cluebotng: IRC freenode activity causes flood warnings, but there is a way to stop that.
Tue, Feb 16, 10:57 AM · cloud-services-team (Kanban), Tools

Mon, Feb 15

aborrero created T274801: cookbook: tracebacks instead of showing proper error message when running without proper perms.
Mon, Feb 15, 4:01 PM · SRE-tools
aborrero created T274782: PCC jobs running on compiler1001.puppet-diffs.eqiad.wmflabs fails because disk is full.
Mon, Feb 15, 12:51 PM · puppet-compiler, Release-Engineering-Team
aborrero added a comment to T272397: cloud: drop NAT exception for dumps NFS.

The change is ready. I will coordinate with @Bstorm for an operation window soon.

Mon, Feb 15, 12:11 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero awarded T209953: Use lookup() instead of hiera() in Puppet a Burninate token.
Mon, Feb 15, 10:14 AM · Patch-For-Review, Epic, SRE, cloud-services-team (Kanban)

Thu, Feb 11

aborrero raised the priority of T273598: openstack train: permission bug in neutron-linuxbridge-agent from Low to High.
Thu, Feb 11, 5:48 PM · cloud-services-team (Kanban)
aborrero moved T268621: Move some of wikimediacloud.org 185.15.56.0/23 to Netbox from Soon! to Watching on the cloud-services-team (Kanban) board.
Thu, Feb 11, 5:46 PM · cloud-services-team (Kanban), Traffic, SRE, DNS, netbox
aborrero closed T245230: Investigate cpu/ram requests and limits for DaemonSets pods as Declined.

We decided to decline this task in the backlog grooming meeting.

Thu, Feb 11, 5:42 PM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero moved T236399: Upgrade mariadb on toolsdb servers to 10.1.44 from Soon! to Inbox on the cloud-services-team (Kanban) board.
Thu, Feb 11, 5:41 PM · Data-Services, cloud-services-team (Kanban), Tools
aborrero closed T143639: Write a simple script that handles failovering proxies (or move behind HA proxy!) as Resolved.

At this point, we have a PoC over in the PAWS project. If we can demonstrate good behavior in a quick failover test, we could start to deploy the functionality more widely. There's a small routing issue to handle first, brought up in T257534.

Thu, Feb 11, 5:40 PM · Sustainability (Incident Followup), cloud-services-team (Kanban), Cloud-Services
aborrero closed T192156: Review encoding of all OpenStack databases as Invalid.

No longer valid.

Thu, Feb 11, 5:37 PM · cloud-services-team (Kanban), Cloud-VPS
aborrero raised the priority of T151704: Freenode sometimes throttles bot connections from tools from Low to Medium.
Thu, Feb 11, 5:36 PM · Patch-For-Review, cloud-services-team (Kanban), wikimedia-irc-freenode, Toolforge
aborrero triaged T272114: Replace all disk-usage flavor variants with Cinder use (was: Cinder storage vs. ephemeral storage vs. flavor) as Medium priority.
Thu, Feb 11, 5:32 PM · Patch-For-Review, cloud-services-team (Kanban), Cloud-VPS
aborrero triaged T271862: Update toolserver.org redirects to use toolforge.org as Low priority.
Thu, Feb 11, 5:32 PM · Tools, Toolforge, cloud-services-team (Kanban)
aborrero triaged T272117: Clean up deprecation warnings in OpenStack logs as Medium priority.
Thu, Feb 11, 5:30 PM · Patch-For-Review, cloud-services-team (Kanban), Cloud-VPS
aborrero triaged T266915: "Unable to query nbdime API" error as Low priority.
Thu, Feb 11, 5:30 PM · cloud-services-team (Kanban), PAWS
aborrero closed T272566: cloud: review and cleanup unused puppet code, a subtask of T272559: Unused puppet resources audit, early 2021, as Resolved.
Thu, Feb 11, 5:29 PM · Patch-For-Review, SRE, Puppet
aborrero closed T272566: cloud: review and cleanup unused puppet code as Resolved.

Solved by https://gerrit.wikimedia.org/r/c/operations/puppet/+/663027

Thu, Feb 11, 5:29 PM · cloud-services-team (Kanban)
aborrero triaged T272795: Upload file to download.wmcloud.org as Low priority.
Thu, Feb 11, 5:27 PM · cloud-services-team (Kanban)
aborrero triaged T272905: Reduce privs of metrics pods where we can as Low priority.
Thu, Feb 11, 5:26 PM · Toolforge, cloud-services-team (Kanban)
aborrero triaged T273150: OpenStack services should use system users to talk to Keystone as Low priority.
Thu, Feb 11, 5:25 PM · cloud-services-team (Kanban)
aborrero closed T273706: puppet compiler hiera errors on hosts under .wikimedia.cloud as Resolved.

We believe this is done.

Thu, Feb 11, 5:24 PM · cloud-services-team (Kanban)
aborrero triaged T273792: [cloudvirt] Enable and test jumbo frames to ceph osds as Medium priority.
Thu, Feb 11, 5:24 PM · cloud-services-team (Kanban)
aborrero triaged T273794: [ceph] Upgrade to Octopus 15.2.8 as Medium priority.
Thu, Feb 11, 5:21 PM · cloud-services-team (Kanban)
aborrero triaged T273808: DeferredUpdates: Deferred update 'AtomicSectionUpdate_EchoNotificationMapper::insert' failed to run. as Medium priority.
Thu, Feb 11, 5:20 PM · wikitech.wikimedia.org, cloud-services-team (Kanban), Growth-Team, Notifications, Wikimedia-production-error
aborrero triaged T273896: labs/toollabs fails debian-glue-unstable for lintian errors caused by the config as Medium priority.
Thu, Feb 11, 5:19 PM · cloud-services-team (Kanban), Continuous-Integration-Config
aborrero triaged T273959: cloud: monitor/alert on health of TLS certs used on shared front proxy setup as Medium priority.
Thu, Feb 11, 5:18 PM · cloud-services-team (Kanban)
aborrero triaged T274208: Upload Wikipedia corpora to download.wmcloud.org as Medium priority.
Thu, Feb 11, 5:15 PM · Cloud-Services, cloud-services-team (Kanban)
aborrero triaged T274344: Restore access to the tool discordwiki on Toolforge as Medium priority.
Thu, Feb 11, 5:14 PM · cloud-services-team (Kanban), Toolforge
aborrero assigned T274385: rework novaadmin and novaobserver project memberships to Andrew.
Thu, Feb 11, 5:13 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero triaged T274386: Allow end to end encryption through the shared web proxy as Low priority.

Untested idea for multi-hop:

--- i/modules/dynamicproxy/templates/domainproxy.conf
+++ w/modules/dynamicproxy/templates/domainproxy.conf
@@ -148,6 +148,11 @@ server {
         proxy_pass $backend;
         proxy_set_header Host $vhost;

+        # T274386: allow unverified TLS to upstream when $backend starts with
+        # `https://` to trigger TLS usage by `proxy_pass`.
+        proxy_ssl_verify off;
+        proxy_ssl_session_reuse on;
+
         proxy_set_header X-Forwarded-Proto $scheme;

         proxy_http_version 1.1;

This would also need changes in the data retrieved via domainproxy.lua to allow it to return an https:// prefixed upstream route when appropriate. We may also want to add proxy_ssl_protocols and proxy_ssl_ciphers settings in this block to tune the "internal" TLS connection configuration.

Thu, Feb 11, 5:12 PM · cloud-services-team (Kanban), Cloud-VPS
aborrero added a comment to T272397: cloud: drop NAT exception for dumps NFS.
Thu, Feb 11, 4:49 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero reassigned T267654: (Need By: TBD) rack/setup/install cloudnet2004-dev from aborrero to RobH.

I have no idea, perhaps we should ask @RobH, who created this task. This server wasn't on my radar before this task.

Thu, Feb 11, 4:38 PM · cloud-services-team (Hardware), ops-codfw, DC-Ops, SRE
aborrero added a comment to T272397: cloud: drop NAT exception for dumps NFS.

It seems we send the client address to the NFS server?

Thu, Feb 11, 4:10 PM · Patch-For-Review, cloud-services-team (Kanban)