aborrero (arturo)
Operations Engineer at Wikimedia Cloud Services Team

User Details

User Since
Oct 23 2017, 12:19 PM (121 w, 2 d)
Availability
Available
IRC Nick
arturo
LDAP User
Arturo Borrero Gonzalez
MediaWiki User
ABorrero (WMF) [ Global Accounts ]

I'm Arturo Borrero Gonzalez from Seville, Spain. I'm a Site Reliability Engineer (SRE) on the Wikimedia Cloud Services Team and a Wikimedia Foundation staff member.

You may find me in some FLOSS projects, like Netfilter and Debian.

Recent Activity

Today

aborrero created T245606: CloudVPS: enable BGP in the neutron transport network.
Wed, Feb 19, 11:58 AM · netops, Operations, cloud-services-team (Kanban)
aborrero closed T245494: CloudVPS: figure out DNS zone ownership transfers and setup, a subtask of T245173: CloudVPS: DNS improvements, as Resolved.
Wed, Feb 19, 11:04 AM · cloud-services-team (Kanban), Epic
aborrero closed T245494: CloudVPS: figure out DNS zone ownership transfers and setup as Resolved.

Ok, thanks for dealing with this.

Wed, Feb 19, 11:04 AM · cloud-services-team (Kanban)

Yesterday

aborrero added a comment to T245494: CloudVPS: figure out DNS zone ownership transfers and setup.

I remember trying both --sudo-project-id <project> and --os-project-id <project> in the cmdline, with no luck.

Tue, Feb 18, 5:45 PM · cloud-services-team (Kanban)
JHedden awarded T245495: CloudVPS: IPv6 early PoC a Like token.
Tue, Feb 18, 2:10 PM · cloud-services-team (Kanban)
aborrero created T245495: CloudVPS: IPv6 early PoC.
Tue, Feb 18, 11:16 AM · cloud-services-team (Kanban)
aborrero updated the task description for T244727: CloudVPS: networking improvements.
Tue, Feb 18, 11:11 AM · cloud-services-team (Kanban), Epic
aborrero created T245494: CloudVPS: figure out DNS zone ownership transfers and setup.
Tue, Feb 18, 11:05 AM · cloud-services-team (Kanban)
aborrero added a comment to T168677: Add new Cloud Services domains to public suffix list.

The record for wmcloud.org was created:

Tue, Feb 18, 10:22 AM · Toolforge, Cloud-VPS, cloud-services-team (Kanban)

Mon, Feb 17

aborrero added a comment to T168677: Add new Cloud Services domains to public suffix list.

BTW: the pull request to update this upstream is: https://github.com/publicsuffix/list/pull/970

Mon, Feb 17, 7:06 PM · Toolforge, Cloud-VPS, cloud-services-team (Kanban)
aborrero updated subscribers of T168677: Add new Cloud Services domains to public suffix list.

I couldn't create the TXT record in the wmcloud.org zone, either using horizon or the cmdline:

Mon, Feb 17, 7:02 PM · Toolforge, Cloud-VPS, cloud-services-team (Kanban)
aborrero added a comment to T218570: DB planning: include a writeable (?) misc DB cluster in codfw for WMCS.

Ok, I think we can safely assume we are talking only about openstack databases: designate, glance, keystone, neutron, nova*.

Mon, Feb 17, 9:59 AM · DBA, cloud-services-team (Kanban)
aborrero added a parent task for T245180: Document and test failing over prometheus: T238096: Toolforge: prometheus: refresh setup.
Mon, Feb 17, 9:36 AM · Toolforge, cloud-services-team (Kanban)
aborrero added a subtask for T238096: Toolforge: prometheus: refresh setup: T245180: Document and test failing over prometheus.
Mon, Feb 17, 9:36 AM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero updated the task description for T236565: "tools" Cloud VPS project jessie deprecation.
Mon, Feb 17, 9:35 AM · cloud-services-team (Kanban), Toolforge, Cloud-VPS (Debian Jessie Deprecation)
aborrero added a comment to T218570: DB planning: include a writeable (?) misc DB cluster in codfw for WMCS.

Will m5 content be eventually migrated to those new hosts?

Mon, Feb 17, 9:30 AM · DBA, cloud-services-team (Kanban)

Fri, Feb 14

Bstorm awarded T245180: Document and test failing over prometheus a Love token.
Fri, Feb 14, 4:34 PM · Toolforge, cloud-services-team (Kanban)
aborrero closed T245180: Document and test failing over prometheus as Resolved.

I created a wikitech page:

Fri, Feb 14, 9:42 AM · Toolforge, cloud-services-team (Kanban)

Thu, Feb 13

aborrero closed T244851: Neutron: replace NAT customization with address scopes, a subtask of T244727: CloudVPS: networking improvements, as Declined.
Thu, Feb 13, 5:17 PM · cloud-services-team (Kanban), Epic
aborrero closed T244851: Neutron: replace NAT customization with address scopes as Declined.
Thu, Feb 13, 5:17 PM · cloud-services-team (Kanban)
aborrero reassigned T244986: cloudvirt1009: Device not healthy -SMART- from aborrero to Jclark-ctr.

General hard drive failure doesn't sound good. @Jclark-ctr @Cmjohnson, please advise how to proceed.

Thu, Feb 13, 5:16 PM · ops-eqiad, cloud-services-team (Hardware), Operations
aborrero added a comment to T244851: Neutron: replace NAT customization with address scopes.

The neutron router implements the address scope mechanism by looking at the input/output interface of packets. Since the networks we are interested in are external (truly physically external) to neutron, all packets circulate using the same output interface and thus they get the general NAT applied.
This would help us if neutron were implementing the address scope mechanism by evaluating source/destination address of packets, which is not the case.
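
To illustrate the difference, here is a minimal Python sketch. It is not Neutron code: the interface names, scope labels, address ranges and both decision functions are hypothetical, chosen only to contrast an interface-based decision with an address-based one.

```python
import ipaddress

# Hypothetical mapping of router interfaces to address scopes.
# Every external network is reached through the same output interface.
IFACE_SCOPE = {"internal0": "scope-a", "external0": "scope-b"}

# Hypothetical external range we would like to reach without NAT.
NO_NAT_DESTINATIONS = [ipaddress.ip_network("192.0.2.0/24")]


def nat_by_interface(in_iface: str, out_iface: str) -> bool:
    """Interface-based decision: NAT whenever the two interfaces are in different scopes."""
    return IFACE_SCOPE[in_iface] != IFACE_SCOPE[out_iface]


def nat_by_address(dst: str) -> bool:
    """Address-based decision: skip NAT for destinations inside an exempt range."""
    addr = ipaddress.ip_address(dst)
    return not any(addr in net for net in NO_NAT_DESTINATIONS)


# Two packets leaving through the same external interface,
# one towards the exempt range and one towards anything else.
packets = [
    ("internal0", "external0", "192.0.2.10"),
    ("internal0", "external0", "198.51.100.7"),
]

by_iface = [nat_by_interface(i, o) for i, o, _ in packets]
by_addr = [nat_by_address(d) for _, _, d in packets]
```

With the interface-based check both packets are treated the same, because they share the output interface; only the address-based check can tell them apart.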

Thu, Feb 13, 5:09 PM · cloud-services-team (Kanban)
aborrero updated the task description for T243766: Cloud DNS: proposal for new DNS service names.
Thu, Feb 13, 5:05 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero updated the task description for T245174: CloudVPS: automatically create per-project subdomain.
Thu, Feb 13, 5:03 PM · cloud-services-team (Kanban)
aborrero created T245174: CloudVPS: automatically create per-project subdomain.
Thu, Feb 13, 5:01 PM · cloud-services-team (Kanban)
aborrero added a parent task for T243766: Cloud DNS: proposal for new DNS service names: T245173: CloudVPS: DNS improvements.
Thu, Feb 13, 4:57 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero added a subtask for T245173: CloudVPS: DNS improvements: T243766: Cloud DNS: proposal for new DNS service names.
Thu, Feb 13, 4:57 PM · cloud-services-team (Kanban), Epic
aborrero created T245173: CloudVPS: DNS improvements.
Thu, Feb 13, 4:56 PM · cloud-services-team (Kanban), Epic
aborrero triaged T243766: Cloud DNS: proposal for new DNS service names as Medium priority.
Thu, Feb 13, 11:52 AM · Patch-For-Review, cloud-services-team (Kanban)
aborrero claimed T243766: Cloud DNS: proposal for new DNS service names.

This was accepted by the WMCS team.

Thu, Feb 13, 11:52 AM · Patch-For-Review, cloud-services-team (Kanban)
aborrero awarded T244954: Webservice shell repeatedly times out a Love token.
Thu, Feb 13, 11:12 AM · Kubernetes, cloud-services-team (Kanban), Toolforge

Wed, Feb 12

aborrero added a comment to T198479: labvirt1009 HP Raid alert.

Mention: T244986: cloudvirt1009: Device not healthy -SMART-

Wed, Feb 12, 4:20 PM · cloud-services-team (Kanban), Operations, ops-eqiad, DC-Ops
aborrero closed T244986: cloudvirt1009: Device not healthy -SMART- as Resolved.

Apparently the issue resolved itself:

Wed, Feb 12, 11:41 AM · ops-eqiad, cloud-services-team (Hardware), Operations
aborrero created T244986: cloudvirt1009: Device not healthy -SMART-.
Wed, Feb 12, 11:22 AM · ops-eqiad, cloud-services-team (Hardware), Operations
aborrero created M297: Neutron NAT.
Wed, Feb 12, 10:13 AM
aborrero added a comment to T244933: Re-organize hiera lookups for cloud-vps instances.

This task might be duplicated: T244222: CloudVPS: hiera refactor

Wed, Feb 12, 9:55 AM · Epic, cloud-services-team (Kanban)
aborrero added a subtask for T244222: CloudVPS: hiera refactor: T244933: Re-organize hiera lookups for cloud-vps instances.
Wed, Feb 12, 9:53 AM · Patch-For-Review, Epic, cloud-services-team (Kanban)
aborrero removed a subtask for T229441: CloudVPS: codfw1dev: missing bits: T244933: Re-organize hiera lookups for cloud-vps instances.
Wed, Feb 12, 9:53 AM · Epic, cloud-services-team (Kanban)
aborrero edited parent tasks for T244933: Re-organize hiera lookups for cloud-vps instances, added: T244222: CloudVPS: hiera refactor; removed: T229441: CloudVPS: codfw1dev: missing bits.
Wed, Feb 12, 9:53 AM · Epic, cloud-services-team (Kanban)

Tue, Feb 11

aborrero added a subtask for T244727: CloudVPS: networking improvements: T196116: VLAN tagging in Wikimedia Cloud.
Tue, Feb 11, 5:43 PM · cloud-services-team (Kanban), Epic
aborrero added a parent task for T196116: VLAN tagging in Wikimedia Cloud: T244727: CloudVPS: networking improvements.
Tue, Feb 11, 5:43 PM · cloud-services-team (Kanban), Cloud-Services
aborrero closed T215779: Tools prometheus can't talk to kubelet running on tools-worker as Declined.

This ticket refers to the legacy kubernetes cluster, which is in the process of being deprecated.

Tue, Feb 11, 5:10 PM · cloud-services-team (Kanban), Tools
aborrero updated the task description for T244851: Neutron: replace NAT customization with address scopes.
Tue, Feb 11, 12:42 PM · cloud-services-team (Kanban)
aborrero claimed T244851: Neutron: replace NAT customization with address scopes.
Tue, Feb 11, 12:41 PM · cloud-services-team (Kanban)
aborrero created T244851: Neutron: replace NAT customization with address scopes.
Tue, Feb 11, 12:39 PM · cloud-services-team (Kanban)
aborrero updated the task description for T244727: CloudVPS: networking improvements.
Tue, Feb 11, 12:35 PM · cloud-services-team (Kanban), Epic
aborrero added a comment to T239404: toolforge: new k8s: evaluate DNS (coredns) autoscale options.

Bumping this. I'm not sure if we are still interested in this.

Tue, Feb 11, 10:57 AM · Toolforge, cloud-services-team (Kanban), Kubernetes

Mon, Feb 10

aborrero updated the task description for T244727: CloudVPS: networking improvements.
Mon, Feb 10, 12:53 PM · cloud-services-team (Kanban), Epic
aborrero updated the task description for T244727: CloudVPS: networking improvements.
Mon, Feb 10, 12:10 PM · cloud-services-team (Kanban), Epic
aborrero added a comment to T244727: CloudVPS: networking improvements.

some related openstack docs: https://docs.openstack.org/liberty/networking-guide/scenario-classic-ovs.html

Mon, Feb 10, 12:09 PM · cloud-services-team (Kanban), Epic
aborrero triaged T244727: CloudVPS: networking improvements as Medium priority.
Mon, Feb 10, 12:08 PM · cloud-services-team (Kanban), Epic
aborrero created T244727: CloudVPS: networking improvements.
Mon, Feb 10, 12:07 PM · cloud-services-team (Kanban), Epic

Fri, Feb 7

aborrero closed T238096: Toolforge: prometheus: refresh setup, a subtask of T237643: toolforge: new k8s: figure out metrics / observability, as Resolved.
Fri, Feb 7, 10:56 AM · Patch-For-Review, Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero closed T238096: Toolforge: prometheus: refresh setup as Resolved.

Work here is done. Please reopen if required.

Fri, Feb 7, 10:56 AM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero closed T238820: CloudVPS: consider mirroring debian repos for openstack packages, a subtask of T241347: upgrade cloud-vps openstack to Openstack version 'Pike', as Resolved.
Fri, Feb 7, 9:50 AM · cloud-services-team (Kanban), Cloud-VPS
aborrero closed T238820: CloudVPS: consider mirroring debian repos for openstack packages, a subtask of T241348: Upgrade cloudservices nodes to openstack Pike, as Resolved.
Fri, Feb 7, 9:50 AM · Cloud-VPS, cloud-services-team (Kanban)
aborrero closed T238820: CloudVPS: consider mirroring debian repos for openstack packages as Resolved.

This is done. Please reopen if required.

Fri, Feb 7, 9:50 AM · cloud-services-team (Kanban), Cloud-Services

Thu, Feb 6

aborrero added a comment to T244473: Toolforge: both domains in parallel and OAuth.

For the record, yes, we are aware that the change in the document root for webservices changes the URL scheme. This is something tool developers should update in their apps.

Thu, Feb 6, 11:30 AM · Toolforge, cloud-services-team (Kanban)
aborrero created T244473: Toolforge: both domains in parallel and OAuth.
Thu, Feb 6, 11:21 AM · Toolforge, cloud-services-team (Kanban)
aborrero added a comment to T234617: Toolforge. introduce new domain toolforge.org.

TODO: after the last changes, requests to https://tools.wmflabs.org/ no longer redirect to the admin tool. They end up in fourohfour for it to handle. We should consider adding some special handling for this case in fourohfour. CC @bd808

Thu, Feb 6, 10:47 AM · Goal, Toolforge, cloud-services-team (Kanban), Kubernetes

Wed, Feb 5

aborrero added a comment to T244222: CloudVPS: hiera refactor.

Proposal:

  • drop project/host hieradata from operations/puppet.git. Declare we have horizon for that. I doubt it is properly working anyway (for the same reason we can't use $::wmcs_deployment)
  • introduce per-deployment hiera keys in operations/puppet.git, but make the search hierarchy hardcoded rather than dependent on hiera lookups (we have different files for different deployments anyway)

This way we:

  • eliminate the potentially dangerous facts gathering
  • eliminate the duplication in how hiera is established (horizon vs ops/puppet.git)
  • effectively support per-deployment hiera data, which is the ultimate goal we are after.

Will try to create a patch for this.
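
As a rough illustration of the second point, the Python sketch below models a search hierarchy that is hardcoded per deployment instead of being derived from facts or from hiera itself. It is not Puppet or hiera code: the layer names, keys, values and the "eqiad1" deployment name are hypothetical.

```python
# Hypothetical hiera data, keyed by layer name.
DATA = {
    "deployment/codfw1dev": {"dns_domain": "codfw1dev.example"},
    "deployment/eqiad1": {"dns_domain": "eqiad1.example"},
    "common": {"dns_domain": "example", "ntp_server": "ntp.example"},
}

# One hardcoded hierarchy per deployment (think: one config file each).
# No fact gathering or hiera lookup is needed to pick the layers.
HIERARCHY = {
    "codfw1dev": ["deployment/codfw1dev", "common"],
    "eqiad1": ["deployment/eqiad1", "common"],
}


def lookup(deployment: str, key: str) -> str:
    """Return the first value found for key, walking the deployment's layers in order."""
    for layer in HIERARCHY[deployment]:
        if key in DATA[layer]:
            return DATA[layer][key]
    raise KeyError(key)
```

For example, `lookup("codfw1dev", "dns_domain")` is answered by the deployment layer, while `lookup("codfw1dev", "ntp_server")` falls through to the common layer.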

Wed, Feb 5, 1:11 PM · Patch-For-Review, Epic, cloud-services-team (Kanban)
aborrero added a comment to T244222: CloudVPS: hiera refactor.
  • drop project/host hieradata from operations/puppet.git. Declare we have horizon for that. I doubt it is properly working anyway (for the same reason we can't use $::wmcs_deployment)
  • introduce per-deployment hiera keys in operations/puppet.git, but make the search hierarchy hardcoded rather than dependent on hiera lookups (we have different files for different deployments anyway)
Wed, Feb 5, 11:11 AM · Patch-For-Review, Epic, cloud-services-team (Kanban)

Tue, Feb 4

aborrero added a comment to T244222: CloudVPS: hiera refactor.

Trying to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/569230 I discovered the following issue:

Tue, Feb 4, 12:47 PM · Patch-For-Review, Epic, cloud-services-team (Kanban)
aborrero created T244222: CloudVPS: hiera refactor.
Tue, Feb 4, 12:41 PM · Patch-For-Review, Epic, cloud-services-team (Kanban)

Mon, Feb 3

aborrero added a comment to T244111: request to build virtuoso-opensource-7 debian package for buster.

I tried installing in Buster the package from Experimental as @Epantaleo suggested. APT was unable to install the package due to missing dependencies, but I checked and this is the kind of problem that can be solved by building the package directly in the target release (backporting). This is what I did, for future reference.

Mon, Feb 3, 9:56 AM

Fri, Jan 31

aborrero added a comment to T238096: Toolforge: prometheus: refresh setup.

Does this include updates to the toolforge prometheus instance? I wanted to funnel PAWS metrics there but it was such an old version when I tried it I gave up on getting it to work.

Fri, Jan 31, 1:15 PM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero updated the task description for T238096: Toolforge: prometheus: refresh setup.
Fri, Jan 31, 12:12 PM · Toolforge, cloud-services-team (Kanban), Kubernetes

Thu, Jan 30

Chicocvenancio awarded T238096: Toolforge: prometheus: refresh setup a Love token.
Thu, Jan 30, 4:57 PM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero updated the task description for T238096: Toolforge: prometheus: refresh setup.
Thu, Jan 30, 10:19 AM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero claimed T238096: Toolforge: prometheus: refresh setup.
Thu, Jan 30, 10:18 AM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero moved T243936: labtestpuppetmaster2001 showing Failed in Netbox from Inbox to Watching on the cloud-services-team (Kanban) board.
Thu, Jan 30, 9:55 AM · cloud-services-team (Kanban)
aborrero assigned T243936: labtestpuppetmaster2001 showing Failed in Netbox to Papaul.

The server is still online and serving requests as usual.
We have a project to introduce a replacement in this task: T242607: Create in-cloud puppetmaster for codfw1dev

Thu, Jan 30, 9:55 AM · cloud-services-team (Kanban)

Wed, Jan 29

aborrero closed T243556: Fix internal TLD in use in codfw1dev, a subtask of T242607: Create in-cloud puppetmaster for codfw1dev, as Resolved.
Wed, Jan 29, 5:45 PM · Epic, cloud-services-team (Kanban)
aborrero closed T243556: Fix internal TLD in use in codfw1dev as Resolved.

Closing task now, feel free to reopen if required.

Wed, Jan 29, 5:45 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero added a comment to T243556: Fix internal TLD in use in codfw1dev.

Things are far better now:

Wed, Jan 29, 5:45 PM · Cloud-VPS, cloud-services-team (Kanban)

Tue, Jan 28

aborrero added a comment to T243556: Fix internal TLD in use in codfw1dev.

Ok, I got to this point:

Tue, Jan 28, 6:11 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero closed T243831: toolforge: gridengine: a case of apparently orphaned jobs running (jarbot) as Invalid.

I discovered there are many tools with similar names (jarbot-ii, jarbot-iii) apparently doing the same thing. No orphan procs, but actual tools!

Tue, Jan 28, 1:47 PM · cloud-services-team (Kanban)
aborrero created T243833: CloudVPS: cumin key might be misconfigured somewhere.
Tue, Jan 28, 1:18 PM · cloud-services-team (Kanban)
aborrero triaged T243831: toolforge: gridengine: a case of apparently orphaned jobs running (jarbot) as Medium priority.
Tue, Jan 28, 1:05 PM · cloud-services-team (Kanban)
aborrero added a comment to T243831: toolforge: gridengine: a case of apparently orphaned jobs running (jarbot).

I used this to try detecting the procs:

Tue, Jan 28, 1:02 PM · cloud-services-team (Kanban)
aborrero renamed T243831: toolforge: gridengine: a case of apparently orphaned jobs running (jarbot) from toolforge: gridengine: a case of apaprently orphaned jobs running (jarbot) to toolforge: gridengine: a case of apparently orphaned jobs running (jarbot).
Tue, Jan 28, 12:46 PM · cloud-services-team (Kanban)
aborrero created T243831: toolforge: gridengine: a case of apparently orphaned jobs running (jarbot).
Tue, Jan 28, 12:45 PM · cloud-services-team (Kanban)
aborrero added a comment to T243766: Cloud DNS: proposal for new DNS service names.

I just discovered that codfw1dev-ns0.wikimedia.org exists indeed.

Tue, Jan 28, 10:55 AM · Patch-For-Review, cloud-services-team (Kanban)
aborrero updated the task description for T243766: Cloud DNS: proposal for new DNS service names.
Tue, Jan 28, 10:54 AM · Patch-For-Review, cloud-services-team (Kanban)

Mon, Jan 27

aborrero moved T243766: Cloud DNS: proposal for new DNS service names from Inbox to Needs discussion on the cloud-services-team (Kanban) board.
Mon, Jan 27, 1:38 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero created T243766: Cloud DNS: proposal for new DNS service names.
Mon, Jan 27, 1:35 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero added a comment to T243556: Fix internal TLD in use in codfw1dev.

+1 for cloudinfra.

Mon, Jan 27, 12:02 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero added a project to T243734: X-Wikimedia-Debug header does nothing on Toolforge web services: cloud-services-team (Kanban).
Mon, Jan 27, 11:41 AM · cloud-services-team (Kanban), Toolforge
aborrero added a comment to T243734: X-Wikimedia-Debug header does nothing on Toolforge web services.

@russblau could you please describe what behavior you would like to see / were expecting?

Mon, Jan 27, 11:40 AM · cloud-services-team (Kanban), Toolforge
aborrero added a comment to T243734: X-Wikimedia-Debug header does nothing on Toolforge web services.

Both error page handling (T103662) and this debug setup are under review. This is affected by our current effort to introduce the new kubernetes cluster into Toolforge, which has a new ingress mechanism and a new error page handling mechanism.
You can read more here: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Networking_and_ingress
Moreover, when we finally introduce the new domain toolforge.org (something we plan to do this quarter), error handling and debugging options may need to be reworked completely on our side.

Mon, Jan 27, 11:27 AM · cloud-services-team (Kanban), Toolforge
aborrero added a subtask for T103662: Urlproxy (Toolforge front proxy) should not overwrite error page: T243734: X-Wikimedia-Debug header does nothing on Toolforge web services.
Mon, Jan 27, 11:09 AM · cloud-services-team (Kanban), Regression, Toolforge, Cloud-VPS
aborrero added a parent task for T243734: X-Wikimedia-Debug header does nothing on Toolforge web services: T103662: Urlproxy (Toolforge front proxy) should not overwrite error page.
Mon, Jan 27, 11:09 AM · cloud-services-team (Kanban), Toolforge

Fri, Jan 24

aborrero triaged T243556: Fix internal TLD in use in codfw1dev as Medium priority.
Fri, Jan 24, 6:09 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero added a comment to T243556: Fix internal TLD in use in codfw1dev.

I had a look at this today.

Fri, Jan 24, 5:59 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero moved T241494: Degraded RAID on cloudvirt1014 from Backlog to Hardware faults on the cloud-services-team (Hardware) board.
Fri, Jan 24, 1:15 PM · Patch-For-Review, cloud-services-team (Hardware), ops-eqiad, Operations
aborrero edited projects for T241494: Degraded RAID on cloudvirt1014, added: cloud-services-team (Hardware); removed cloud-services-team (Kanban).
Fri, Jan 24, 1:15 PM · Patch-For-Review, cloud-services-team (Hardware), ops-eqiad, Operations
aborrero moved T241494: Degraded RAID on cloudvirt1014 from Inbox to Watching on the cloud-services-team (Kanban) board.
Fri, Jan 24, 1:14 PM · Patch-For-Review, cloud-services-team (Hardware), ops-eqiad, Operations
aborrero added a project to T241494: Degraded RAID on cloudvirt1014: cloud-services-team (Kanban).
Fri, Jan 24, 1:14 PM · Patch-For-Review, cloud-services-team (Hardware), ops-eqiad, Operations
aborrero added a comment to T241494: Degraded RAID on cloudvirt1014.

BTW this server has active workloads (pooled) at the moment. @Jclark-ctr, please coordinate with WMCS before shutting the server down.

Fri, Jan 24, 1:01 PM · Patch-For-Review, cloud-services-team (Hardware), ops-eqiad, Operations
aborrero added a comment to T241494: Degraded RAID on cloudvirt1014.

@Jclark-ctr I believe this server may need the BBU checked/replaced, but I may be wrong.

Fri, Jan 24, 1:01 PM · Patch-For-Review, cloud-services-team (Hardware), ops-eqiad, Operations