
aborrero (arturo)
Operations Engineer at Wikimedia Cloud Services Team

User Details

User Since
Oct 23 2017, 12:19 PM (87 w, 7 h)
Availability
Available
IRC Nick
arturo
LDAP User
Arturo Borrero Gonzalez
MediaWiki User
ABorrero (WMF) [ Global Accounts ]

I'm Arturo Borrero Gonzalez, from Seville, Spain. I'm an Operations Engineer on the Wikimedia Cloud Services Team and a member of the Wikimedia Foundation staff.

You may find me in some FLOSS projects, like Netfilter and Debian.

Recent Activity

Today

aborrero added a comment to T215531: Deploy upgraded Kubernetes to toolsbeta.

Related T177393: Implement authentication/authorization in Kubernetes clusters

Mon, Jun 24, 3:42 PM · Epic, Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero added a comment to T224188: rack/setup/install (3) new osd ceph nodes.

That sounds reasonable for the PoC, depending on rack space. @faidon for the last word.

Note that we don't have visibility into the cross virtual chassis links. Adding them to LibreNMS is possible but would require dev time.

Mon, Jun 24, 12:14 PM · ops-eqiad, Operations, cloud-services-team (Kanban), Cloud-Services
aborrero added a comment to T215531: Deploy upgraded Kubernetes to toolsbeta.

Ok, now that I have a working etcd setup for k8s, I will follow up with kubernetes itself.

Mon, Jun 24, 12:09 PM · Epic, Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero updated the task description for T216733: cloudvirts: ensure we're running the latest raid controller firmware.
Mon, Jun 24, 12:03 PM · Wikimedia-Incident, cloud-services-team (Kanban), Cloud-VPS
aborrero added a comment to T224188: rack/setup/install (3) new osd ceph nodes.

Worth noting that even though we will be using 10G links, we don't expect them to be fully used in any case in the short term.

Mon, Jun 24, 11:14 AM · ops-eqiad, Operations, cloud-services-team (Kanban), Cloud-Services

Fri, Jun 21

aborrero closed T226098: Toolforge: modernize deployment for etcd in k8s as Resolved.

I'm pretty happy now with how etcd looks in the puppet tree and with the resulting state on freshly installed VMs. I will probably leave it as is, move on to kubernetes itself, and see if anything else is required once k8s is actually using etcd.

Fri, Jun 21, 1:08 PM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)
aborrero closed T226098: Toolforge: modernize deployment for etcd in k8s, a subtask of T215531: Deploy upgraded Kubernetes to toolsbeta, as Resolved.
Fri, Jun 21, 1:08 PM · Epic, Toolforge, cloud-services-team (Kanban), Kubernetes
Qgil awarded T225082: Request creation of Blog VPS project a Love token.
Fri, Jun 21, 12:07 PM · Cloud-VPS (Project-requests)
aborrero closed T225082: Request creation of Blog VPS project as Resolved.

I just created this project.

Fri, Jun 21, 11:55 AM · Cloud-VPS (Project-requests)
aborrero updated subscribers of T226098: Toolforge: modernize deployment for etcd in k8s.

OK, after some help from @Joe and @Vgutierrez I got the cluster working:

Fri, Jun 21, 10:06 AM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)

Thu, Jun 20

aborrero added a comment to T226098: Toolforge: modernize deployment for etcd in k8s.

I still can't work out why etcd would do that.

Thu, Jun 20, 4:22 PM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)
aborrero added a comment to T226098: Toolforge: modernize deployment for etcd in k8s.

Using puppet certs for etcd has some other tricky parts, like:

Thu, Jun 20, 1:22 PM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)

Wed, Jun 19

Bstorm awarded T216132: CloudVPS: create wmcs-vm-fsck script a Burninate token.
Wed, Jun 19, 4:46 PM · Wikimedia-Incident, cloud-services-team (Kanban)
aborrero added a comment to T216132: CloudVPS: create wmcs-vm-fsck script.

We totally forgot about this.

Wed, Jun 19, 3:39 PM · Wikimedia-Incident, cloud-services-team (Kanban)
aborrero triaged T226098: Toolforge: modernize deployment for etcd in k8s as Normal priority.
Wed, Jun 19, 12:08 PM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)
aborrero created T226098: Toolforge: modernize deployment for etcd in k8s.
Wed, Jun 19, 12:08 PM · Patch-For-Review, Toolforge, cloud-services-team (Kanban)
Qgil awarded T225081: Request creation of Discuss VPS project a Love token.
Wed, Jun 19, 12:07 PM · Cloud-VPS (Project-requests)
aborrero created T226095: etcd: listen-peer-urls only supports IP addresses and no FQDNs.
Wed, Jun 19, 11:50 AM · cloud-services-team (Kanban), Operations
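
The constraint named in T226095 can be checked before etcd is started. The following is only a minimal sketch: the function names and the example URLs are hypothetical, and it uses nothing beyond the Python standard library to flag peer URLs whose host is an FQDN rather than a literal IP address.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(url):
    """Return True if the host part of `url` is a literal IPv4/IPv6 address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not parseable as an IP address, so it is a name (for example an FQDN).
        return False
    return True


def non_ip_urls(listen_peer_urls):
    """Split a comma-separated URL list and return entries whose host is not an IP."""
    urls = [u.strip() for u in listen_peer_urls.split(",") if u.strip()]
    return [u for u in urls if not host_is_ip(u)]


# Hypothetical values, for illustration only.
print(non_ip_urls("https://10.0.0.1:2380,https://etcd-1.example.org:2380"))
```

Any URL this reports would need its host replaced by the address the service should bind to.
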
aborrero closed T225303: sssd apparently not working fine in toolsbeta, a subtask of T217280: LDAP server running out of memory frequently and disrupting Cloud VPS clients, as Resolved.
Wed, Jun 19, 11:07 AM · cloud-services-team (Kanban), Patch-For-Review, Operations, Cloud-VPS, LDAP, Toolforge
aborrero closed T225303: sssd apparently not working fine in toolsbeta as Resolved.

That was it. I was missing from the toolsbeta.admin group.

Wed, Jun 19, 11:07 AM · cloud-services-team (Kanban), LDAP
aborrero closed T225081: Request creation of Discuss VPS project as Resolved.

Ok, this works for us as well. For the record, the new project was approved yesterday in the WMCS team meeting. Will close the task now, as I understand you will be using a VM in the discourse project [0].
Feel free to reopen if required.

Wed, Jun 19, 10:03 AM · Cloud-VPS (Project-requests)

Mon, Jun 17

ema awarded T221212: spicerack/cookbook: add additional arguments IRC/SAL logging a Love token.
Mon, Jun 17, 2:33 PM · Patch-For-Review, Operations-Software-Development, Operations

Thu, Jun 13

Andrew awarded T204840: wikitech-static: not synced a Party Time token.
Thu, Jun 13, 12:35 PM · cloud-services-team (Kanban), wikitech.wikimedia.org

Tue, Jun 11

aborrero triaged T225484: cloudvirt servers: SSL certificate expiring as Normal priority.

I just read the puppet code at modules/openstack/manifests/nova/compute/service.pp. I wonder if we could switch to automatically generated certs instead.

Tue, Jun 11, 9:07 AM · cloud-services-team (Kanban)
aborrero updated the task description for T225484: cloudvirt servers: SSL certificate expiring.
Tue, Jun 11, 9:00 AM · cloud-services-team (Kanban)
aborrero created T225484: cloudvirt servers: SSL certificate expiring.
Tue, Jun 11, 8:58 AM · cloud-services-team (Kanban)

Fri, Jun 7

aborrero added a comment to T225303: sssd apparently not working fine in toolsbeta.

Just confirmed using this command that toolsbeta is the only CloudVPS project with that special config. We may just drop the special case and move on.

Fri, Jun 7, 6:10 PM · cloud-services-team (Kanban), LDAP
aborrero added a comment to T225303: sssd apparently not working fine in toolsbeta.

If I compare the config in toolforge servers, I see:

Fri, Jun 7, 5:50 PM · cloud-services-team (Kanban), LDAP
aborrero created T225303: sssd apparently not working fine in toolsbeta.
Fri, Jun 7, 1:30 PM · cloud-services-team (Kanban), LDAP

Wed, Jun 5

aborrero added a comment to T215678: Replace each of the custom controllers with something in a new Toolforge Kubernetes setup.

Hey, I'm just catching up. I wasn't aware of the existence of this phab task, and followed all the same steps until I got here, with the very same conclusions :-)

Wed, Jun 5, 1:03 PM · Patch-For-Review, Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero merged task T224273: Toolforge: develop new k8s cluster in toolsbeta into T215531: Deploy upgraded Kubernetes to toolsbeta.
Wed, Jun 5, 11:43 AM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero merged T224273: Toolforge: develop new k8s cluster in toolsbeta into T215531: Deploy upgraded Kubernetes to toolsbeta.
Wed, Jun 5, 11:43 AM · Epic, Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero claimed T215531: Deploy upgraded Kubernetes to toolsbeta.

I will be doing some stuff related to this task, so I'm claiming it.

Wed, Jun 5, 11:40 AM · Epic, Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero moved T224558: sssd: support for Debian Jessie from Doing to Important on the cloud-services-team (Kanban) board.
Wed, Jun 5, 11:32 AM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge
aborrero closed T224743: cloudservices: unify puppet roles/profiles as Resolved.
Wed, Jun 5, 11:22 AM · cloud-services-team (Kanban)
aborrero triaged T225081: Request creation of Discuss VPS project as Normal priority.

Just setting expectations: new project requests are discussed in the WMCS team meeting every week on Tue. Next week, however, we won't have the team meeting due to the SRE offsite in Dublin. The next WMCS team meeting is expected to happen on 2019-06-18.

Wed, Jun 5, 11:08 AM · Cloud-VPS (Project-requests)
aborrero triaged T225067: labtestvirt2003: test different power management / CPU setups for faster kvm as Normal priority.
Wed, Jun 5, 9:27 AM · Continuous-Integration-Infrastructure, cloud-services-team (Kanban)

Tue, Jun 4

herron awarded T221212: spicerack/cookbook: add additional arguments IRC/SAL logging a Like token.
Tue, Jun 4, 6:21 PM · Patch-For-Review, Operations-Software-Development, Operations
aborrero added a comment to T224743: cloudservices: unify puppet roles/profiles.

@aborrero digging into the OpenStack configuration a bit more and realized that we're not using active/active services. I'd like to chat about the HA architecture and previous decisions made when you have some time.

Tue, Jun 4, 2:53 PM · cloud-services-team (Kanban)
aborrero triaged T224981: rabbitmq: connectivity issues between cloudservices1004 and rabbitmq as Low priority.
Tue, Jun 4, 12:21 PM · cloud-services-team (Kanban)
aborrero created T224981: rabbitmq: connectivity issues between cloudservices1004 and rabbitmq.
Tue, Jun 4, 12:20 PM · cloud-services-team (Kanban)
aborrero created T224977: puppet-catalog-compiler: compilation result randomly places servers in the 'failed' section.
Tue, Jun 4, 11:30 AM · Operations, puppet-compiler
aborrero closed T224424: cloudservices1003: gateway timeout error, a subtask of T221770: Upgrade cloucontrol1003/1004 to stretch/mitaka, as Resolved.
Tue, Jun 4, 11:28 AM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
aborrero closed T224424: cloudservices1003: gateway timeout error as Resolved.

I just checked, and the errors are gone. Closing task now.

Tue, Jun 4, 11:28 AM · Cloud-VPS, cloud-services-team (Kanban)

Mon, Jun 3

aborrero added a comment to T224743: cloudservices: unify puppet roles/profiles.

@aborrero Do we need to merge the roles or can we remove the primary secondary roles and profiles?

role::wmcs::openstack::eqiad1::services_primary
role::wmcs::openstack::eqiad1::services_secondary
profile::openstack::eqiad1::pdns::recursor::primary
profile::openstack::eqiad1::pdns::recursor::secondary
Mon, Jun 3, 2:08 PM · cloud-services-team (Kanban)
aborrero added a comment to T224743: cloudservices: unify puppet roles/profiles.

Apparently, the last missing bit to be able to merge both roles is:

Mon, Jun 3, 1:08 PM · cloud-services-team (Kanban)
aborrero closed T224877: prometheus-pdns-exporter: add stretch support as Resolved.

prometheus-pdns-rec-exporter should be available for Stretch, it's used on the production recursors, which are on Stretch:
https://debmonitor.wikimedia.org/packages/prometheus-pdns-rec-exporter

Mon, Jun 3, 1:05 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
aborrero closed T224877: prometheus-pdns-exporter: add stretch support, a subtask of T221769: Upgrade cloudservices1003/1004 to stretch/mitaka, as Resolved.
Mon, Jun 3, 1:05 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
aborrero updated the task description for T224877: prometheus-pdns-exporter: add stretch support.
Mon, Jun 3, 12:03 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
aborrero created T224877: prometheus-pdns-exporter: add stretch support.
Mon, Jun 3, 11:58 AM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
aborrero added a comment to T221769: Upgrade cloudservices1003/1004 to stretch/mitaka.

We will be upgrading cloudservices1003 to stretch today.

Mon, Jun 3, 10:07 AM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)

Fri, May 31

aborrero triaged T224743: cloudservices: unify puppet roles/profiles as Normal priority.
Fri, May 31, 4:57 PM · cloud-services-team (Kanban)
aborrero created T224743: cloudservices: unify puppet roles/profiles.
Fri, May 31, 4:57 PM · cloud-services-team (Kanban)
aborrero closed T224354: backport pdns-server version 3.x to Stretch as Resolved.

This should be fixed AFAIK. Closing task now. Hopefully, we will upgrade openstack soon and we will drop this hack.

Fri, May 31, 4:45 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
aborrero closed T224354: backport pdns-server version 3.x to Stretch, a subtask of T221769: Upgrade cloudservices1003/1004 to stretch/mitaka, as Resolved.
Fri, May 31, 4:45 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
aborrero closed T219362: Toolforge: cleanup unused/old puppet code as Resolved.

Most of the unused toolforge code has been cleaned up already. We still have plenty of code in the toollabs namespace, but that code is in use, mostly by k8s and other pieces that weren't upgraded as part of the GE->SGE migration.

Fri, May 31, 4:20 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero closed T219362: Toolforge: cleanup unused/old puppet code, a subtask of T208843: WMCS - Remove unused legacy code, as Resolved.
Fri, May 31, 4:20 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero added a comment to T224688: Outstanding icinga critical on cloudcontrol-dev hosts.

Sure! Adding some context for @JHedden:

Fri, May 31, 9:28 AM · cloud-services-team (Kanban), Cloud-Services

Thu, May 30

aborrero lowered the priority of T224558: sssd: support for Debian Jessie from High to Normal.

Current status as of this comment: the only Toolforge Debian Jessie server running sssd is tools-worker-1029, which was created explicitly for testing it.

Thu, May 30, 3:30 PM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge
aborrero closed T224651: Manual update - stale file handle as Resolved.

I had to cordon/drain all the k8s worker nodes that were running sssd, because your pod was scheduled in them :-P

Thu, May 30, 12:26 PM · cloud-services-team (Kanban), Toolforge, Tool-inteGraality
aborrero added a comment to T224651: Manual update - stale file handle.

More tests:

Thu, May 30, 12:19 PM · cloud-services-team (Kanban), Toolforge, Tool-inteGraality
aborrero claimed T224651: Manual update - stale file handle.
Thu, May 30, 12:16 PM · cloud-services-team (Kanban), Toolforge, Tool-inteGraality
aborrero added a comment to T224651: Manual update - stale file handle.

However, I just did:

Thu, May 30, 12:16 PM · cloud-services-team (Kanban), Toolforge, Tool-inteGraality
aborrero added a comment to T224651: Manual update - stale file handle.

This pod is running on a node I'm working on right now for T224558: sssd: support for Debian Jessie.

Thu, May 30, 12:15 PM · cloud-services-team (Kanban), Toolforge, Tool-inteGraality
aborrero added a comment to T224558: sssd: support for Debian Jessie.

The pam issue may require a pam-auth-update --force --package run on the server, because there are stale entries in the pam config pointing to pam_ldap.so, which we no longer use after switching to sssd.
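
A minimal sketch of how such stale entries could be spotted before and after the pam-auth-update run. It assumes the PAM configuration has already been read into a string. The function name and the sample contents are hypothetical and not taken from any real server.

```python
def find_stale_pam_entries(config_text, module="pam_ldap.so"):
    """Return (line_number, line) pairs for active lines that reference `module`."""
    stale = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comments; only active PAM rules matter.
        if not line or line.startswith("#"):
            continue
        # Match the module as a whole token, not as a substring.
        if module in line.split():
            stale.append((number, line))
    return stale


# Hypothetical sample contents, for illustration only.
sample = """\
# common-auth (illustrative)
auth    [success=2 default=ignore]  pam_unix.so nullok_secure
auth    [success=1 default=ignore]  pam_ldap.so use_first_pass
auth    requisite                   pam_deny.so
# auth  sufficient                  pam_ldap.so
"""

for number, line in find_stale_pam_entries(sample):
    print(f"line {number}: {line}")
```

Commented-out rules are ignored on purpose, since only active references to pam_ldap.so would affect authentication.
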

Thu, May 30, 12:01 PM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge
aborrero added a comment to T224558: sssd: support for Debian Jessie.

Issues continue despite the patch https://gerrit.wikimedia.org/r/513091.

Thu, May 30, 11:38 AM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge
aborrero added a comment to T224558: sssd: support for Debian Jessie.
profile::ldap::client::labs::client_stack: sssd
sudo_flavor: sudo
Thu, May 30, 10:29 AM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge

Wed, May 29

aborrero added a comment to T224528: rack/setup codfw: cloudbackup2001.codfw.wmnet and cloudbackup2002.codfw.wmnet.

On IRC:

Wed, May 29, 5:08 PM · Cloud-Services, Operations, ops-codfw
aborrero renamed T224528: rack/setup codfw: cloudbackup2001.codfw.wmnet and cloudbackup2002.codfw.wmnet from rack/setup codfw: cloudbackup2001.wikimedia.org and cloudbackup2002.wikimedia.org to rack/setup codfw: cloudbackup2001.codfw.wmnet and cloudbackup2002.codfw.wmnet.
Wed, May 29, 4:47 PM · Cloud-Services, Operations, ops-codfw
aborrero added a comment to T224528: rack/setup codfw: cloudbackup2001.codfw.wmnet and cloudbackup2002.codfw.wmnet.

On second thoughts, we would like to change the public VLAN for a private one, from .wikimedia.org to .wmnet.

Wed, May 29, 4:32 PM · Cloud-Services, Operations, ops-codfw
aborrero updated subscribers of T224528: rack/setup codfw: cloudbackup2001.codfw.wmnet and cloudbackup2002.codfw.wmnet.
Wed, May 29, 3:22 PM · Cloud-Services, Operations, ops-codfw
aborrero claimed T224558: sssd: support for Debian Jessie.
Wed, May 29, 11:11 AM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge
aborrero created T224558: sssd: support for Debian Jessie.
Wed, May 29, 11:11 AM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge
aborrero closed T223067: sudo is still broken on certain toolforge hosts as Resolved.

This should be working now:

Wed, May 29, 11:09 AM · cloud-services-team (Kanban), Toolforge
aborrero closed T223067: sudo is still broken on certain toolforge hosts, a subtask of T221225: sssd integration needs to be updated to include sudo config from LDAP support, as Resolved.
Wed, May 29, 11:09 AM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge
aborrero closed T221225: sssd integration needs to be updated to include sudo config from LDAP support as Resolved.

Well. Closing this task now, since we do have sudo support now. Will open other tasks for the other issues we have.

Wed, May 29, 11:07 AM · cloud-services-team (Kanban), Cloud-VPS, LDAP, Toolforge
aborrero closed T221225: sssd integration needs to be updated to include sudo config from LDAP support, a subtask of T217280: LDAP server running out of memory frequently and disrupting Cloud VPS clients, as Resolved.
Wed, May 29, 11:07 AM · cloud-services-team (Kanban), Patch-For-Review, Operations, Cloud-VPS, LDAP, Toolforge
aborrero renamed T224528: rack/setup codfw: cloudbackup2001.codfw.wmnet and cloudbackup2002.codfw.wmnet from rack/setup codfw: cloudstore (backups) to rack/setup codfw: cloudbackup2001.wikimedia.org and cloudbackup2002.wikimedia.org.
Wed, May 29, 9:23 AM · Cloud-Services, Operations, ops-codfw
aborrero reassigned T224528: rack/setup codfw: cloudbackup2001.codfw.wmnet and cloudbackup2002.codfw.wmnet from Andrew to Papaul.

Rack proposal: anywhere in codfw, each server in a different rack, each rack with 10G support.
Wiring configuration: a single 10G connection per server if possible. The mgmt interface connected as in any other server (standard).
Name proposal: Per T210666#4916941, we agreed on calling them cloudbackup. They can be called cloudbackup2001.wikimedia.org and cloudbackup2002.wikimedia.org. I added the entry to https://wikitech.wikimedia.org/wiki/Infrastructure_naming_conventions#Servers

Wed, May 29, 9:17 AM · Cloud-Services, Operations, ops-codfw

Tue, May 28

aborrero assigned T224272: Request increased quota for sso Cloud VPS project to bd808.

+1 approved in the WMCS team meeting.

Tue, May 28, 4:40 PM · cloud-services-team (Kanban), Cloud-VPS (Quota-requests)
aborrero assigned T224057: Request increased quota for Automation Framework Cloud VPS project to bd808.
Tue, May 28, 4:40 PM · Cloud-VPS (Quota-requests)
aborrero added a comment to T224057: Request increased quota for Automation Framework Cloud VPS project.

+1 approved in the WMCS team meeting.

Tue, May 28, 4:40 PM · Cloud-VPS (Quota-requests)

Mon, May 27

aborrero added a comment to T210715: cloudvps: PDNS 3.x vs 4.x.

Related: T224354: backport pdns-server version 3.x to Stretch. Apparently we need pdns 3.x for OpenStack Mitaka.

Mon, May 27, 3:08 PM · cloud-services-team (Kanban)
aborrero added a comment to T224424: cloudservices1003: gateway timeout error.

The logs contain plenty of messages like these:

Mon, May 27, 12:33 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero added a comment to T224424: cloudservices1003: gateway timeout error.

Searching the logs for the concrete REQ, I see this:

Mon, May 27, 12:31 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero triaged T224424: cloudservices1003: gateway timeout error as High priority.
Mon, May 27, 12:28 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero created T224424: cloudservices1003: gateway timeout error.
Mon, May 27, 12:27 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero closed T224345: stretch build of prometheus-openstack-exporter incompatible with our mitaka apt repo as Resolved.

Should be OK now.

Mon, May 27, 12:00 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero closed T224345: stretch build of prometheus-openstack-exporter incompatible with our mitaka apt repo, a subtask of T215605: cloudvps: missing packages in stretch for cloudcontrol servers, as Resolved.
Mon, May 27, 12:00 PM · Patch-For-Review, cloud-services-team (Kanban)
aborrero added a comment to T216753: Document ToolsDB failover process for Clouddb Admins.

For now, the doc is pretty good. I got help from the DBAs, which is also a reminder that we should be doing regular failover testing so that things aren't quite so hard to do.

Mon, May 27, 9:37 AM · Data-Services, cloud-services-team (Kanban)
aborrero added a comment to T221770: Upgrade cloucontrol1003/1004 to stretch/mitaka.

cloudcontrol1003 has been flapping its systemd degraded alert since 2019-05-25 21:46. The unit that fails is:

● designate_floating_ip_ptr_records_updater.service               loaded failed failed    Designate Floating IP PTR records updater
Mon, May 27, 9:05 AM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)

May 24 2019

aborrero added a comment to T223902: cloudcontrol: decide on FQDN for service endpoints.

Honestly, wikimediacloudservices.org seems overly long. Just reading that makes me feel lazy :-( If possible, I would use it only for redirecting www. to wikitech or whatever our landing page is :-P

May 24 2019, 4:59 PM · Traffic, Operations, Cloud-VPS, cloud-services-team (Kanban)
aborrero triaged T224272: Request increased quota for sso Cloud VPS project as Normal priority.
May 24 2019, 9:56 AM · cloud-services-team (Kanban), Cloud-VPS (Quota-requests)
aborrero added a comment to T224272: Request increased quota for sso Cloud VPS project.

How many floating IPs are you requesting? I guess it's just 1?

May 24 2019, 9:56 AM · cloud-services-team (Kanban), Cloud-VPS (Quota-requests)
aborrero triaged T224273: Toolforge: develop new k8s cluster in toolsbeta as Normal priority.
May 24 2019, 9:54 AM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero created T224273: Toolforge: develop new k8s cluster in toolsbeta.
May 24 2019, 9:54 AM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero added a comment to T169287: etcd config depends on puppet certs, but puppet doesn't know.

But puppet doesn't run the agent like normal when it modifies a cert. It waits for signature, etc.?
However, now that you mention it, I think I was thinking about it wrong. Why not just subscribe to: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/etcd/manifests/ssl.pp#32 ?

Certs for puppet may be core to puppet and affect evaluation, but a File resource that happens to use the puppet cert as the source is something that can be watched and acted on when it changes (the puppet cert itself may be as well...but it somehow makes my head hurt).

May 24 2019, 9:38 AM · Kubernetes, Cloud-Services, Operations
aborrero added a comment to T223902: cloudcontrol: decide on FQDN for service endpoints.

Do these belong in wikimedia.org at all? It seems this has already been discussed, but I guess I lack some context.

May 24 2019, 9:09 AM · Traffic, Operations, Cloud-VPS, cloud-services-team (Kanban)

May 23 2019

aborrero added a comment to T215553: Figure out cert management for Toolforge kubernetes and make it clear in documents, etc. for the upgrade.

I've been testing generating puppet certs with SANs, with no issues so far. It's just a hiera override and re-generating the certs. Steps, for the record:

May 23 2019, 3:36 PM · Toolforge, cloud-services-team (Kanban), Kubernetes
aborrero added a comment to T169287: etcd config depends on puppet certs, but puppet doesn't know.

I wonder if puppet subscribe => and/or notify => mechanisms would work in this case.

May 23 2019, 2:48 PM · Kubernetes, Cloud-Services, Operations