
Yesterday

Andrew updated the task description for T242455: Investigate options to improve CloudVPS backend database architecture .
Fri, Jul 10, 10:02 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)

Thu, Jul 9

Andrew assigned T257546: Request creation of "wmde-templates-alpha" VPS project to bd808.

approved -- someone will set this up in the next few days.

Thu, Jul 9, 3:48 PM · cloud-services-team (Kanban), WMDE-Templates-FocusArea, WMDE-Technical-Wishes-Team, Cloud-VPS (Project-requests)
Andrew assigned T257270: Request creation of mailman VPS project to bd808.

approved! Bryan will take care of this shortly

Thu, Jul 9, 3:47 PM · cloud-services-team (Kanban), Operations, Wikimedia-Mailing-lists, Cloud-VPS (Project-requests)

Wed, Jul 8

Andrew committed rLPRI02ad0bf3398a: Add more passwords for profile::openstack::<region>::galera::prometheus_db_pass (authored by Andrew).
Add more passwords for profile::openstack::<region>::galera::prometheus_db_pass
Wed, Jul 8, 8:50 PM
Andrew committed rLPRIaa10ac9f62b3: Add dummy passwords for profile::openstack::<region>::galera::prometheus_db_pass (authored by Andrew).
Add dummy passwords for profile::openstack::<region>::galera::prometheus_db_pass
Wed, Jul 8, 8:42 PM
Andrew added a comment to T257336: Request increased quota for wikidata-query Cloud VPS project.

This flavor was broken mostly because it asked for way too many cores (as well as slightly too much RAM). I've adjusted it as needed and things should be working better now.

Wed, Jul 8, 6:48 PM · cloud-services-team (Kanban), Wikidata, Wikidata-Query-Service, Discovery, Cloud-VPS (Quota-requests)
Andrew reopened T253836: Update quotas for MWoffliner VPS as "Open".

Due to a mistake in flavor config (my mistake), these VMs are running on hardware that was allocated for a different team. Is it possible for you to delete and recreate them one more time? In theory that should get them moved to the proper hardware.

Wed, Jul 8, 6:01 PM · Cloud-VPS (Quota-requests), affects-Kiwix-and-openZIM
Andrew added a comment to T257336: Request increased quota for wikidata-query Cloud VPS project.

I set aggregate_instance_extra_specs:wdqs='true' on that flavor; if you try recreating it, it should end up on the right hardware.

Wed, Jul 8, 5:29 PM · cloud-services-team (Kanban), Wikidata, Wikidata-Query-Service, Discovery, Cloud-VPS (Quota-requests)
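
As a rough illustration of why the flavor property in the comment above steers VMs to particular hardware: a scheduler filter can compare flavor extra specs scoped with `aggregate_instance_extra_specs:` against the metadata of each host aggregate. The sketch below is a simplified, equality-only model of that idea, not nova's actual filter code; the function name and the aggregate metadata dictionaries are hypothetical.

```python
# Simplified model of matching scoped flavor extra specs against
# host-aggregate metadata. Equality only; an assumption for illustration.
SCOPE = "aggregate_instance_extra_specs:"


def host_matches_flavor(flavor_extra_specs, aggregate_metadata):
    """Return True if every scoped flavor spec equals the aggregate's value."""
    for key, wanted in flavor_extra_specs.items():
        if not key.startswith(SCOPE):
            continue  # specs without this scope are not considered here
        name = key[len(SCOPE):]
        if aggregate_metadata.get(name) != wanted:
            return False
    return True


# The flavor property from the comment, plus two hypothetical aggregates.
flavor = {"aggregate_instance_extra_specs:wdqs": "true"}
wdqs_aggregate = {"wdqs": "true"}
general_aggregate = {}

print(host_matches_flavor(flavor, wdqs_aggregate))     # hosts in this aggregate qualify
print(host_matches_flavor(flavor, general_aggregate))  # hosts in this aggregate do not
```

Under this model, a VM built from the flavor can only land on hosts whose aggregate carries the matching `wdqs` metadata, which is why recreating the VM is expected to move it.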
Andrew closed T140391: Allow users to edit proxies as Resolved.
Wed, Jul 8, 3:40 AM · Patch-For-Review, cloud-services-team (Kanban), Horizon

Tue, Jul 7

Andrew created T257366: decom cloudvirt1015.
Tue, Jul 7, 8:33 PM · decommission-hardware, cloud-services-team (Kanban), Operations, ops-eqiad, DC-Ops, User-Zppix
Andrew added a comment to T237889: Install php-ldap on all MW appservers.

*bump*

Tue, Jul 7, 5:12 PM · serviceops, Operations, wikitech.wikimedia.org
Andrew added a comment to T208416: Check whether wikidata-dev project requires NFS or not.

Sounds good! lmk when everyone is ready.

Tue, Jul 7, 4:56 PM · User-Addshore, Patch-For-Review, Wikidata-Campsite, cloud-services-team (Kanban), wikidata-tech-focus, Wikidata, Cloud-VPS
Andrew closed T251294: Upgrade cloud-vps control plane to Debian Buster as Resolved.

I'm closing this for now. Updating the cloudweb/labweb boxes will be a lot easier after wikitech is moved elsewhere.

Tue, Jul 7, 4:55 PM · cloud-services-team (Kanban)
Andrew closed T257231: Galera: change managed service from mysql to mariadb, a subtask of T242455: Investigate options to improve CloudVPS backend database architecture , as Resolved.
Tue, Jul 7, 2:48 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
Andrew closed T257231: Galera: change managed service from mysql to mariadb as Resolved.
Tue, Jul 7, 2:48 PM · Cloud-VPS, cloud-services-team (Kanban)

Mon, Jul 6

Andrew added a comment to T256276: Add support for managing wmcloud.org domain hosts in dynamic-proxy's domain proxy.

For now I've implemented a simple solution: users can now create new proxies under .wmflabs.org or under .wmcloud.org. The default suggested domain is wmcloud.org.

Mon, Jul 6, 6:14 PM · Patch-For-Review, Horizon, cloud-services-team (Kanban)
Andrew created T257231: Galera: change managed service from mysql to mariadb.
Mon, Jul 6, 5:22 PM · Cloud-VPS, cloud-services-team (Kanban)
Andrew committed rLPRI9db53683a51a: Correct profile::openstack::codfw1dev::horizon::proxy_zone_passwords key name (authored by Andrew).
Correct profile::openstack::codfw1dev::horizon::proxy_zone_passwords key name
Mon, Jul 6, 1:31 AM
Andrew committed rLPRIb7e49f3ddb7d: Added fake passwords for proxy_zone_passwords (authored by Andrew).
Added fake passwords for proxy_zone_passwords
Mon, Jul 6, 1:10 AM

Sun, Jul 5

Andrew added a comment to T240979: Unable to create Web Proxy in the "phragile" Cloud VPS project (using Horizon).

Hm, sorry about those bug-related pings, I must've pasted the wrong bug number

Sun, Jul 5, 10:33 PM · Patch-For-Review, cloud-services-team (Kanban), Cloud-VPS

Wed, Jul 1

Andrew closed T256736: create a vps for Gryllida for wiki related development work as Declined.

@Gryllida We had our weekly review meeting and are going to close this request as 'denied' for now. Please do not take this as discouragement, though -- we would welcome a new request with a clearer scope. Specifically if it involves collaboration over a particular or extension and includes multiple named maintainers interested in the same work. In addition to the links that Bryan included here this page might also provide useful context: https://wikitech.wikimedia.org/wiki/Help:At_a_glance:_Cloud_VPS_and_Toolforge

Wed, Jul 1, 3:56 PM · Cloud-VPS (Project-requests)

Thu, Jun 25

Andrew reassigned T253267: Configure the soft anti-affinity (and presumably the soft affinity) server policy from Andrew to Bstorm.

@Bstorm try now? I did a quick test and it seems to be working (at least with 3 VMs it put them on three different hosts.)

Thu, Jun 25, 4:23 PM · Horizon, cloud-services-team (Kanban)
Andrew added a comment to T253267: Configure the soft anti-affinity (and presumably the soft affinity) server policy.

api versions are weird in nova... 2.1 is the version but there are later 'microversions' that can be requested specifically via http headers. We support up to microversion 2.65. It looks to me like horizon is properly requesting a higher microversion, so we shouldn't have API issues with using this feature.

Thu, Jun 25, 4:12 PM · Horizon, cloud-services-team (Kanban)
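
To make the microversion mechanism in the comment above concrete, here is a minimal sketch of the HTTP headers a client might send to request a specific compute microversion. The header names are the ones used by the OpenStack compute API; the helper function and the hard-coded ceiling of 2.65 (taken from the comment) are assumptions for illustration only.

```python
# Highest microversion this deployment supports, per the comment above.
MAX_SUPPORTED = (2, 65)


def nova_microversion_headers(microversion):
    """Build request headers asking the compute API for one microversion."""
    major, minor = (int(part) for part in microversion.split("."))
    if major != 2 or minor < 1 or (major, minor) > MAX_SUPPORTED:
        raise ValueError(f"unsupported microversion: {microversion}")
    return {
        # Generic form, understood by newer API versions.
        "OpenStack-API-Version": f"compute {microversion}",
        # Older nova-specific form, kept for compatibility.
        "X-OpenStack-Nova-API-Version": microversion,
    }


print(nova_microversion_headers("2.65"))
```

A request without either header is handled at the base version 2.1, so features introduced in later microversions are only available when a client opts in like this.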
Andrew closed T254496: clean up old cloud-ns0.wikimedia.org and cloud-ns1.wikimedia.org ns records in designate zones, a subtask of T243766: Cloud DNS: proposal for new DNS service names, as Resolved.
Thu, Jun 25, 2:43 PM · cloud-services-team (Kanban)
Andrew closed T254496: clean up old cloud-ns0.wikimedia.org and cloud-ns1.wikimedia.org ns records in designate zones as Resolved.
Thu, Jun 25, 2:43 PM · cloud-services-team (Kanban)
Andrew closed T255764: Request creation of transferpy-test VPS project as Resolved.

I've made two VMs:

Thu, Jun 25, 2:40 PM · Cloud-VPS (Project-requests)
Andrew added a comment to T254496: clean up old cloud-ns0.wikimedia.org and cloud-ns1.wikimedia.org ns records in designate zones.

Designate doesn't refresh ns records until there is other activity on the zone. So the actual 'dig' results will update for domains gradually as they're nudged. I created and deleted a dummy under wmcloud.org and now I see:

Thu, Jun 25, 1:58 PM · cloud-services-team (Kanban)

Wed, Jun 24

Andrew added a comment to T256276: Add support for managing wmcloud.org domain hosts in dynamic-proxy's domain proxy.

For me, it is about making it easier for folks to convert over. Not all backends are full featured Apache2 or Nginx instances, so doing the redirect on the backend side is not necessarily trivial. It is reasonably easy on the urlproxy side as soon as we have an indicator to tell us when to do it.

Wed, Jun 24, 9:33 PM · Patch-For-Review, Horizon, cloud-services-team (Kanban)
Andrew added a comment to T255764: Request creation of transferpy-test VPS project.

This is approved. Since this is a short-term project, I'd like to put your storage on our (currently under-used) ceph cluster where there's lots of space to spare. Things will be slower there than local storage but if you're doing performance testing it's probably good to get you used to the triple-redundant future.

Wed, Jun 24, 9:16 PM · Cloud-VPS (Project-requests)
Andrew added a comment to T140391: Allow users to edit proxies.

This is a reasonable request from a usability standpoint. I haven't been regarding it as a priority because I'm /pretty sure/ that deleting the old proxy and creating a new one requires only one more click -- are there more compelling reasons for this other than avoiding that click?

Wed, Jun 24, 8:45 PM · Patch-For-Review, cloud-services-team (Kanban), Horizon
Andrew added a comment to T256276: Add support for managing wmcloud.org domain hosts in dynamic-proxy's domain proxy.

Also -- is the purpose of supporting redirection so users only have to support one domain in their vhost? Or are there other reasons why it's important to have a redirect rather than just two different proxies to the same backend?

Wed, Jun 24, 8:14 PM · Patch-For-Review, Horizon, cloud-services-team (Kanban)
Andrew added a comment to T256276: Add support for managing wmcloud.org domain hosts in dynamic-proxy's domain proxy.

From a UI perspective, here are the steps I'm imagining:

Wed, Jun 24, 8:13 PM · Patch-For-Review, Horizon, cloud-services-team (Kanban)
Andrew created T256283: Practice Galera disaster recovert.
Wed, Jun 24, 4:40 PM · Cloud-VPS, cloud-services-team (Kanban)
Andrew updated the task description for T242455: Investigate options to improve CloudVPS backend database architecture .
Wed, Jun 24, 4:38 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
Andrew added a comment to T255730: Request increased quota for petscan Toolforge tool for database access.

Approved during weekly wmcs meeting

Wed, Jun 24, 3:38 PM · Data-Services (Quota-requests)
Andrew claimed T255764: Request creation of transferpy-test VPS project.
Wed, Jun 24, 3:37 PM · Cloud-VPS (Project-requests)
Andrew added a parent task for T256276: Add support for managing wmcloud.org domain hosts in dynamic-proxy's domain proxy: T256206: Switch codesearch to codesearch.wmcloud.org.
Wed, Jun 24, 3:21 PM · Patch-For-Review, Horizon, cloud-services-team (Kanban)
Andrew added a subtask for T256206: Switch codesearch to codesearch.wmcloud.org: T256276: Add support for managing wmcloud.org domain hosts in dynamic-proxy's domain proxy.
Wed, Jun 24, 3:21 PM · cloud-services-team (Kanban), VPS-project-codesearch
Andrew created T256276: Add support for managing wmcloud.org domain hosts in dynamic-proxy's domain proxy.
Wed, Jun 24, 3:20 PM · Patch-For-Review, Horizon, cloud-services-team (Kanban)
Andrew moved T255950: Should Striker's database be hosted on M5 or cloudcontrol/galera? from Inbox to Needs discussion on the cloud-services-team (Kanban) board.
Wed, Jun 24, 2:51 PM · Cloud-VPS, cloud-services-team (Kanban)
Andrew added a comment to T255950: Should Striker's database be hosted on M5 or cloudcontrol/galera?.

Thanks for checking, @jcrespo; answers in line:

Wed, Jun 24, 2:50 PM · Cloud-VPS, cloud-services-team (Kanban)
aborrero awarded T242455: Investigate options to improve CloudVPS backend database architecture a Party Time token.
Wed, Jun 24, 9:17 AM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)

Tue, Jun 23

Andrew updated the task description for T242455: Investigate options to improve CloudVPS backend database architecture .
Tue, Jun 23, 11:15 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
Andrew added a comment to T242455: Investigate options to improve CloudVPS backend database architecture .

In order to get something resembling a fresh start, I'm trying to copy data over to galera but creating new tables; that way we should avoid encoding issues and other cruft left over from having run OpenStack since the stone age.

Tue, Jun 23, 10:25 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
Andrew updated the task description for T242455: Investigate options to improve CloudVPS backend database architecture .
Tue, Jun 23, 10:22 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
Andrew added a comment to T255046: Request creation of "chat" VPS project (for a Mattermost instance).

As always with projects that will attract free-form outside users, we have several concerns/caveats:

  • Please be aware that maintenance and support of these new services will not be in any way the responsibility of the cloud-services team or WMF staff. If things break and your users complain to us we will send them to you.

I know and I accept the responsibility for now. Hope others come and help :)

Tue, Jun 23, 8:47 PM · Cloud-VPS (Project-requests)
Andrew added a comment to T243414: relocate/reimage cloudvirt1013 with 10G interfaces.

@jclark, typically we need to drain the workload from a host before we can swap it. It was empty when I opened this task but no longer, so this is blocked until we have a good way of draining it (and a good place to move the workload.)

Tue, Jun 23, 8:11 PM · cloud-services-team (Kanban), ops-eqiad, DC-Ops, Operations, Epic
Andrew closed T255950: Should Striker's database be hosted on M5 or cloudcontrol/galera?, a subtask of T242455: Investigate options to improve CloudVPS backend database architecture , as Resolved.
Tue, Jun 23, 8:05 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
Andrew closed T255950: Should Striker's database be hosted on M5 or cloudcontrol/galera? as Resolved.

Sounds good. I don't know if there's a long-term plan for miscellaneous DB hosting but for now I'll leave it be and we'll see if the DBAs chase us away later.

Tue, Jun 23, 8:05 PM · Cloud-VPS, cloud-services-team (Kanban)

Mon, Jun 22

Andrew committed rLPRI1627faf5f8a9: Dummy passwords for galera backup (authored by Andrew).
Dummy passwords for galera backup
Mon, Jun 22, 4:59 PM

Sun, Jun 21

Andrew closed T255806: Add haproxy ACLs for mysql access, a subtask of T242455: Investigate options to improve CloudVPS backend database architecture , as Invalid.
Sun, Jun 21, 6:30 PM · Patch-For-Review, Cloud-VPS, cloud-services-team (Kanban)
Andrew closed T255806: Add haproxy ACLs for mysql access as Invalid.

Thinking about this more, I don't think that haproxy can do anything that a firewall can't do -- it certainly doesn't know username/database for a mysql access so all it can really do is enforce based on originating IP, and we already have a firewall to handle that.

Sun, Jun 21, 6:30 PM · Cloud-VPS, cloud-services-team (Kanban)
Andrew created T255950: Should Striker's database be hosted on M5 or cloudcontrol/galera?.
Sun, Jun 21, 5:53 PM · Cloud-VPS, cloud-services-team (Kanban)

Thu, Jun 18

Andrew created T255806: Add haproxy ACLs for mysql access.
Thu, Jun 18, 5:59 PM · Cloud-VPS, cloud-services-team (Kanban)
Andrew created T255787: Reconcile and/or understand differences between cloud-vps and prod hiera lookups.
Thu, Jun 18, 4:01 PM · Cloud-VPS, User-jbond, cloud-services-team (Kanban)
Andrew added a comment to T254786: Updating Scap on beta cluster hosts with cumin fails.

The short story here is: clustershell (and, hence, cumin) can't cope with different levels of zero-padding in hostnames. No short-term fix is coming for this, so I suggest rebuilding one or both of those hosts with a consistent naming scheme.

Thu, Jun 18, 3:21 PM · Beta-Cluster-Infrastructure
Andrew changed the status of T255780: cumin pattern-match fail with oddly named groups of hosts from Open to Stalled.
Thu, Jun 18, 3:20 PM · cloud-services-team (Kanban), SRE-tools, Beta-Cluster-Infrastructure
Andrew changed the status of T255780: cumin pattern-match fail with oddly named groups of hosts, a subtask of T254786: Updating Scap on beta cluster hosts with cumin fails, from Open to Stalled.
Thu, Jun 18, 3:20 PM · Beta-Cluster-Infrastructure
Andrew added a comment to T255780: cumin pattern-match fail with oddly named groups of hosts.

This is upstream issue https://github.com/cea-hpc/clustershell/issues/293

Thu, Jun 18, 3:19 PM · cloud-services-team (Kanban), SRE-tools, Beta-Cluster-Infrastructure
Andrew added a comment to T255780: cumin pattern-match fail with oddly named groups of hosts.

crap, it looks like this bug is in clustershell

Thu, Jun 18, 2:50 PM · cloud-services-team (Kanban), SRE-tools, Beta-Cluster-Infrastructure
Andrew created T255780: cumin pattern-match fail with oddly named groups of hosts.
Thu, Jun 18, 2:45 PM · cloud-services-team (Kanban), SRE-tools, Beta-Cluster-Infrastructure
Andrew added a comment to T254786: Updating Scap on beta cluster hosts with cumin fails.

This looks like it's something interesting! The project contains two logstash hosts: deployment-logstash2 and deployment-logstash03; cumin is generalizing that to deployment-logstash[02-03] which is obviously not right.

Thu, Jun 18, 2:42 PM · Beta-Cluster-Infrastructure
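
The failure described in the comment above can be reproduced with a toy model of hostname folding. This is not clustershell's algorithm, only a sketch under the assumption that a folder applies a single zero-padding width to the whole numeric range; `naive_fold` and `expand` are hypothetical helpers, and `expand` assumes a contiguous range.

```python
import re


def naive_fold(hostnames):
    """Fold names like prefixN into prefix[lo-hi] using one padding width."""
    parsed = [re.fullmatch(r"(.*?)(\d+)", name) for name in hostnames]
    if any(match is None for match in parsed):
        raise ValueError("every hostname must end in digits")
    prefixes = {match.group(1) for match in parsed}
    if len(prefixes) != 1:
        raise ValueError("hostnames must share one prefix")
    prefix = prefixes.pop()
    digits = [match.group(2) for match in parsed]
    width = max(len(d) for d in digits)  # the widest padding wins for all hosts
    numbers = sorted(int(d) for d in digits)
    return f"{prefix}[{numbers[0]:0{width}d}-{numbers[-1]:0{width}d}]"


def expand(folded):
    """Expand prefix[lo-hi] back into hostnames, padded to the width of lo."""
    prefix, lo, hi = re.fullmatch(r"(.*)\[(\d+)-(\d+)\]", folded).groups()
    width = len(lo)
    return [f"{prefix}{n:0{width}d}" for n in range(int(lo), int(hi) + 1)]


hosts = ["deployment-logstash2", "deployment-logstash03"]
folded = naive_fold(hosts)
print(folded)
print(expand(folded))
```

Folding the two hosts gives `deployment-logstash[02-03]`, and expanding that pattern yields `deployment-logstash02`, a host that does not exist, while the real `deployment-logstash2` is lost. Renaming the hosts to a consistent padding removes the ambiguity.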
Andrew added a comment to T220853: VMs on cloudvirt1015 crashing - bad mainboard/memory.

@wiki_willy honestly at this point the best outcome is probably getting 'store credit' towards future purchases. Having a replacement server (exactly like the old one) would be great but 1) I agree with @MoritzMuehlenhoff that having one new, oddball server in the middle of our cluster sounds bad, and 2) Any new servers that we're buying for this workload are 'thinvirts' which have radically different specs.

Thu, Jun 18, 2:30 PM · cloud-services-team (Kanban), Operations, ops-eqiad, DC-Ops, User-Zppix

Wed, Jun 17

Andrew added a comment to T255670: horizon: enable neutron port management.

@aborrero, I've enabled a few things on labtesthorizon.wikimedia.org:

Wed, Jun 17, 6:13 PM · Patch-For-Review, cloud-services-team (Kanban), Horizon
Andrew added a comment to T255046: Request creation of "chat" VPS project (for a Mattermost instance).

Oh, one other question: why 'chat' and not 'mattermost' if it's going to host mattermost?

Wed, Jun 17, 5:17 PM · Cloud-VPS (Project-requests)
Andrew added a comment to T255046: Request creation of "chat" VPS project (for a Mattermost instance).

As always with projects that will attract free-form outside users, we have several concerns/caveats:

Wed, Jun 17, 4:25 PM · Cloud-VPS (Project-requests)
Andrew reopened T193964: Request creation of matrix VPS project, a subtask of T193961: Set up Matrix.org homeserver on the Wikimedia Cloud VPS, as Open.
Wed, Jun 17, 4:18 PM · Matrix, MediaWiki-Stakeholders-Group
Andrew reopened T193964: Request creation of matrix VPS project as "Open".

Hello @Tgr. Discussion at https://phabricator.wikimedia.org/T255046#6212144 implies that this project is defunct; is that right? If so, I will close the project and reclaim resources.

Wed, Jun 17, 4:18 PM · Matrix, Cloud-VPS (Project-requests)
Andrew claimed T255046: Request creation of "chat" VPS project (for a Mattermost instance).
Wed, Jun 17, 3:46 PM · Cloud-VPS (Project-requests)
Andrew added a comment to T255670: horizon: enable neutron port management.

I'm pretty sure this is among the Horizon features that I marked out to avoid too many complex steps when launching VMs.

Wed, Jun 17, 2:58 PM · Patch-For-Review, cloud-services-team (Kanban), Horizon
Andrew claimed T253267: Configure the soft anti-affinity (and presumably the soft affinity) server policy.
Wed, Jun 17, 2:56 PM · Horizon, cloud-services-team (Kanban)

Tue, Jun 16

Andrew committed rLPRI9e1ccc045cf2: move fake galera passwords (authored by Andrew).
move fake galera passwords
Tue, Jun 16, 12:42 PM
Andrew committed rLPRIc43d47e98e7a: Dummy passwords for galera monitoring (authored by Andrew).
Dummy passwords for galera monitoring
Tue, Jun 16, 3:27 AM

Fri, Jun 12

Andrew closed T88450: Investigate / remove swap from labs instances as Resolved.

I did. And I just now checked a recent VM and it doesn't have a swap.

Fri, Jun 12, 2:02 PM · cloud-services-team (Kanban), Cloud-Services

Thu, Jun 11

Andrew added a comment to T220853: VMs on cloudvirt1015 crashing - bad mainboard/memory.

@Andrew - are you ok if I forward them the kernel dump from P10788? Thanks, Willy

Thu, Jun 11, 10:24 PM · cloud-services-team (Kanban), Operations, ops-eqiad, DC-Ops, User-Zppix
Andrew reopened T238222: Request creation of BishopFox VPS project as "Open".

@Mfrostbfox, @Danbishopfox, it looks to me like either this process is finished or never happened. Is it OK if I delete this project and clean things up?

Thu, Jun 11, 3:17 PM · Cloud-VPS (Project-requests)

Jun 10 2020

Andrew added a comment to T252762: tools/toolsbeta: improve acme-chief integration.

acme-chief is set up and working in toolsbeta now. I haven't actually consumed any of the certs or thought about what certs we need (right now the server is just making one for toolsbeta.wmflabs.org)

Jun 10 2020, 5:05 PM · Acme-chief, cloud-services-team (Kanban)
bd808 awarded T251558: multilevel domains in the 'maps' project don't use tls a Love token.
Jun 10 2020, 4:36 PM · cloud-services-team (Kanban), Cloud-Services
Andrew closed T254496: clean up old cloud-ns0.wikimedia.org and cloud-ns1.wikimedia.org ns records in designate zones as Resolved.

Backed up designate to /home/andrew/designate_backup-2020-06-10.sql

Jun 10 2020, 4:05 PM · cloud-services-team (Kanban)
Andrew added a comment to T254496: clean up old cloud-ns0.wikimedia.org and cloud-ns1.wikimedia.org ns records in designate zones.

No ill effects (and some good ones) in traffic. Going to do this everywhere.

Jun 10 2020, 4:03 PM · cloud-services-team (Kanban)
Andrew moved T254496: clean up old cloud-ns0.wikimedia.org and cloud-ns1.wikimedia.org ns records in designate zones from Inbox to Needs discussion on the cloud-services-team (Kanban) board.
Jun 10 2020, 2:46 PM · cloud-services-team (Kanban)
Andrew added a project to T255023: Switch project-proxy to an LE cert for *.wmflabs.org: cloud-services-team (Kanban).
Jun 10 2020, 2:34 PM · cloud-services-team (Kanban)
Andrew created T255023: Switch project-proxy to an LE cert for *.wmflabs.org.
Jun 10 2020, 2:33 PM · cloud-services-team (Kanban)
Andrew closed T251558: multilevel domains in the 'maps' project don't use tls as Resolved.

These domains are now handled by a maps-proxy-01 and maps-proxy-02, and they have proper LE certs via acme-chief.

Jun 10 2020, 2:32 PM · cloud-services-team (Kanban), Cloud-Services
Andrew closed T252721: cloud-vps solution for Let's Encrypt, a subtask of T161256: multi-component wmflabs.org subdomains doesn't work under simple wildcard TLS cert, as Resolved.
Jun 10 2020, 1:47 PM · cloud-services-team (Kanban), Operations, Traffic, Maps, Cloud-VPS, DNS
Andrew closed T252721: cloud-vps solution for Let's Encrypt, a subtask of T251558: multilevel domains in the 'maps' project don't use tls, as Resolved.
Jun 10 2020, 1:47 PM · cloud-services-team (Kanban), Cloud-Services
Andrew closed T252721: cloud-vps solution for Let's Encrypt as Resolved.

I set up acme-chief in project-proxy using Krenair's guide here:

Jun 10 2020, 1:47 PM · cloud-services-team (Kanban), Cloud-VPS
Andrew closed T252721: cloud-vps solution for Let's Encrypt, a subtask of T252199: Stop using letsencrypt::cert::integrated, as Resolved.
Jun 10 2020, 1:47 PM · cloud-services-team (Kanban), Mail
Andrew closed T252721: cloud-vps solution for Let's Encrypt, a subtask of T252734: Consider moving tools away from acme-chief, as Resolved.
Jun 10 2020, 1:47 PM · cloud-services-team (Kanban), Tools

Jun 9 2020

Andrew added a comment to T220853: VMs on cloudvirt1015 crashing - bad mainboard/memory.

Thanks @wiki_willy! I wouldn't love to decom that host, but if thinking about this is stealing DC-Ops's time away from racking our new hardware I definitely vote for the new stuff. There's no workload on 1015 now, so decom wouldn't make things any worse than they are now.

Jun 9 2020, 9:03 PM · cloud-services-team (Kanban), Operations, ops-eqiad, DC-Ops, User-Zppix
Andrew added a comment to T220853: VMs on cloudvirt1015 crashing - bad mainboard/memory.

@wiki_willy what should we do about this server? At this point going back to Dell feels like throwing good money after bad; they'll just put @Jclark-ctr on hold for half a day and then tell him to upgrade the firmware. Should we just unplug it and throw it in the garbage?

Jun 9 2020, 7:42 PM · cloud-services-team (Kanban), Operations, ops-eqiad, DC-Ops, User-Zppix
Andrew added a project to T254931: missing maps postgres passwords in clouddb-services: cloud-services-team (Kanban).
Jun 9 2020, 7:18 PM · Data-Services, Cloud-VPS, cloud-services-team (Kanban)
Andrew updated the task description for T254931: missing maps postgres passwords in clouddb-services.
Jun 9 2020, 7:08 PM · Data-Services, Cloud-VPS, cloud-services-team (Kanban)
Andrew created T254931: missing maps postgres passwords in clouddb-services.
Jun 9 2020, 7:07 PM · Data-Services, Cloud-VPS, cloud-services-team (Kanban)
Andrew closed T253780: Upgrade cloudservices nodes to Debian Buster, a subtask of T251294: Upgrade cloud-vps control plane to Debian Buster, as Resolved.
Jun 9 2020, 5:53 PM · cloud-services-team (Kanban)
Andrew closed T253780: Upgrade cloudservices nodes to Debian Buster as Resolved.
Jun 9 2020, 5:53 PM · Patch-For-Review, cloud-services-team (Kanban)
Andrew updated the task description for T251294: Upgrade cloud-vps control plane to Debian Buster.
Jun 9 2020, 5:50 PM · cloud-services-team (Kanban)
Andrew updated the task description for T253780: Upgrade cloudservices nodes to Debian Buster.
Jun 9 2020, 3:10 PM · Patch-For-Review, cloud-services-team (Kanban)
Andrew added a comment to T253780: Upgrade cloudservices nodes to Debian Buster.

I'm pretty sure we can get an adequate dump with just

Jun 9 2020, 2:07 PM · Patch-For-Review, cloud-services-team (Kanban)
Andrew added a comment to T254786: Updating Scap on beta cluster hosts with cumin fails.

@Volans is correct that I updated the cumin key via scp. I did that because puppet (which typically would have updated cumin) is broken on that instance:

Jun 9 2020, 4:37 AM · Beta-Cluster-Infrastructure