Andrew (Andrew Bogott)
User

User Details

User Since
Nov 2 2014, 11:35 PM (210 w, 22 h)
Availability
Available
IRC Nick
andrewbogott
LDAP User
Unknown
MediaWiki User
Andrewbogott

Recent Activity

Fri, Nov 9

Andrew closed T206224: WMCS: Fewer transitory middle-of-the-night puppet alerts as Resolved.

This should be fixed, thanks to Giuseppe's changes.

Fri, Nov 9, 2:59 PM · cloud-services-team (Kanban), Patch-For-Review, Operations

Wed, Nov 7

Andrew added a comment to T208885: Puppet errors on test-twemproxy project.

It's unlikely that moving the clients would break that. It's /possible/ that moving the master itself broke things but I haven't seen that happen before.

Wed, Nov 7, 12:55 AM · Cloud-VPS

Tue, Nov 6

Andrew added a comment to T208883: nova: add fullstack monitoring to eqiad1r.

This seems to work now, but I haven't tested the Icinga integration yet.

Tue, Nov 6, 11:34 PM · Patch-For-Review, cloud-services-team (Kanban)
Andrew created T208883: nova: add fullstack monitoring to eqiad1r.
Tue, Nov 6, 7:49 PM · Patch-For-Review, cloud-services-team (Kanban)

Mon, Nov 5

Andrew triaged T208803: Migrate the Integration cloud project to eqiad1-r as Normal priority.
Mon, Nov 5, 9:39 PM · Release-Engineering-Team (Kanban), Epic, Cloud-Services
Andrew updated subscribers of T208101: Migrate deployment-prep to eqiad1.

In order to keep this ball rolling, I propose that we schedule this move for November 27th, 28th, and 29th. Any objections? We could try to cram it in the week before but then we'd run up against Thanksgiving if it takes longer than expected.

Mon, Nov 5, 9:25 PM · Beta-Cluster-Infrastructure, Epic, Cloud-Services
Andrew added a comment to T207321: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster.

If cloudvirtanalyticsXXXX is really too long, then let's go with cloudvirtdlXXXX.

Mon, Nov 5, 7:17 PM · Analytics-Kanban, netops, Operations, Analytics
aborrero awarded T208754: rename cloudvirt1019 and cloudvirt1020 to cloudvirtdb1001 and cloudvirtdb1002 a Love token.
Mon, Nov 5, 6:14 PM · cloud-services-team (Kanban)
Andrew created T208754: rename cloudvirt1019 and cloudvirt1020 to cloudvirtdb1001 and cloudvirtdb1002.
Mon, Nov 5, 5:53 PM · cloud-services-team (Kanban)
Andrew added a comment to T207321: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster.
Mon, Nov 5, 5:51 PM · Analytics-Kanban, netops, Operations, Analytics
Andrew added a comment to T207321: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster.

@Cmjohnson, please name these 5 boxes 'cloudvirtanalyticsXXXX' starting with cloudvirtanalytics1001. And rack them in row B with normal cloudvirt cabling. (If need be I can figure out in better detail what I mean by 'cloudvirt cabling' but @ayounsi is probably the best to ask about that.)

Mon, Nov 5, 5:50 PM · Analytics-Kanban, netops, Operations, Analytics
Andrew added a comment to T201247: Sporadic puppet failures.

*bump* we're still getting these and my team is increasingly bleary and disoriented by all the middle-of-the-night pages. The most recent one was last night:

Mon, Nov 5, 4:32 PM · cloud-services-team (Kanban), Operations
Andrew created T208733: Rename labvirt1017 to cloudvirt1017, move to eqiad1.
Mon, Nov 5, 3:59 PM · Patch-For-Review, cloud-services-team (Kanban)

Sun, Nov 4

Andrew closed T208244: ntp broken in new region as Resolved.

I built a couple of ntp servers in the cloudinfra project and we pointed all VMs at those servers.

Sun, Nov 4, 6:04 PM · cloud-services-team (Kanban), Patch-For-Review, Operations, netops, Cloud-VPS
Andrew closed T208244: ntp broken in new region, a subtask of T167293: Nova-network to Neutron migration, as Resolved.
Sun, Nov 4, 6:03 PM · Patch-For-Review, Epic, Cloud-Services

Fri, Nov 2

Andrew added a comment to T208599: Warn cloud users against re-using keys.

s/staff/production users/ ?

Fri, Nov 2, 6:55 PM · Striker, wikitech.wikimedia.org, cloud-services-team (Kanban)
Andrew updated the task description for T208599: Warn cloud users against re-using keys.
Fri, Nov 2, 6:55 PM · Striker, wikitech.wikimedia.org, cloud-services-team (Kanban)
kostajh awarded T208599: Warn cloud users against re-using keys a Love token.
Fri, Nov 2, 4:19 PM · Striker, wikitech.wikimedia.org, cloud-services-team (Kanban)
Andrew merged task T208600: Allow VPS projects to have a domain with the same name into T131367: Proxy corner case: proxy name foo.wmflabs.org == domain name foo.wmflabs.org.
Fri, Nov 2, 3:49 PM · Cloud-VPS
Andrew merged T208600: Allow VPS projects to have a domain with the same name into T131367: Proxy corner case: proxy name foo.wmflabs.org == domain name foo.wmflabs.org.
Fri, Nov 2, 3:49 PM · Patch-For-Review, Horizon
Andrew updated subscribers of T208599: Warn cloud users against re-using keys.
Fri, Nov 2, 3:31 PM · Striker, wikitech.wikimedia.org, cloud-services-team (Kanban)
Andrew created T208599: Warn cloud users against re-using keys.
Fri, Nov 2, 3:31 PM · Striker, wikitech.wikimedia.org, cloud-services-team (Kanban)
Andrew added a comment to T207321: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster.

1/ Will all those hosts need to be in the same vlan/row (e.g. cloud-hosts1-b-eqiad)? Ideally they should be spread across multiple rows to avoid the scenario of one row (a.k.a. failure domain) outage taking the whole service down.

Yeah, they should be spread out. Ideally between at least 3 rows.

Fri, Nov 2, 2:49 PM · Analytics-Kanban, netops, Operations, Analytics

Thu, Nov 1

Andrew added a comment to T206487: Request creation of eventmetrics VPS project.

Krenair is right, sorry

Thu, Nov 1, 9:19 PM · Community-Tech, Grant-Metrics, Cloud-VPS (Project-requests)
Andrew added a comment to T206487: Request creation of eventmetrics VPS project.

Yep, eventmetrics-prod01.eventmetrics.eqiad.wmflabs works for me. The 0.0.0.0/0 policy is a bit broad, you might want to reduce that to 172.16.0.0/21 for ssh.

Thu, Nov 1, 8:47 PM · Community-Tech, Grant-Metrics, Cloud-VPS (Project-requests)
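
To illustrate the ssh suggestion in the comment above, here is a minimal sketch using Python's standard ipaddress module. It only compares the two source ranges named in the comment (0.0.0.0/0 and 172.16.0.0/21); the helper name and the sample addresses are hypothetical, and this is not how the security group itself is configured.

```python
import ipaddress

# The two source ranges mentioned in the comment.
ANYWHERE = ipaddress.ip_network("0.0.0.0/0")
SUGGESTED = ipaddress.ip_network("172.16.0.0/21")


def source_allowed(source_ip: str, rule: ipaddress.IPv4Network) -> bool:
    """Return True if source_ip falls inside the rule's source range.

    Hypothetical helper for illustration only.
    """
    return ipaddress.ip_address(source_ip) in rule


# Size and bounds of the narrower range.
print(SUGGESTED.num_addresses)      # 2048
print(SUGGESTED[0], SUGGESTED[-1])  # 172.16.0.0 172.16.7.255

# Hypothetical sample addresses: one inside the /21, one outside it.
for ip in ("172.16.3.10", "8.8.8.8"):
    print(ip, source_allowed(ip, ANYWHERE), source_allowed(ip, SUGGESTED))
```

With the broad rule every source matches; with the /21 only addresses from 172.16.0.0 through 172.16.7.255 do.
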
Andrew closed Restricted Task, a subtask of T171786: Switch to new labs puppetmasters, as Resolved.
Thu, Nov 1, 6:24 PM · Patch-For-Review, cloud-services-team (Kanban), Operations, Cloud-VPS

Wed, Oct 31

Andrew added a comment to T207321: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster.

A new project is fine.

Wed, Oct 31, 6:31 PM · Analytics-Kanban, netops, Operations, Analytics
Andrew added a comment to T207321: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster.

Option one (easiest for cloud team):

Wed, Oct 31, 6:14 PM · Analytics-Kanban, netops, Operations, Analytics
Andrew added a comment to T207321: Figure out networking details for new cloud-analytics-eqiad Hadoop/Presto cluster.

Just had a great meeting with @chasemp, @faidon, @JAllemandou and @Nuria. The main action item (after Nuria had to go) was to talk with Cloud VPS engineers to see if we could make this cluster on Cloud Virts instead of bare metal in prod. That would be totally fine with us, and actually even preferred. I think we thought this was not possible originally, but if it is, and we can do it within a couple of weeks, we'd like to proceed that way.

So! @bd808 and @Andrew, what do you think? Our planned bare metal resource usage is:

  • 5 x workers: 128G RAM, 48T storage, 48 cores
  • 2 x hadoop masters: 16ishG RAM, 4ish cores, etc. This is flexible.
Wed, Oct 31, 4:06 PM · Analytics-Kanban, netops, Operations, Analytics
Andrew added a comment to T207677: Migrate 'Quarry' project to eqiad1.

@zhuyifei1999 or @Framawiki, can one of you announce this downtime to interested parties? Or at least rattle off a list of contacts here so I can do that?

Wed, Oct 31, 2:01 PM · Patch-For-Review, Quarry, cloud-services-team (Kanban), Cloud-Services
Andrew added a comment to T201247: Sporadic puppet failures.

@Volans, I don't have timestamps, but I do have this from our weekly meeting alert summary:

Wed, Oct 31, 4:29 AM · cloud-services-team (Kanban), Operations

Tue, Oct 30

Andrew claimed T208244: ntp broken in new region.
Tue, Oct 30, 8:15 PM · cloud-services-team (Kanban), Patch-For-Review, Operations, netops, Cloud-VPS
Andrew moved T208244: ntp broken in new region from Inbox to Doing on the cloud-services-team (Kanban) board.
Tue, Oct 30, 8:15 PM · cloud-services-team (Kanban), Patch-For-Review, Operations, netops, Cloud-VPS
Andrew added a project to T208244: ntp broken in new region: cloud-services-team (Kanban).
Tue, Oct 30, 8:14 PM · cloud-services-team (Kanban), Patch-For-Review, Operations, netops, Cloud-VPS
Andrew added a comment to T196209: "Associate Floating IP" button next to instance broken.

When you have a moment, could you retest this in eqiad1-r? I suspect that it has roughly the same failure cases as in eqiad, but the steps should make it clearer if/why it's failing.

Tue, Oct 30, 8:13 PM · cloud-services-team (Kanban), Horizon
Andrew added a comment to T201247: Sporadic puppet failures.

*bump* -- I'm interested in whether anyone is working on fixing these issues. If not, that's fine, but I'll put some more time into ensuring that we don't get pages for them :)

Tue, Oct 30, 7:35 PM · cloud-services-team (Kanban), Operations
Andrew added a comment to T208099: nova: can we expose the creator and virt host of VMs to the public?.

Associated upstream patch: https://review.openstack.org/#/c/614328/

Tue, Oct 30, 6:32 PM · Patch-For-Review, cloud-services-team (Kanban)
Andrew added a comment to T145703: Horizon loses credentials every day.

I just rolled out a new version of Horizon and (at a different time) restarted apache on both of the labweb boxes; in both cases my session persisted.

Tue, Oct 30, 6:16 PM · Cloud-Services, Horizon
Andrew closed T208099: nova: can we expose the creator and virt host of VMs to the public? as Resolved.

Now in Horizon I see "Host cloudvirt1018" in the server overview page. So I think this is done!

Tue, Oct 30, 3:20 PM · Patch-For-Review, cloud-services-team (Kanban)
Andrew added a comment to T41785: Create a Cloud VPS SMTP smarthost.

So how do we want to roll this out? Do it on a per-project basis while moving a project across regions? Just flip the big switch in hieradata/labs.yaml?

Tue, Oct 30, 3:25 AM · User-herron, Patch-For-Review, Operations, Cloud-Services, Mail
Andrew added a comment to T208099: nova: can we expose the creator and virt host of VMs to the public?.

It looks like user_id is already public; revealing the virt host may be just a policy change to extended_server_attributes. I'll run some tests.

Tue, Oct 30, 12:06 AM · Patch-For-Review, cloud-services-team (Kanban)

Mon, Oct 29

Andrew renamed T207101: delete t206636-3 VM and revert quota bumps for project wikidata-query from delete t206636 VM and revert quota bumps for project wikidata-query to delete t206636-3 VM and revert quota bumps for project wikidata-query.
Mon, Oct 29, 8:23 PM · Wikidata, cloud-services-team, Operations, Wikidata-Query-Service
Andrew added a comment to T208244: ntp broken in new region.

Running an ntp server or two on a cloud VM is probably not a big deal. But, before I go down that road... does anyone want to argue against us just using pool.ntp.org for VMs? And, what is the external source of ntp authority that the production NTP servers use?

Mon, Oct 29, 8:05 PM · cloud-services-team (Kanban), Patch-For-Review, Operations, netops, Cloud-VPS
Andrew added a comment to T208244: ntp broken in new region.

I've attached patches that propose running a cloud-specific NTP server. I'd also be OK with changing the network ACLs to allow the new region to access the standard NTP servers (which were being used by the old region).

Mon, Oct 29, 5:51 PM · cloud-services-team (Kanban), Patch-For-Review, Operations, netops, Cloud-VPS

Fri, Oct 26

Andrew added a comment to T208101: Migrate deployment-prep to eqiad1.

@Krenair just rattled off a list of things we'll probably have to tweak by hand:

Fri, Oct 26, 10:20 PM · Beta-Cluster-Infrastructure, Epic, Cloud-Services
Andrew triaged T208101: Migrate deployment-prep to eqiad1 as Normal priority.
Fri, Oct 26, 10:20 PM · Beta-Cluster-Infrastructure, Epic, Cloud-Services
Andrew created T208099: nova: can we expose the creator and virt host of VMs to the public?.
Fri, Oct 26, 10:05 PM · Patch-For-Review, cloud-services-team (Kanban)
Andrew added a comment to T207677: Migrate 'Quarry' project to eqiad1.

How about noon CST on that Monday? (that's probably 17:00 UTC although that week is the week-of-timezone-slip so I can't make any promises)

Fri, Oct 26, 10:03 PM · Patch-For-Review, Quarry, cloud-services-team (Kanban), Cloud-Services
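
The uncertainty about the UTC equivalent of noon Central in the comment above comes down to which offset is in effect that week. A minimal sketch of the arithmetic, using fixed offsets for Central Standard Time (UTC-6) and Central Daylight Time (UTC-5); the year 2018 is an assumption inferred from the dates in this feed, and the function name is hypothetical:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Central time: standard (CST) and daylight (CDT).
CST = timezone(timedelta(hours=-6), "CST")
CDT = timezone(timedelta(hours=-5), "CDT")


def noon_in_utc(year: int, month: int, day: int, tz: timezone) -> datetime:
    """Convert 12:00 local time in the given fixed-offset zone to UTC."""
    local_noon = datetime(year, month, day, 12, 0, tzinfo=tz)
    return local_noon.astimezone(timezone.utc)


# Monday the 5th; the year is an assumption, not stated in the comment.
for tz in (CST, CDT):
    print(tz.tzname(None), noon_in_utc(2018, 11, 5, tz).strftime("%H:%M UTC"))
```

This prints 18:00 UTC for the CST offset and 17:00 UTC for the CDT offset, so the 17:00 UTC figure in the comment holds only if daylight time is still in effect on that Monday.
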
Andrew added a comment to T207677: Migrate 'Quarry' project to eqiad1.

OK, I'll go first :) How about if we schedule downtime for Monday the 5th?

Fri, Oct 26, 9:37 PM · Patch-For-Review, Quarry, cloud-services-team (Kanban), Cloud-Services
Andrew added a comment to T207715: Request increased quota for search Cloud VPS project.

Sorry, that last bug was attached in error.

Fri, Oct 26, 2:40 PM · cloud-services-team (Kanban), Patch-For-Review, Cloud-VPS (Quota-requests)

Thu, Oct 25

Andrew added a comment to T132880: tools.jembot PHP processes run out of memory and leave orphan php-cgi processes regularly.

I confess that I didn't have a super strong case for killing things just now; one of the labvirts was under strain and I saw several (maybe 4-5?) jembot processes running there and ran straight for the hatchet. It would be useful to know how many procs is a normal amount.

Thu, Oct 25, 8:26 PM · Tools
Andrew added a comment to T132880: tools.jembot PHP processes run out of memory and leave orphan php-cgi processes regularly.

I just now killed off all jembot processes and restarted again.

Thu, Oct 25, 8:17 PM · Tools
Andrew reassigned T191362: decom promethium/WMF3571 from Andrew to RobH.
Thu, Oct 25, 4:04 PM · decommission, Operations, DC-Ops, ops-eqiad
Andrew updated the task description for T191362: decom promethium/WMF3571.
Thu, Oct 25, 4:04 PM · decommission, Operations, DC-Ops, ops-eqiad
Andrew added a comment to T206636: Provide a way to have test servers on real hardware, isolated from production for Wikidata Query Service.

I've created a new VM, t206636-2.wikidata-query.eqiad.wmflabs

I see that t206636-2 is listed as having 4 VPU and 24G RAM. That sounds too small for what I'd need - is this accurate?

Thu, Oct 25, 2:53 PM · User-Smalyshev, Wikidata, cloud-services-team, Operations, Wikidata-Query-Service

Wed, Oct 24

Andrew added a comment to T145703: Horizon loses credentials every day.

I believe that this is happening but I don't think it has to do with load-balancing, at least directly. The session keys are held in a memcached pool that is shared between the two hosts. To verify (at least the most obvious case) I just tried this:

Wed, Oct 24, 8:26 PM · Cloud-Services, Horizon
Andrew added a comment to T206261: Routing RFC1918 private IP addresses to/from WMCS floating IPs.

Created T207859

Wed, Oct 24, 4:08 PM · Patch-For-Review, cloud-services-team (Kanban), User-herron, Operations, Cloud-Services, Mail
Andrew triaged T207859: DNS labsaliaser (mostly) no longer needed on Neutron as Normal priority.
Wed, Oct 24, 4:07 PM · cloud-services-team (Kanban)
Andrew added a project to T207859: DNS labsaliaser (mostly) no longer needed on Neutron: cloud-services-team (Kanban).
Wed, Oct 24, 4:07 PM · cloud-services-team (Kanban)
Andrew created T207859: DNS labsaliaser (mostly) no longer needed on Neutron.
Wed, Oct 24, 4:06 PM · cloud-services-team (Kanban)
Andrew added a comment to T206261: Routing RFC1918 private IP addresses to/from WMCS floating IPs.

Does this mean that we no longer need the IP aliaser in eqiad1-r?

Wed, Oct 24, 3:31 PM · Patch-For-Review, cloud-services-team (Kanban), User-herron, Operations, Cloud-Services, Mail
Andrew renamed T177959: Should VPS puppetmasters include labs-recursor0/ns-1 in their resolv.confs? from Should VPS puppetmasters include labs-ns0/ns-1 in their resolv.confs? to Should VPS puppetmasters include labs-recursor0/ns-1 in their resolv.confs?.
Wed, Oct 24, 3:29 PM · cloud-services-team (Kanban)
Andrew added a comment to T177959: Should VPS puppetmasters include labs-recursor0/ns-1 in their resolv.confs?.

Ah, ok. So it sounds like this works! Do you have any concerns?

Wed, Oct 24, 3:29 PM · cloud-services-team (Kanban)
Andrew added a comment to T204551: cloudvps: phlogiston project trusty deprecation.

You should now be able to create new VMs in eqiad1-r. Let me know if you run into any trouble.

Wed, Oct 24, 3:28 PM · Patch-For-Review, Cloud-VPS (Ubuntu Trusty Deprecation), Phlogiston
Andrew added a comment to T177959: Should VPS puppetmasters include labs-recursor0/ns-1 in their resolv.confs?.

Hm, I vaguely think that we should always use the recursors rather than the auth in this case since we're generating IPs for use on a VM, so any IP-swizzling that we do in puppet should be the same as on the VM (which only knows about the recursors).

Wed, Oct 24, 3:06 PM · cloud-services-team (Kanban)
Andrew added a comment to T177959: Should VPS puppetmasters include labs-recursor0/ns-1 in their resolv.confs?.

If lab-ns* servers are down and labpuppetmaster can't resolve anything, what would be the impact?

Wed, Oct 24, 1:29 PM · cloud-services-team (Kanban)

Tue, Oct 23

Andrew added a comment to T207533: Move labs-recursors in WMCS.

Another issue is that we typically ssh via a bastion -- if the bastion is unable to resolve the target host then the connection will fail.

Tue, Oct 23, 8:23 PM · Patch-For-Review, Cloud-VPS, Operations
Andrew added a comment to T207663: Renumber cloud-instance-transport1-b-eqiad to public IPs.

There are currently 23 projects running in the new region, and we're moving more over every day. This would have been a reasonable request when we were originally setting up the Neutron network, but it is far from trivial now.

Tue, Oct 23, 5:40 PM · Patch-For-Review, netops, Cloud-Services, Operations
Andrew added a comment to T207677: Migrate 'Quarry' project to eqiad1.

Can I ask one of you to put up the maintenance message and suggest a window for this move? Anytime during US work hours (let's say after 14:00 UTC) will suit me. Thank you!

Tue, Oct 23, 5:24 PM · Patch-For-Review, Quarry, cloud-services-team (Kanban), Cloud-Services
Andrew added a comment to T207715: Request increased quota for search Cloud VPS project.

Since your VMs will have to be moved to the new region soon anyway, I suggest that you build these fresh instances over there (where you have plenty of quota anyway). That will save us a move later on.

Tue, Oct 23, 4:25 PM · cloud-services-team (Kanban), Patch-For-Review, Cloud-VPS (Quota-requests)
Andrew added a comment to T201247: Sporadic puppet failures.

Spoke too soon, got another failure overnight.

Tue, Oct 23, 2:18 PM · cloud-services-team (Kanban), Operations

Mon, Oct 22

Andrew triaged T207677: Migrate 'Quarry' project to eqiad1 as Normal priority.
Mon, Oct 22, 5:53 PM · Patch-For-Review, Quarry, cloud-services-team (Kanban), Cloud-Services
Andrew closed T207510: 'Detach Interface' option visible on Horizon as Resolved.

This should be fixed -- thanks for noticing @Paladox. Let me know if you find other things like this.

Mon, Oct 22, 5:10 PM · Patch-For-Review, cloud-services-team (Kanban), Horizon
Andrew triaged T207510: 'Detach Interface' option visible on Horizon as Normal priority.
Mon, Oct 22, 4:15 PM · Patch-For-Review, cloud-services-team (Kanban), Horizon
Andrew moved T207510: 'Detach Interface' option visible on Horizon from Inbox to Doing on the cloud-services-team (Kanban) board.
Mon, Oct 22, 4:04 PM · Patch-For-Review, cloud-services-team (Kanban), Horizon
Andrew added a comment to T206636: Provide a way to have test servers on real hardware, isolated from production for Wikidata Query Service.

I've created a new VM, t206636-2.wikidata-query.eqiad.wmflabs. This is in the older region, on a host that is not super busy but is supporting quite a few other VMs. If your tests look good there too then we're probably in good shape and can avoid needing special hardware just for you.

Mon, Oct 22, 3:57 PM · User-Smalyshev, Wikidata, cloud-services-team, Operations, Wikidata-Query-Service
Andrew added a comment to T201247: Sporadic puppet failures.

Things seem better this week! Is that my imagination?

Mon, Oct 22, 3:36 PM · cloud-services-team (Kanban), Operations

Sun, Oct 21

Andrew added a comment to T207533: Move labs-recursors in WMCS.

My only concern about this is that those recursors are used about every second on every VM, so they're a huge, vital point of failure and I'm a bit reluctant to rock the boat.

Sun, Oct 21, 2:16 PM · Patch-For-Review, Cloud-VPS, Operations
Andrew added a comment to T206636: Provide a way to have test servers on real hardware, isolated from production for Wikidata Query Service.

Thanks, Stas. There are two ways I think we can go forward with this:

Sun, Oct 21, 1:47 PM · User-Smalyshev, Wikidata, cloud-services-team, Operations, Wikidata-Query-Service

Fri, Oct 19

Andrew created T207510: 'Detach Interface' option visible on Horizon.
Fri, Oct 19, 7:43 PM · Patch-For-Review, cloud-services-team (Kanban), Horizon
Andrew added a comment to T203072: Request creation of ign2commons VPS project.

Oh, also, what would you like the VM to be named?

Fri, Oct 19, 6:38 PM · cloud-services-team (Kanban), Cloud-VPS (Project-requests)
Andrew added a comment to T203072: Request creation of ign2commons VPS project.

I can create a VM with a large disk allocation any time now. The default for requests like this would be a VM with 24 GB RAM, a 300 GB disk and 4 cores. Will that work for you in the near term?

Fri, Oct 19, 6:38 PM · cloud-services-team (Kanban), Cloud-VPS (Project-requests)

Thu, Oct 18

Andrew added a comment to T206636: Provide a way to have test servers on real hardware, isolated from production for Wikidata Query Service.

note to self, I can merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/468377/ after Stas releases this VM (or at least stops caring about resource contention)

Thu, Oct 18, 6:38 PM · User-Smalyshev, Wikidata, cloud-services-team, Operations, Wikidata-Query-Service
Andrew added a comment to T207327: Change switch config for cloudvirt1018 (formerly labvirt1018) to work with neutron/eqiad1.

Great! Thank you!

Thu, Oct 18, 4:01 PM · cloud-services-team (Kanban), Cloud-Services
Andrew added a comment to T207387: Puppet failures on trusty due to libmonitoring-plugin-perl.

@faidon, I'm not sure I understand your response here. We have an agreed-upon date for the removal of Trusty dependencies, and we are working as fast as we can to hit that date. You seem to have mentally moved that date into the past, which doesn't seem very realistic.

Thu, Oct 18, 3:39 PM · cloud-services-team
Andrew added a comment to T207327: Change switch config for cloudvirt1018 (formerly labvirt1018) to work with neutron/eqiad1.

VM networking does not work properly for this host, so something is still missing.

Thu, Oct 18, 3:02 PM · cloud-services-team (Kanban), Cloud-Services
Andrew added a comment to T207387: Puppet failures on trusty due to libmonitoring-plugin-perl.

shinken-01 is still active and needs to work for now. We're hoping to rebuild it but that's a work in progress.

Thu, Oct 18, 2:15 PM · cloud-services-team
Andrew added a comment to T207327: Change switch config for cloudvirt1018 (formerly labvirt1018) to work with neutron/eqiad1.

No real need to coordinate, you can just do it anytime -- I'll keep any real load off that host until after it's moved.

Thu, Oct 18, 1:43 PM · cloud-services-team (Kanban), Cloud-Services
Andrew created T207387: Puppet failures on trusty due to libmonitoring-plugin-perl.
Thu, Oct 18, 1:30 PM · cloud-services-team

Wed, Oct 17

Andrew added a comment to T41785: Create a Cloud VPS SMTP smarthost.

shinken-01.shinken.eqiad.wmflabs might be a good test.

Wed, Oct 17, 10:51 PM · User-herron, Patch-For-Review, Operations, Cloud-Services, Mail
Andrew updated subscribers of T207327: Change switch config for cloudvirt1018 (formerly labvirt1018) to work with neutron/eqiad1.
Wed, Oct 17, 8:29 PM · cloud-services-team (Kanban), Cloud-Services
Andrew created T207327: Change switch config for cloudvirt1018 (formerly labvirt1018) to work with neutron/eqiad1.
Wed, Oct 17, 8:25 PM · cloud-services-team (Kanban), Cloud-Services
Andrew closed T199125: rack/setup/install cloudvirt102[34] as Resolved.

Both these hosts are now up and running VMs.

Wed, Oct 17, 8:01 PM · cloud-services-team (Kanban), ops-eqiad, Cloud-VPS, Operations
Andrew triaged T207319: labvirt1018 -> cloudvirt1018: update physical label, network port description, netbox as Normal priority.
Wed, Oct 17, 7:29 PM · ops-eqiad, DC-Ops, Cloud-Services, Operations
Andrew created T207317: Rename labvirt1018 to cloudvirt1018, move to eqiad1.
Wed, Oct 17, 6:53 PM · Patch-For-Review, Cloud-Services
Andrew closed T199578: Designate (DNS) integration with Neutron as Resolved.

This is working for now. In a future version we can switch from sink to the neutron integration API.

Wed, Oct 17, 2:48 PM · Epic, Cloud-Services
Andrew closed T199578: Designate (DNS) integration with Neutron, a subtask of T167293: Nova-network to Neutron migration, as Resolved.
Wed, Oct 17, 2:48 PM · Patch-For-Review, Epic, Cloud-Services
Andrew closed T204374: Add port 22 to ferm for the cloud as Resolved.

I think this is done.

Wed, Oct 17, 2:47 PM · Cloud-Services
Andrew closed T206487: Request creation of eventmetrics VPS project as Resolved.

I've created this project. Make sure that you have Horizon switched to the 'eqiad1-r' region before you try to create things.

Wed, Oct 17, 2:36 PM · Grant-Metrics, Community-Tech, Cloud-VPS (Project-requests)

Tue, Oct 16

MusikAnimal awarded T205158: Mail relays needed for VMs in eqiad1 a Love token.
Tue, Oct 16, 7:44 PM · User-herron, Operations, Mail