eqiad: 2 VMs for cloudbackup-dev
Closed, Resolved, Public

Description

Cloud VPS Project Tested: ---
Site/Location: EQIAD
Number of systems: 2
Service: cloudbackup-dev
Networking Requirements: internal
Processor Requirements: 2
Memory: 4G
Disks: 5G for the base operating system; 20G on a separate drive for LVM to store cinder backups
Other Requirements:

rationale

We're developing a new backup solution in Cloud VPS for cinder volumes, which will unblock the migration of NFS servers into the cloud realm (see parent task).

We have several other cloudbackup servers, including 2 in codfw:

  • cloudbackup2001.codfw.wmnet (200TB storage)
  • cloudbackup2002.codfw.wmnet (200TB storage)

These 2 new VMs (cloudbackupXXXX-dev.eqiad.wmnet) are meant to serve as the mirror of the cloudbackup servers in codfw.

In other words:

  • the OpenStack deployment eqiad1 will use cloudbackup servers in codfw to store backup data <-- not related to this ticket
  • the OpenStack deployment codfw1dev will use cloudbackup-dev servers in eqiad to store fake backup data (we only need about 20G of storage to test the system) <-- this ticket
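
The cross-site pairing above can be sketched as a small lookup. This is illustrative only: the function name `backup_hosts_for` is hypothetical, and the two `-dev` hostnames are taken from the patches later in this task, not from any existing script.

```shell
# Hypothetical helper: print the backup servers used by an OpenStack deployment.
# Each deployment stores its backups in the opposite datacenter.
backup_hosts_for() {
    case "$1" in
        eqiad1)
            # production deployment -> existing codfw servers
            echo "cloudbackup2001.codfw.wmnet cloudbackup2002.codfw.wmnet" ;;
        codfw1dev)
            # dev deployment -> the eqiad VMs requested in this task
            echo "cloudbackup1001-dev.eqiad.wmnet cloudbackup1002-dev.eqiad.wmnet" ;;
        *)
            echo "unknown deployment: $1" >&2
            return 1 ;;
    esac
}

backup_hosts_for codfw1dev
```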

Details

Repo               Branch      Lines +/-
operations/puppet  production  +26 -6
operations/puppet  production  +10 -1
operations/puppet  production  +1 -9
operations/puppet  production  +32 -0
operations/puppet  production  +30 -23
operations/puppet  production  +6 -1
operations/puppet  production  +19 -5
operations/puppet  production  +1 -1
operations/puppet  production  +2 -6
operations/puppet  production  +1 -0
operations/puppet  production  +3 -0
operations/puppet  production  +47 -0
labs/private       master      +9 -4
labs/private       master      +1 -1
labs/private       master      +4 -0
labs/private       master      +3 -1
labs/private       master      +6 -0
operations/puppet  production  +3 -1
operations/puppet  production  +5 -0
operations/puppet  production  +5 -0
operations/puppet  production  +29 -0

Event Timeline

Change 738376 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloud: introduce role for cloudbackup-dev

https://gerrit.wikimedia.org/r/738376

Sounds reasonable to me.

As @MoritzMuehlenhoff pointed out on previous requests, we should not go below a certain size for the OS partition. I assume you don't mind if it's 10G or more instead of 5G.

The second disk can be added and mounted after the VMs have been created (caveat: this will likely stop the VMs from booting until we log in via console and manually adjust the renumbered devices in /etc/fstab).
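
One way to gauge this risk before adding the disk is to look for /etc/fstab entries keyed on raw device names, since entries that use UUID= or LABEL= are not affected by renumbering. A minimal sketch follows; the function name is hypothetical, and the device-name pattern (sd*, vd*, xvd*) is an assumption about how disks appear inside the VM.

```shell
# Hypothetical check: list fstab entries whose first field is a raw device
# name such as /dev/sda1 or /dev/vdb. These names can change when a disk is
# added. UUID=, LABEL= and /dev/mapper entries are not reported.
fstab_unstable_entries() {
    fstab="${1:-/etc/fstab}"
    # Print "device mountpoint" for each matching entry; comment lines and
    # blank lines never match because their first field is not a /dev path.
    awk '$1 ~ /^\/dev\/(sd|vd|xvd)[a-z]+[0-9]*$/ { print $1, $2 }' "$fstab"
}

# Usage inside the VM:
#   fstab_unstable_entries /etc/fstab
```

Any entry it reports could be switched to its UUID (as printed by `blkid`) before the second disk is attached.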

OK! We don't really care about the OS drive size. What's important here is the extra drive for LVM, which should have at least 20G.

You create the VMs or I do? I never did it before, but the docs seem nice.

> OK! We don't really care about the OS drive size. What's important here is the extra drive for LVM, which should have at least 20G.

How long do you expect those to be around? Is this a short test or will they be around continuously? If the latter, please use 20G for the OS partition, but regardless use at least 10G. 5G causes all kinds of needless disk space alerts (e.g. when new kernel updates are installed).
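
The sizing guidance in this comment can be written down as a small check. This is only a sketch of the rule as stated here (never below 10G, 20G for VMs that stay around); the function name and the yes/no argument are hypothetical.

```shell
# Hypothetical check of a requested OS disk size (in GB) against the guidance
# in this thread: at least 10G always, and 20G for long-lived VMs.
check_os_disk() {
    size_gb="$1"
    long_lived="$2"   # "yes" or "no"
    if [ "$size_gb" -lt 10 ]; then
        echo "too small: use at least 10G"
        return 1
    fi
    if [ "$long_lived" = "yes" ] && [ "$size_gb" -lt 20 ]; then
        echo "long-lived VM: use 20G"
        return 1
    fi
    echo "ok"
}

# Usage:
#   check_os_disk 5 no     # the size originally requested in this task
#   check_os_disk 20 yes
```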

> You create the VMs or I do? I never did it before, but the docs seem nice.

It's self-managed by anyone in SRE, but if you run into any issues, just ping the I channel on IRC :-)

The VM gets created with the sre.ganeti.makevm cookbook and you can add the second disk following https://wikitech.wikimedia.org/wiki/Ganeti#Adding_a_disk

aborrero triaged this task as Medium priority.

> OK! We don't really care about the OS drive size. What's important here is the extra drive for LVM, which should have at least 20G.

> How long do you expect those to be around? Is this a short test or will they be around continuously? If the latter, please use 20G for the OS partition, but regardless use at least 10G. 5G causes all kinds of needless disk space alerts (e.g. when new kernel updates are installed).

Ok, makes sense.

We expect the VMs to be around continuously for as long as we use the same setup in eqiad1.

> You create the VMs or I do? I never did it before, but the docs seem nice.

> It's self-managed by anyone in SRE, but if you run into any issues, just ping the I channel on IRC :-)

> The VM gets created with the sre.ganeti.makevm cookbook and you can add the second disk following https://wikitech.wikimedia.org/wiki/Ganeti#Adding_a_disk

Thanks! Will do soon.

Change 738376 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloud: introduce role for cloudbackup-dev

https://gerrit.wikimedia.org/r/738376

Change 739535 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudbackup100X-dev: add insetup role

https://gerrit.wikimedia.org/r/739535

Change 739535 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudbackup100X-dev: add insetup role

https://gerrit.wikimedia.org/r/739535

Created 1 VM with:

aborrero@cumin1001:~ $ sudo cookbook sre.ganeti.makevm eqiad_B cloudbackup1001-dev --vcpus 2 --memory 4

No errors in the run.
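
Since the task asks for two identically sized VMs, the same invocation could be looped over both hostnames. The sketch below only reuses the arguments shown in the command above; the `run` dry-run wrapper and the `make_backup_vms` name are hypothetical, and by default nothing is executed, the commands are only printed.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create both VMs requested in this task with the same sizing
# (2 vCPUs, 4G memory) in the eqiad_B group, as in the run above.
make_backup_vms() {
    for host in cloudbackup1001-dev cloudbackup1002-dev; do
        run sudo cookbook sre.ganeti.makevm eqiad_B "$host" --vcpus 2 --memory 4
    done
}

make_backup_vms
```

Set `DRY_RUN=0` on a cumin host to actually run the cookbook instead of printing the commands.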

Change 739581 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudbackup1001-dev: update DHCP config

https://gerrit.wikimedia.org/r/739581

Change 739581 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudbackup1001-dev: update DHCP config

https://gerrit.wikimedia.org/r/739581

Change 739591 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] site.pp: enable proper role for cloudbackup1001-dev

https://gerrit.wikimedia.org/r/739591

Change 739591 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] site.pp: enable proper role for cloudbackup1001-dev

https://gerrit.wikimedia.org/r/739591

Change 739595 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[labs/private@master] hieradata: openstack: codfw1dev backups: add ldap_user_pass secret

https://gerrit.wikimedia.org/r/739595

Change 739595 merged by Arturo Borrero Gonzalez:

[labs/private@master] hieradata: openstack: codfw1dev backups: add ldap_user_pass secret

https://gerrit.wikimedia.org/r/739595

Change 739599 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloud: codfw1dev: hiera update for new backup servers

https://gerrit.wikimedia.org/r/739599

Change 739777 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[labs/private@master] hiera: cloud: update secrets for cinder-backups @ codfw1dev

https://gerrit.wikimedia.org/r/739777

Change 739777 merged by Arturo Borrero Gonzalez:

[labs/private@master] hiera: cloud: update secrets for cinder-backups @ codfw1dev

https://gerrit.wikimedia.org/r/739777

Change 739779 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[labs/private@master] hiera: cloud: add more secrets for cinder-backups @ codfw1dev

https://gerrit.wikimedia.org/r/739779

Change 739779 merged by Arturo Borrero Gonzalez:

[labs/private@master] hiera: cloud: add more secrets for cinder-backups @ codfw1dev

https://gerrit.wikimedia.org/r/739779

Change 739782 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[labs/private@master] hiera: cloud: fix ceph hiera key name for cinder-backups @ codfw1dev

https://gerrit.wikimedia.org/r/739782

Change 739782 merged by Arturo Borrero Gonzalez:

[labs/private@master] hiera: cloud: fix ceph hiera key name for cinder-backups @ codfw1dev

https://gerrit.wikimedia.org/r/739782

Change 739792 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[labs/private@master] hiera: cloudbackup1001-dev: relocate ceph auth config

https://gerrit.wikimedia.org/r/739792

Change 739792 merged by Arturo Borrero Gonzalez:

[labs/private@master] hiera: cloudbackup1001-dev: relocate ceph auth config

https://gerrit.wikimedia.org/r/739792

Change 739599 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloud: codfw1dev: hiera update for new backup servers

https://gerrit.wikimedia.org/r/739599

Change 739870 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudbackup: introduce ceph rbd configuration

https://gerrit.wikimedia.org/r/739870

Change 739870 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudbackup: introduce ceph rbd configuration

https://gerrit.wikimedia.org/r/739870

Change 739892 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudbackup: introduce base profiles

https://gerrit.wikimedia.org/r/739892

Change 739892 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudbackup: introduce base profiles

https://gerrit.wikimedia.org/r/739892

Change 739895 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudbackup: specify administrative contact

https://gerrit.wikimedia.org/r/739895

Change 739895 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudbackup: specify administrative contact

https://gerrit.wikimedia.org/r/739895

Change 739903 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudbackup: require the lvm package

https://gerrit.wikimedia.org/r/739903

Change 739903 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudbackup: require the lvm package

https://gerrit.wikimedia.org/r/739903

Change 739905 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudbackup: fix disk allocation

https://gerrit.wikimedia.org/r/739905

Change 739905 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudbackup: fix disk allocation

https://gerrit.wikimedia.org/r/739905

Change 740107 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloud: cinder-backups: refresh node references

https://gerrit.wikimedia.org/r/740107

Change 740107 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloud: cinder-backups: refresh node references

https://gerrit.wikimedia.org/r/740107

Change 740131 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloud: cinder-backup: refresh hiera config for eqiad1

https://gerrit.wikimedia.org/r/740131

Change 740131 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloud: cinder-backup: refresh hiera defaults for eqiad1

https://gerrit.wikimedia.org/r/740131

Change 740535 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloud: cinder-backups: fix configuration values

https://gerrit.wikimedia.org/r/740535

Change 740535 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloud: cinder-backups: fix configuration values

https://gerrit.wikimedia.org/r/740535

Change 740551 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloud: cinder-backups: use main ceph cinder keyring

https://gerrit.wikimedia.org/r/740551

Change 740820 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloud: introduce cloudbackup1002-dev.eqiad.wmnet

https://gerrit.wikimedia.org/r/740820

Change 740820 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloud: introduce cloudbackup1002-dev.eqiad.wmnet

https://gerrit.wikimedia.org/r/740820

Created the second VM:

aborrero@cumin1001:~ $ sudo cookbook sre.ganeti.makevm eqiad_B cloudbackup1002-dev --vcpus 2 --memory 4
[..]
aborrero@ganeti1009:~ $ sudo gnt-instance modify --disk add:size=20g cloudbackup1002-dev
[..]
aborrero@ganeti1009:~ $ sudo gnt-instance startup cloudbackup1002-dev
Waiting for job 1461518 for cloudbackup1002-dev.eqiad.wmnet ...

etc.
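
Once the extra 20G disk is attached, it still has to be handed to LVM inside the VM. The sketch below shows what that could look like if done by hand; it is an assumption, not what was run here, and Puppet may well manage this instead (see the "require the lvm package" and "fix disk allocation" patches). The device path `/dev/sdb`, the volume group name `backup`, and the helper names are all hypothetical. By default the commands are only printed.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical post-boot steps inside the VM: confirm the new disk is visible,
# then initialise it as an LVM physical volume and create a volume group on it.
setup_backup_vg() {
    disk="${1:-/dev/sdb}"   # assumed device name; verify with lsblk first
    vg="${2:-backup}"       # assumed volume group name
    run lsblk "$disk"
    run sudo pvcreate "$disk"
    run sudo vgcreate "$vg" "$disk"
}

setup_backup_vg
```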

Change 740551 abandoned by Arturo Borrero Gonzalez:

[operations/puppet@production] cloud: cinder-backups: use main ceph cinder keyring

Reason:

a different patch was merged, see ed5658a51946148376ec19a6474d5e972bb34167

https://gerrit.wikimedia.org/r/740551

Change 745588 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cloudbackup1002-dev: mark as spare for now

https://gerrit.wikimedia.org/r/745588

Change 745588 merged by Andrew Bogott:

[operations/puppet@production] cloudbackup1002-dev: mark as spare for now

https://gerrit.wikimedia.org/r/745588