
Convert labsdb1012 from multi-source to multi-instance
Open, High, Public

Description

We are re-architecting how the wikireplicas work and, as part of that process, we are deprecating multi-source for many well-known reasons.
labsdb1009-labsdb1011 will be replaced by clouddb hosts: T267090

labsdb1012 needs to be converted from multi-source to multi-instance.
@Bstorm has worked a lot to create some puppet roles for these new hosts, so I guess labsdb1012 will need to get one of them: https://gerrit.wikimedia.org/r/c/operations/puppet/+/639815

@elukey we are still far from having to tackle this (we are aiming for the end of March to have everything ready), but can you take a look from your side to see whether anything else is needed puppet-wise, or whether what Brooke has already developed is enough?

Note: labsdb1012 cannot be taken down during the first week of the month.

Details

Project                  Branch      Lines +/-
operations/puppet        production  +0 -1
operations/puppet        production  +3 -3
operations/puppet        production  +2 -0
operations/puppet        production  +0 -5
operations/homer/public  master      +3 -3
operations/puppet        production  +7 -7
operations/puppet        production  +2 -2
operations/puppet        production  +7 -7
operations/puppet        production  +2 -2
operations/puppet        production  +7 -7
operations/puppet        production  +2 -2
operations/puppet        production  +0 -1
operations/puppet        production  +7 -7
operations/puppet        production  +7 -7
operations/puppet        production  +7 -7
operations/puppet        production  +7 -7
operations/puppet        production  +10 -0
operations/puppet        production  +45 -20
operations/puppet        production  +5 -1
operations/puppet        production  +4 -0
operations/puppet        production  +1 -8
operations/puppet        production  +0 -3
operations/puppet        production  +14 -1

Event Timeline


@Marostegui I see in https://phabricator.wikimedia.org/T260441 that you handled the other hosts, should we just use the db.cfg partman config in https://gerrit.wikimedia.org/r/c/operations/puppet/+/661529 ? Any other thing to keep in mind?

Yes, that one should be fine. It will nuke everything but won't touch the raid level or anything else.
Even if you don't use a recipe, that should work. It will not format /srv but we can delete the data manually later on, before starting the transfer.

Up to you!

@elukey thanks for your comments; I edited the plan comment.

Mentioned in SAL (#wikimedia-operations) [2021-03-05T15:56:27Z] <razzi> stop mariadb on labsdb1012 to reimage and rename to clouddb1021: T269211

Mentioned in SAL (#wikimedia-analytics) [2021-03-05T16:08:35Z] <razzi> sudo cookbook sre.hosts.decommission labsdb1012.eqiad.wmnet -t T269211

cookbooks.sre.hosts.decommission executed by razzi@cumin1001 for hosts: labsdb1012.eqiad.wmnet

  • labsdb1012.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 663865 merged by Razzi:
[operations/puppet@production] Remove labsdb1012 from puppet in preparation for rename

https://gerrit.wikimedia.org/r/663865

Mentioned in SAL (#wikimedia-analytics) [2021-03-05T16:54:50Z] <razzi> sudo cookbook sre.dns.netbox -t T269211 "Reimage and rename labsdb1012 to clouddb1021"

Change 661529 merged by Razzi:
[operations/puppet@production] wikireplicas: Add basic configuration for clouddb1021

https://gerrit.wikimedia.org/r/661529

Mentioned in SAL (#wikimedia-analytics) [2021-03-05T17:07:53Z] <razzi> sudo -i wmf-auto-reimage-host -p T269211 clouddb1021.eqiad.wmnet --new

Script wmf-auto-reimage was launched by razzi on cumin1001.eqiad.wmnet for hosts:

clouddb1021.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202103051708_razzi_31992_clouddb1021_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['clouddb1021.eqiad.wmnet']

Of which those FAILED:

['clouddb1021.eqiad.wmnet']

Mentioned in SAL (#wikimedia-analytics) [2021-03-05T18:18:44Z] <razzi> sudo cookbook sre.dns.netbox -t T269211 "Move clouddb1021 to private vlan"

Script wmf-auto-reimage was launched by razzi on cumin1001.eqiad.wmnet for hosts:

clouddb1021.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202103051830_razzi_12804_clouddb1021_eqiad_wmnet.log.

Mentioned in SAL (#wikimedia-analytics) [2021-03-05T18:30:57Z] <razzi> run again sudo -i wmf-auto-reimage-host -p T269211 clouddb1021.eqiad.wmnet --new

Need to follow up on "Force PXE"

18:45:13 | clouddb1021.eqiad.wmnet | WARNING: unable to verify that BIOS boot parameters are back to normal, got:
Boot parameter version: 1
Boot parameter 5 is valid/unlocked
Boot parameter data: 0004000000
 Boot Flags :
   - Boot Flag Invalid
   - Options apply to only next boot
   - BIOS PC Compatible (legacy) boot
   - Boot Device Selector : Force PXE
   - Console Redirection control : System Default
   - BIOS verbosity : Console redirection occurs per BIOS configuration setting (default)
   - BIOS Mux Control Override : BIOS uses recommended setting of the mux at the end of POST

Completed auto-reimage of hosts:

['clouddb1021.eqiad.wmnet']

and were ALL successful.

@razzi is this host ready for getting data on it?

@Marostegui on Friday we only quickly checked that the /srv partition was fine, etc. We ran into some issues with VLANs that took a bit longer than expected (labsdb1012 was in the cloud VLAN; we moved clouddb1021 to the private one, like the other clouddb nodes). If everything looks all right to you on the host, please kick off the copy! The plan is to add the new instances via puppet today or tomorrow; should we wait for your copy?

It should be fine to add instances while I do the first copies. All that will happen is that puppet will attempt to create /srv/sqldata.sX; if those directories already exist (because the transfer has started), that step will just be skipped. If they are not there, puppet will create them as empty directories.
I am going to start the transfer of the first two instances then (s1 and s3).
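
To illustrate the "create if missing, skip if present" behaviour described above, here is a minimal Python sketch. It is not the Puppet code; the function name `ensure_section_dirs` is hypothetical, and the demo runs against a throwaway temporary directory rather than the real /srv.

```python
import os
import tempfile


def ensure_section_dirs(base, sections):
    """Create base/sqldata.<section> for each section unless it already exists.

    Returns the list of directories that were newly created.
    """
    created = []
    for section in sections:
        path = os.path.join(base, f"sqldata.{section}")
        if os.path.isdir(path):
            # Already present (for example, populated by a transfer): leave it alone.
            continue
        os.makedirs(path)
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate on a temporary directory instead of the real /srv.
    with tempfile.TemporaryDirectory() as base:
        # Pretend the s1 transfer has already created its directory.
        os.makedirs(os.path.join(base, "sqldata.s1"))
        print(ensure_section_dirs(base, ["s1", "s3"]))
```

Running it a second time against the same base directory creates nothing, which is why adding instances in puppet while a transfer is in flight should be harmless.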

Change 669628 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] dbproxy1018: Depool clouddb1013

https://gerrit.wikimedia.org/r/669628

Change 669628 merged by Marostegui:
[operations/puppet@production] dbproxy1018: Depool clouddb1013

https://gerrit.wikimedia.org/r/669628

Change 669633 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] dbproxy1019: Depool clouddb1013

https://gerrit.wikimedia.org/r/669633

Change 669633 merged by Marostegui:
[operations/puppet@production] dbproxy1019: Depool clouddb1013

https://gerrit.wikimedia.org/r/669633

Mentioned in SAL (#wikimedia-operations) [2021-03-08T07:32:24Z] <marostegui> Depool clouddb1013:3311, clouddb1013:3313 - T269211

For the record I just ran:

root@clouddb1021:/srv# pvs
  PV         VG   Fmt  Attr PSize  PFree
  /dev/sda3  tank lvm2 a--  13.92t <4.83t
root@clouddb1021:/srv# lvextend -L+4800G /dev/mapper/tank-data
  Size of logical volume tank/data changed from 9.09 TiB (2384188 extents) to 13.78 TiB (3612988 extents).
  Logical volume tank/data successfully resized.
root@clouddb1021:/srv# xfs_growfs /srv
meta-data=/dev/mapper/tank-data  isize=512    agcount=32, agsize=76294016 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=2441408512, imaxpct=5
         =                       sunit=64     swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 2441408512 to 3699699712
root@clouddb1021:/srv# df -hT /srv
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs    14T   15G   14T   1% /srv
root@clouddb1021:/srv# pvs
  PV         VG   Fmt  Attr PSize  PFree
  /dev/sda3  tank lvm2 a--  13.92t <145.81g
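
As a sanity check on the numbers in the output above, the sizes can be recomputed from the extent and block counts. This is only a sketch: the 4 MiB extent size is an assumption (it is the LVM default and is consistent with the figures printed), and the variable names are made up for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

EXTENT_SIZE = 4 * MIB   # assumption: LVM default physical extent size
XFS_BLOCK_SIZE = 4096   # bsize=4096 in the xfs_growfs output


def extents_to_bytes(extents):
    """Convert a count of LVM extents to bytes."""
    return extents * EXTENT_SIZE


old_extents, new_extents = 2384188, 3612988      # from the lvextend output
old_blocks, new_blocks = 2441408512, 3699699712  # from the xfs_growfs output

added_gib = extents_to_bytes(new_extents - old_extents) / GIB
old_tib = extents_to_bytes(old_extents) / TIB
new_tib = extents_to_bytes(new_extents) / TIB

print(f"added: {added_gib:.0f} GiB")
print(f"LV size: {old_tib:.2f} TiB -> {new_tib:.2f} TiB")
# After xfs_growfs the filesystem should cover the whole logical volume.
print("XFS matches LV:", new_blocks * XFS_BLOCK_SIZE == extents_to_bytes(new_extents))
```

Under that assumption the added space comes out to the 4800G requested with -L+4800G, and the filesystem block count after xfs_growfs matches the new logical volume size exactly.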

Transfer from clouddb1013 (s1 and s3) to clouddb1021 is now on-going

Removed labsdb1012 from tendril and zarcillo

@elukey @razzi the data for s1 and s3 has been transferred and I have moved their data directories to their final location.
I am not going to copy more stuff until you've changed puppet and the host is ready to get mysql up and running, just in case. Once that is done and I have configured replication and checked that everything is fine, I will continue the data movement.
Let me know once I can proceed and start the daemon and configure replication.

Change 668494 merged by Razzi:
[operations/puppet@production] wikireplicas: give analytics_multiinstance role to clouddb1021

https://gerrit.wikimedia.org/r/668494

@Marostegui clouddb1021 has the analytics_multiinstance role applied, is configured to expect data on s1 and s3 only (https://gerrit.wikimedia.org/r/c/operations/puppet/+/668494/3/hieradata/hosts/clouddb1021.yaml) and is scheduled for downtime through EU morning, just in case :)

So mysql can be started and replication enabled for those sections. If you log in, you'll see an old welcome message / message-of-the-day, due to an unrelated issue: https://phabricator.wikimedia.org/T276868, so don't worry that it says the node is still insetup.

Thanks @razzi. I have ack'ed the alerts on icinga and I will start working with these sections.
If I successfully get them up and running today I will be adding two more and I will also merge a puppet patch with those ones.

Change 670030 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] dbproxy1019: Depool clouddb1014:3312, clouddb1014:3317

https://gerrit.wikimedia.org/r/670030

Change 670030 merged by Marostegui:
[operations/puppet@production] dbproxy1019: Depool clouddb1014:3312, clouddb1014:3317

https://gerrit.wikimedia.org/r/670030

s1 and s3 are now replicating on clouddb1021 (pending enabling GTID - will do it once replication is in sync). Host added to tendril and to zarcillo.
s2 and s7 are being transferred from clouddb1014.

@razzi I have found some issues:

  • clouddb1021:3311 and 3313 (and I assume the rest of the ports) aren't accessible from cumin1001, so I reckon there must be a FW rule somewhere.
  • clouddb1021 has an IPv6 DNS record, which will probably trigger some grant errors (to be finally checked once the host is accessible from cumin1001) and will likely run into T270101. The workaround for that is to remove that entry from netbox. But let's fix the FW issue first and then we can see if we run into this one.
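
For anyone wanting to repeat the reachability check, a rough Python sketch follows. It assumes the per-section ports follow the 3310 + N pattern seen in this task (3311 for s1, 3313 for s3, and so on); the helper names `section_port`, `port_is_open` and `check_sections` are hypothetical. It only tests whether a TCP connection can be opened, not whether MariaDB grants allow a login.

```python
import socket


def section_port(section):
    """Map a section name like 's3' to its port, assuming the 3310 + N pattern."""
    return 3310 + int(section[1:])


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_sections(host, sections=("s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8")):
    """Return {section: bool} describing which section ports accept connections."""
    return {s: port_is_open(host, section_port(s)) for s in sections}
```

Calling `check_sections("clouddb1021.eqiad.wmnet")` from cumin1001 would then show at a glance which section ports are blocked.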

Changed clouddb1021 from planned to active on netbox.

Change 670034 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: Do not reimage clouddb1021

https://gerrit.wikimedia.org/r/670034

Change 670046 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] clouddb1021: Change buffer pool sizes

https://gerrit.wikimedia.org/r/670046

Change 670046 merged by Marostegui:
[operations/puppet@production] clouddb1021: Change buffer pool sizes

https://gerrit.wikimedia.org/r/670046

I have adjusted the buffer pool sizes a bit. They might need further changes, but we'll only know once the sqoops run.

Change 670034 merged by Marostegui:
[operations/puppet@production] install_server: Do not reimage clouddb1021

https://gerrit.wikimedia.org/r/670034

Change 670088 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] clouddb1021: Add s2 and s7 to clouddb1021

https://gerrit.wikimedia.org/r/670088

Change 670088 merged by Marostegui:
[operations/puppet@production] clouddb1021: Add s2 and s7 to clouddb1021

https://gerrit.wikimedia.org/r/670088

Change 670092 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] dbproxy1019: Depool clouddb1015:3314, clouddb1015:3316

https://gerrit.wikimedia.org/r/670092

Change 670092 merged by Marostegui:
[operations/puppet@production] dbproxy1019: Depool clouddb1015:3314, clouddb1015:3316

https://gerrit.wikimedia.org/r/670092

Change 670100 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] clouddb1021: Add s4 and s6

https://gerrit.wikimedia.org/r/670100

Change 670100 merged by Marostegui:
[operations/puppet@production] clouddb1021: Add s4 and s6

https://gerrit.wikimedia.org/r/670100

Change 670342 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] dbproxy1019: Depool clouddb1016 (s5 and s8)

https://gerrit.wikimedia.org/r/670342

Change 670342 merged by Marostegui:
[operations/puppet@production] dbproxy1019: Depool clouddb1016 (s5 and s8)

https://gerrit.wikimedia.org/r/670342

Change 670417 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] clouddb1021: Enable s5 and s8

https://gerrit.wikimedia.org/r/670417

Change 670417 merged by Marostegui:
[operations/puppet@production] clouddb1021: Enable s5 and s8

https://gerrit.wikimedia.org/r/670417

s5 and s8 are now up and replicating

Marostegui moved this task from In progress to Done on the DBA board.

All the sections have been started and are now in sync with their masters.
I have run a private data check to make sure everything is OK (the data was copied from other clouddb* hosts, but just in case).
The Data-Persistence part is done; next would be for cloud-services-team to create the views on all the wikis plus the centralauth database (which lives in s7).
This host has notifications disabled, so they should be enabled once Analytics feels it is ready.

@razzi can you follow up with @Bstorm about the next steps? :)

Sounds good @elukey.

Thanks for your speedy data population @Marostegui! Responding to the firewall issue you raised above, I was able to connect to clouddb1021 from cumin1001 on all ports; let me know if you're still running into issues. I was also able to resolve an IPv4 address of clouddb1021 from cumin1001 via dig, so I'm not sure if IPv6 will be an issue.

@Bstorm do you have what you need to create all the views? Let me know how I can be helpful with this part!

@razzi yeah, it was all fixed by removing the DNS IPv6 record. Nothing else required.

I went ahead and refreshed the view definitions on the host because there have been a few changes during this process, just to make sure they are up to date. I think it's probably all set to go now.

Change 672797 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/homer/public@master] Replace labsdb1012 with clouddb1021 in analytics-in4

https://gerrit.wikimedia.org/r/672797

Change 672797 merged by Razzi:
[operations/homer/public@master] Replace labsdb1012 with clouddb1021 in analytics-in4

https://gerrit.wikimedia.org/r/672797

Change 674097 had a related patch set uploaded (by Razzi; owner: Razzi):
[operations/puppet@production] refinery: Rename --labsdb flag to be --clouddb

https://gerrit.wikimedia.org/r/674097

Change 674182 had a related patch set uploaded (by Razzi; owner: Razzi):
[operations/puppet@production] site: remove decommissioned node labsdb1012

https://gerrit.wikimedia.org/r/674182

Change 674182 merged by Razzi:
[operations/puppet@production] site: remove decommissioned node labsdb1012

https://gerrit.wikimedia.org/r/674182

Change 674194 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] wiki-replicas.sql: Add analytics user

https://gerrit.wikimedia.org/r/674194

Change 674194 merged by Marostegui:
[operations/puppet@production] wiki-replicas.sql: Add analytics user

https://gerrit.wikimedia.org/r/674194

Change 674097 merged by Razzi:
[operations/puppet@production] refinery: Rename --labsdb flag to be --clouddb

https://gerrit.wikimedia.org/r/674097

How did the first sqoop run go?

Looks like it took about 4 hours longer than the previous run (44 instead of 40), which is totally fine. And no errors or glitches I can see, so all good. Thanks very much for your help!

That's excellent, are we good to close this?

Almost! There are a couple of things left:

  • clouddb1021 is still running with icinga notifications disabled, plus there is a WARNING related to Mariadb Memory usage that needs to be tweaked for our use case (since we basically use all the RAM available). @razzi can you check when you have a moment?
  • @JAllemandou may have some performance questions to add related to indexes IIRC, leaving a note in here to remember to check with him :)

He opened T279095 for the performance issue, so we can work on that there (to untangle this task)

I just realised that the check isn't even realistic as we have 8 mysqld processes, not just one.

@Marostegui could you expand on why the check isn't realistic? From what I can tell all it's monitoring is the total used memory, which shouldn't be affected by the number of mysqld processes.

The memory allocated to the sections is 70G + 40 + 40 + 70 + 40 + 30 + 50 + 70 = 410G, which should in theory be comfortably smaller than the 503G the machine has, but I notice some sections are taking significantly more memory than they're supposed to:

razzi@clouddb1021:~$ top -c -b -o +%MEM | head -n 20
...
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 4992 mysql     20   0   82.3g  75.6g  15964 S  11.8  15.0  11309:25 /opt/wmf-mariadb104/bin/mysqld --defaults-group-suffix=@s4 +
 7941 mysql     20   0   81.2g  75.4g  17116 S   5.9  15.0  10059:11 /opt/wmf-mariadb104/bin/mysqld --defaults-group-suffix=@s1 +
27870 mysql     20   0   81.7g  74.6g  15548 S  11.8  14.8  12397:55 /opt/wmf-mariadb104/bin/mysqld --defaults-group-suffix=@s8 +
19974 mysql     20   0   77.0g  71.0g  15996 S  17.6  14.1   7064:00 /opt/wmf-mariadb104/bin/mysqld --defaults-group-suffix=@s3 +
29117 mysql     20   0   60.5g  54.0g  15580 S   5.9  10.7   4994:21 /opt/wmf-mariadb104/bin/mysqld --defaults-group-suffix=@s7 +
28982 mysql     20   0   49.4g  43.2g  15672 S   5.9   8.6   5951:29 /opt/wmf-mariadb104/bin/mysqld --defaults-group-suffix=@s2 +
27794 mysql     20   0   49.5g  42.9g  16920 S   5.9   8.5   3054:12 /opt/wmf-mariadb104/bin/mysqld --defaults-group-suffix=@s5 +
 4844 mysql     20   0   38.9g  32.4g  16156 S   5.9   6.4   4857:51 /opt/wmf-mariadb104/bin/mysqld --defaults-group-suffix=@s6 +

For example s3 is supposed to have 40G but its resident memory is a full 71.0g. Is that suspiciously high?

We should probably rename the check to make it less confusing. Something to make it obvious it is the overall memory used by mariadb and not a single process.

It is not really suspicious. mysql isn't supposed to have 40GB; the innodb buffer pool is, but mysql will be using more than that (other buffers, connections...). There's a mistaken belief that all the memory mysql is supposed to use is just the innodb buffer pool, but it really isn't.
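
To put numbers on that, here is a small Python sketch comparing the configured buffer pool of each section with the resident memory in the top output. One assumption: the buffer pool figures quoted in this thread are taken to be in s1..s8 order, which is consistent with s3 being 40G but is not confirmed here.

```python
# Configured InnoDB buffer pool sizes (G), in the order quoted in this thread.
# Assumption: that order is s1..s8.
buffer_pool_gb = dict(zip(
    ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"],
    [70, 40, 40, 70, 40, 30, 50, 70],
))

# Resident memory (RES column, G) per mysqld process from the top output.
resident_gb = {
    "s1": 75.4, "s2": 43.2, "s3": 71.0, "s4": 75.6,
    "s5": 42.9, "s6": 32.4, "s7": 54.0, "s8": 74.6,
}


def overhead(section):
    """Resident memory beyond the configured buffer pool, in G."""
    return round(resident_gb[section] - buffer_pool_gb[section], 1)


for section in sorted(buffer_pool_gb):
    print(f"{section}: pool {buffer_pool_gb[section]}G, "
          f"resident {resident_gb[section]}G, overhead {overhead(section)}G")

print("total pool:", sum(buffer_pool_gb.values()), "G")
print("total resident:", round(sum(resident_gb.values()), 1), "G")
```

Every section sits above its buffer pool size, so the 410G total of the pools is a lower bound on what the eight mysqld processes will actually use rather than a ceiling.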

@razzi the host still has notifications disabled, so if mariadb goes down we don't get any alert. We should fix it before closing.

@elukey Good point. With regards to the memory warning, we can:

  • lower the memory allocated to the sections, but this would negatively impact performance
  • raise the threshold of memory alerting (but it's already warning at 90% and critical at 95%, so we don't have much room)
  • something else?
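
To see how little room the current thresholds leave, here is a quick Python sketch. It assumes the check compares used memory against total RAM as a plain percentage with inclusive thresholds; the real check's implementation is not shown in this task, and the function name `classify` is made up.

```python
TOTAL_RAM_GB = 503   # machine RAM quoted in this thread
WARNING_PCT = 90     # thresholds quoted in this thread
CRITICAL_PCT = 95


def classify(used_gb, total_gb=TOTAL_RAM_GB):
    """Return (percentage, state) for a given amount of used memory."""
    pct = 100 * used_gb / total_gb
    if pct >= CRITICAL_PCT:
        state = "CRITICAL"
    elif pct >= WARNING_PCT:
        state = "WARNING"
    else:
        state = "OK"
    return round(pct, 1), state


# 469.1 is the sum of the RES column in the top output earlier in this task;
# 410 is the sum of the configured buffer pools.
print(classify(469.1))
print(classify(410))
```

With these figures the warning line sits at roughly 453G and the critical line at roughly 478G, so the observed resident total already lands in the warning band while the buffer pools alone would not.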

We have the following in hiera:

# clouddb1021
profile::base::notifications: disabled
[..]

That needs to be removed :)

Change 677977 had a related patch set uploaded (by Razzi; author: Razzi):

[operations/puppet@production] clouddb: enable alerting for clouddb1021

https://gerrit.wikimedia.org/r/677977