Productionize pc1015, pc1016, pc2015 and pc2016
Closed, Resolved (Public)

Description

  • pc1015 spare
  • pc1016
  • pc2015 spare
  • pc2016

Event Timeline

This needs to wait for the eqiad hosts to be ready as well.

Marostegui renamed this task from Productionize pc2015 and pc2016 to Productionize pc1015, pc1016, pc2015 and pc2016. (Aug 3 2023, 7:07 AM)
Marostegui changed the task status from Open to Stalled.
Marostegui triaged this task as Medium priority.
Marostegui moved this task from Triage to Blocked on the DBA board.
Marostegui updated the task description.

I expanded the LVs (logical volumes) on the codfw hosts, as they are already installed.

Marostegui changed the task status from Stalled to Open. (Sep 27 2023, 8:44 AM)
Marostegui added subscribers: ABran-WMF, Ladsgroup.

@Ladsgroup @ABran-WMF this can now go

Btw I just ran this:

[08:45:24] marostegui@cumin1001:~$  sudo cumin 'pc[1015-1016]'.eqiad.wmnet 'lvextend -L+1T /dev/mapper/tank-data ; xfs_growfs /srv'
2 hosts will be targeted:
pc[1015-1016].eqiad.wmnet
OK to proceed on 2 hosts? Enter the number of affected hosts to confirm or "q" to quit: 2
===== NODE GROUP =====
(2) pc[1015-1016].eqiad.wmnet
----- OUTPUT of 'lvextend -L+1T /... xfs_growfs /srv' -----
  Size of logical volume tank/data changed from <7.56 TiB (1981022 extents) to <8.56 TiB (2243166 extents).
  Logical volume tank/data successfully resized.
meta-data=/dev/mapper/tank-data  isize=512    agcount=32, agsize=63392704 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0
data     =                       bsize=4096   blocks=2028566528, imaxpct=5
         =                       sunit=64     swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 2028566528 to 2297001984
================
PASS |██████████████████████████████████████████████████████████████████████████████████████████| 100% (2/2) [00:00<00:00,  2.93hosts/s]
FAIL |                                                                                                  |   0% (0/2) [00:00<?, ?hosts/s]
100.0% (2/2) success ratio (>= 100.0% threshold) for command: 'lvextend -L+1T /... xfs_growfs /srv'.
100.0% (2/2) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
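
As a sanity check on the output above, the extent and block counts can be compared against the requested `-L+1T`. This is only a sketch: the 4 MiB extent size is an assumption (the LVM default; it is not printed in the output), while the 4096-byte block size comes from `bsize=4096` in the `xfs_growfs` output. The function names are made up for illustration.

```python
# Figures copied from the lvextend / xfs_growfs output above.
EXTENTS_BEFORE, EXTENTS_AFTER = 1981022, 2243166
BLOCKS_BEFORE, BLOCKS_AFTER = 2028566528, 2297001984

XFS_BLOCK_BYTES = 4096        # "bsize=4096" in the xfs_growfs output
EXTENT_BYTES = 4 * 1024**2    # ASSUMPTION: LVM default extent size of 4 MiB
TIB = 1024**4


def lv_growth_bytes() -> int:
    """Bytes added to the logical volume, derived from the extent counts."""
    return (EXTENTS_AFTER - EXTENTS_BEFORE) * EXTENT_BYTES


def xfs_growth_bytes() -> int:
    """Bytes added to the filesystem, derived from the data block counts."""
    return (BLOCKS_AFTER - BLOCKS_BEFORE) * XFS_BLOCK_BYTES


if __name__ == "__main__":
    print(f"LV grew by  {lv_growth_bytes() / TIB} TiB")
    print(f"XFS grew by {xfs_growth_bytes() / TIB} TiB")
    print(f"New filesystem size: {BLOCKS_AFTER * XFS_BLOCK_BYTES / TIB:.3f} TiB")
```

Under that assumption both figures come out to exactly 1 TiB, and the new filesystem size matches the new logical volume size, so the filesystem fills the extended volume.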

Can we please go for MariaDB 10.6 on those hosts?
Adding mariadb::package: 'wmf-mariadb106' on their yaml files should be enough (pc1015.yaml etc)

I can take care of the mw part, but I have never productionized a host before and have no clue how it's done :( Any docs?

Can we please go for MariaDB 10.6 on those hosts?
Adding mariadb::package: 'wmf-mariadb106' on their yaml files should be enough (pc1015.yaml etc)

Do we have any other PC host on 10.6? It would make me much more comfortable to do so. Otherwise we can make the new spares 10.6 at least.

pc1 has been entirely running 10.6 for months now

Then sure, let's set all of them to 10.6

I can take care of the mw part, but I have never productionized a host before and have no clue how it's done :( Any docs?

No problem - I will do it with @ABran-WMF

Change 966329 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Productionize pc1016, pc2016

https://gerrit.wikimedia.org/r/966329

Change 966329 merged by Marostegui:

[operations/puppet@production] mariadb: Productionize pc1016, pc2016

https://gerrit.wikimedia.org/r/966329

I can take care of the mw part, but I have never productionized a host before and have no clue how it's done :( Any docs?

No problem - I will do it with @ABran-WMF

Thank you. You're the best.

This is all done.
The current setup is:

[Image: pc4.png]

Masters: pc2016 (primary), pc1016 (dc master). It will be the other way around once we are back in eqiad
Floating spares: pc2015, pc1015

@Ladsgroup notifications are disabled for all hosts, please enable them once they are ready to go live.
Grants have been replicated from the current parsercache clusters.
All hosts are available on prometheus as well.

Marostegui updated the task description.

oh awesome. I'll create the dbs and tables and then bring it online without bringing down everything (fingers crossed). Early next week.

One quick note: I think the other PCs were co-masters between eqiad and codfw, but that is not the case anymore.

I was wondering about that too when I set it up, I will make all of them co-masters now then!