
create mwdebug1003 - ganeti VM with buster and appserver role
Closed, Resolved (Public)

Description

We agreed that we want a new Ganeti VM on buster that uses the MW appserver role, similar to the previous testvm1001 but
named as a proper "mwdebug" server.

  • request VM, define needed CPU/RAM/disk (done in T268044, 4 CPU, 4GB RAM, 50 GB disk, copying existing mwdebug machines)
  • create VM (with insetup role)
  • add to DHCP, install the OS
  • create mcrouter cert (https://wikitech.wikimedia.org/wiki/Memcached_for_MediaWiki/mcrouter#Generate_certs_for_a_new_host)
  • apply the canary_appserver puppet role to it
  • add to conftool data under "testserver" section
  • ensure PHP72 APT component gets installed and PHP packages are installed
  • sync puppet compiler facts and add fake secrets in labs/private, confirm puppet changes can be compiled on mwdebug1003
  • check and list remaining puppet errors and missing packages
  • add in the WikimediaDebug extension (the extension repo itself and matching change in trafficserver LUA for x-debug header routing)

Work on this will start mid-November, from Nov 16 on.

Event Timeline

Dzahn triaged this task as High priority. Nov 17 2020, 5:22 PM
Dzahn updated the task description.

Just did a knowledge transfer session with Hugh, and we created the VM together as part of that, as we talked about in our meeting 2 weeks ago.

The VM now exists in site.pp with the "insetup" role. Tomorrow we will have a second session like that, where we move it to the MW appserver role. For that we
first need to create the mcrouter cert, etc.

Change 641756 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: add canary appserver role on mwdebug1003

https://gerrit.wikimedia.org/r/641756

Change 641757 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] add mwdebug1003 to conftool-data

https://gerrit.wikimedia.org/r/641757

Change 641759 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] trafficserver: add mwdebug1003 to x-wikimedia-debug-routing map

https://gerrit.wikimedia.org/r/641759

Change 641760 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] DHCP: add mwdebug1003

https://gerrit.wikimedia.org/r/641760

Change 641760 merged by Dzahn:
[operations/puppet@production] DHCP: add mwdebug1003

https://gerrit.wikimedia.org/r/641760

Change 641802 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[labs/private@master] add fake certificates for mwdebug1003

https://gerrit.wikimedia.org/r/641802

Change 641802 merged by Dzahn:
[labs/private@master] add fake certificates for mwdebug1003

https://gerrit.wikimedia.org/r/641802

Change 641756 merged by Dzahn:
[operations/puppet@production] site: add canary appserver role on mwdebug1003

https://gerrit.wikimedia.org/r/641756

Change 641835 had a related patch set uploaded (by Effie Mouzeli; owner: Effie Mouzeli):
[performance/WikimediaDebug@master] background.js: add mwdebug1003 in the list of servers

https://gerrit.wikimedia.org/r/641835

Change 641757 merged by Dzahn:
[operations/puppet@production] add mwdebug1003 to conftool-data

https://gerrit.wikimedia.org/r/641757

[cumin1001:~] $ sudo -i confctl select name=mwdebug1003.eqiad.wmnet get
{"mwdebug1003.eqiad.wmnet": {"weight": 0, "pooled": "inactive"}, "tags": "dc=eqiad,cluster=testserver,service=apache2"}
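
For anyone who wants to check this state from a script, here is a minimal sketch that parses a line shaped like the confctl output above. It assumes each line is a JSON object with exactly one host key plus a comma-separated "tags" string, which is inferred only from the sample shown here. The helper name parse_confctl_line is hypothetical and not part of conftool.

```python
import json


def parse_confctl_line(line):
    """Split one confctl-style JSON line into (host, state, tags).

    Assumption: the object holds exactly one host entry plus a "tags"
    string of comma-separated key=value pairs, as in the sample above.
    """
    record = json.loads(line)
    tags = dict(pair.split("=", 1) for pair in record.pop("tags").split(","))
    # After removing "tags", the single remaining key is the host name.
    (host, state), = record.items()
    return host, state, tags


# Sample line copied from the output above.
line = (
    '{"mwdebug1003.eqiad.wmnet": {"weight": 0, "pooled": "inactive"}, '
    '"tags": "dc=eqiad,cluster=testserver,service=apache2"}'
)
host, state, tags = parse_confctl_line(line)
print(host, state["pooled"], tags["cluster"])
```

With the sample line, this confirms that the host is registered in the "testserver" cluster and is still "inactive", i.e. not yet pooled.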

Mentioned in SAL (#wikimedia-operations) [2020-11-18T21:53:52Z] <mutante> mwdebug1003 - restarting ferm because config was generated but service not restarted due to puppet dependency errors, breaking NRPE monitoring T267248

Change 641835 merged by jenkins-bot:
[performance/WikimediaDebug@master] background.js: add mwdebug1003 in the list of servers

https://gerrit.wikimedia.org/r/641835

Change 641759 merged by Dzahn:
[operations/puppet@production] trafficserver: add mwdebug1003 to x-wikimedia-debug-routing map

https://gerrit.wikimedia.org/r/641759

Change 642567 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] mediawiki::php: allow opting-in to use the PHP72 component on buster

https://gerrit.wikimedia.org/r/642567

Change 642574 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[labs/private@master] fix the name of fake mcrouter cert for mwdebug1003

https://gerrit.wikimedia.org/r/642574

Change 642574 merged by Dzahn:
[labs/private@master] fix the name of fake mcrouter cert for mwdebug1003

https://gerrit.wikimedia.org/r/642574

Change 642567 merged by Dzahn:
[operations/puppet@production] mediawiki::php: allow opting-in to use the PHP72 component on buster

https://gerrit.wikimedia.org/r/642567

After the change above, the PHP72 APT component has been added on mwdebug1003, and puppet has now installed the following PHP 7.2 packages on buster:

[mwdebug1003:~] $ dpkg -l | grep php
ii  php-common                           2:69+0~20190215163918.14+stretch~1.gbpfa617b+wmf1 all          Common files for PHP packages
ii  php7.2-bcmath                        7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        Bcmath module for PHP
ii  php7.2-bz2                           7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        bzip2 module for PHP
ii  php7.2-common                        7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        documentation, examples and common module for PHP
ii  php7.2-dba                           7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        DBA module for PHP
ii  php7.2-gd                            7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        GD module for PHP
ii  php7.2-gmp                           7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        GMP module for PHP
ii  php7.2-mbstring                      7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        MBSTRING module for PHP
ii  php7.2-mysql                         7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        MySQL module for PHP
ii  php7.2-opcache                       7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        Zend OpCache module for PHP
ii  php7.2-xml                           7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        DOM, SimpleXML, WDDX, XML, and XSL module for PHP

Thanks to @MoritzMuehlenhoff for building them.
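
The version strings in the listing above carry "stretch" or "debian9" markers even though the host runs buster. A minimal sketch for spotting that in `dpkg -l` output follows. It assumes the usual five-column layout (status, name, version, architecture, description). The marker list and the helper names are hypothetical, and treating a marker as proof of which distribution a package was built for is only a heuristic.

```python
# Assumption: these substrings in a version string hint at a Debian 9 (stretch) build.
STRETCH_MARKERS = ("stretch", "debian9")


def parse_dpkg_line(line):
    """Parse one installed-package row of `dpkg -l`; return None for other lines."""
    parts = line.split(None, 4)
    if len(parts) < 5 or parts[0] != "ii":
        return None
    _status, name, version, arch, description = parts
    return {"name": name, "version": version, "arch": arch, "description": description}


def looks_like_stretch_build(pkg):
    """Heuristic: the version string mentions one of the stretch markers."""
    return any(marker in pkg["version"] for marker in STRETCH_MARKERS)


# Two rows copied from the listing above.
sample = """\
ii  php-common                           2:69+0~20190215163918.14+stretch~1.gbpfa617b+wmf1 all          Common files for PHP packages
ii  php7.2-gd                            7.2.31-1+0~20200514.41+debian9~1.gbpe2a56b+wmf1   amd64        GD module for PHP
"""
packages = [p for p in map(parse_dpkg_line, sample.splitlines()) if p]
flagged = [p["name"] for p in packages if looks_like_stretch_build(p)]
print(flagged)
```

Both sample rows are flagged, which matches the later entries on this task about a hardcoded stretch dist name and the packages being reinstalled.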

Dzahn updated the task description.
Dzahn updated the task description.

Change 643093 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] mediawiki::php: fix hardcoded stretch dist name for PHP 72 packages

https://gerrit.wikimedia.org/r/643093

Change 643093 merged by Dzahn:
[operations/puppet@production] mediawiki::php: fix hardcoded stretch dist name for PHP 72 packages

https://gerrit.wikimedia.org/r/643093

Mentioned in SAL (#wikimedia-operations) [2020-11-23T21:54:07Z] <mutante> mwdebug1003 - removing php packages and letting puppet reinstall them after it has the correct APT config T267248

Dzahn closed this task as Resolved. (Edited) Nov 23 2020, 9:58 PM

The VM has been created, added to the Debug extension, and added to conftool. The puppet role is applied; puppet sets up the APT component for buster and now pulls the correct PHP packages.

The few remaining issues are part of T245757.

They are:

python-pil
python-imaging

Mentioned in SAL (#wikimedia-operations) [2021-04-16T23:47:28Z] <mutante> decom'ing mwdebug1003, stretch VM created in T267248

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mwdebug1003.eqiad.wmnet

  • mwdebug1003.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox

Change 680393 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] trafficserver: remove mwdebug1003 from x-wikimedia-debug-routing

https://gerrit.wikimedia.org/r/680393

Change 680393 merged by Dzahn:

[operations/puppet@production] trafficserver: remove mwdebug1003 from x-wikimedia-debug-routing

https://gerrit.wikimedia.org/r/680393

Change 681144 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/mediawiki-config@master] remove mwdebug1003 from list of debug servers

https://gerrit.wikimedia.org/r/681144

Change 681144 merged by jenkins-bot:

[operations/mediawiki-config@master] remove mwdebug1003 from list of debug servers

https://gerrit.wikimedia.org/r/681144

Mentioned in SAL (#wikimedia-operations) [2021-04-20T15:20:46Z] <urbanecm@deploy1002> Synchronized debug.json: dc6647b9c674429c0811116e0caca7639b766e77: remove mwdebug1003 from list of debug servers (T267248) (duration: 00m 57s)

Mentioned in SAL (#wikimedia-operations) [2021-04-20T15:21:59Z] <urbanecm@deploy1002> Synchronized docroot/noc/conf/debug.json: dc6647b9c674429c0811116e0caca7639b766e77: remove mwdebug1003 from list of debug servers (T267248) (duration: 00m 58s)

Change 685907 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] site: update comment on mwdebug servers, remove mwdebug1003

https://gerrit.wikimedia.org/r/685907

Change 685907 merged by Dzahn:

[operations/puppet@production] site: update comment on mwdebug servers, remove mwdebug1003

https://gerrit.wikimedia.org/r/685907