tools-services: Migrate to Stretch
Closed, Resolved · Public

Description

Research what is running on tools-services nodes and attempt to get them running on Stretch in the toolsbeta cluster.

Puppetize all loose ends.

Task from WMCS 2018 offsite meetings.

Event Timeline

Restricted Application added a subscriber: Aklapper. · Oct 21 2018, 12:38 PM
GTirloni triaged this task as Medium priority.

From https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin#Services:

These are services that run off the service manifests of each tool; currently that is just the webservicemonitor service. They run in warm standby and require a manual switchover. tools-services-01 and tools-services-02 both run exactly the same code, but only one of them is 'active' at a time. Which one is active is determined by the puppet role param role::labs::tools::services::active_host. Set that via [[1]] to the fqdn of the host that should be 'active' and run puppet on all the services hosts. This starts the services on the active host and stops them on the others. Since services should not have any internal state, they can run from any host, and there is no need to switch back afterwards.

Bigbrother also runs on this host, via upstart. The log file can be found in /var/log/upstart/bigbrother.log.

Relevant files in Puppet:

  • modules/role/manifests/toollabs/services.pp
  • modules/role/manifests/aptly/server.pp
  • modules/toollabs/manifests/services.pp
  • modules/toollabs/manifests/bigbrother.pp
  • modules/toollabs/manifests/updatetools.pp

Part of the Puppet configuration comes from OpenStack through the use of a prefix for tools-services hosts:

Role: role::toollabs::services
Parameters: 
  - active_host: 'tools-services-01.tools.eqiad.wmflabs'
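
The warm-standby rule quoted above can be sketched in a few lines of Python. This is only an illustration of the decision that Puppet derives from active_host, not the actual Puppet code: plan_services and ServicePlan are hypothetical names, and the tools-services-02 FQDN is an assumption made by analogy with the -01 FQDN in the parameters above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServicePlan:
    """Desired service state for one services host (hypothetical type)."""
    host: str
    ensure: str  # "running" on the active host, "stopped" on standbys


def plan_services(hosts: List[str], active_host: str) -> List[ServicePlan]:
    """Return the desired state per host: running only on active_host."""
    if active_host not in hosts:
        # A typo in active_host would otherwise stop the services everywhere.
        raise ValueError(f"active_host {active_host!r} is not a known services host")
    return [
        ServicePlan(host=h, ensure="running" if h == active_host else "stopped")
        for h in hosts
    ]


# FQDN of -01 is taken from the task; the -02 FQDN is assumed.
HOSTS = [
    "tools-services-01.tools.eqiad.wmflabs",
    "tools-services-02.tools.eqiad.wmflabs",
]

for plan in plan_services(HOSTS, "tools-services-01.tools.eqiad.wmflabs"):
    print(f"{plan.host}: {plan.ensure}")
```

Because the services are meant to be stateless, switching over amounts to changing the active_host value and re-running the plan on every host; nothing needs to be copied between hosts.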

@aborrero I've created toolsbeta-services-01 to serve as a testbed for trying this on Stretch.

Some points as discussed on IRC:

  • We can't really run the updatetools daemon in toolsbeta, since it may conflict with the main one running on the actual Toolforge.
  • The puppet path for the main role we will be using is modules/role/manifests/toolforge/services.pp
  • The services nodes contain the aptly repo with .deb packages. Those would need to be migrated or rebuilt for Stretch (by hand, since the content of the repo is not in puppet).
  • We will be using https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Services (new page) to document all the new work, aiming to reduce the load on the main Toolforge admin docs.

Possible next steps:

  • document what the updatetools daemon is doing (probably directly in the new Services admin subpage)
  • create initial puppet code in the new namespace (toolforge vs toollabs) and apply role to toolsbeta-services-01 via horizon
  • see what we can do with the aptly repo content

Change 469614 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: bootstrap service node puppet code

https://gerrit.wikimedia.org/r/469614

Change 469614 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: refactor/bootstrap service node puppet code

https://gerrit.wikimedia.org/r/469614

Change 470386 had a related patch set uploaded (by GTirloni; owner: GTirloni):
[operations/puppet@production] tools-services: Fix typo in updatetools service exec path

https://gerrit.wikimedia.org/r/470386

Change 470386 merged by GTirloni:
[operations/puppet@production] tools-services: Fix typo in updatetools service exec path

https://gerrit.wikimedia.org/r/470386

Change 470397 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: add missing grid base profile to services role

https://gerrit.wikimedia.org/r/470397

Change 470397 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: add missing grid base profile to services role

https://gerrit.wikimedia.org/r/470397

Bstorm added a subscriber: Bstorm. · Oct 30 2018, 8:23 PM

Removing bigbrother from the next iteration of services would make sense. I might see if it is possible to put it on a bastion for now, but an alternative is to simply communicate to these folks that we are dropping it: P7717
There are a couple of big names in there.

Change 470683 had a related patch set uploaded (by GTirloni; owner: GTirloni):
[operations/puppet@production] tools-services: Add updatetools_enabled key

https://gerrit.wikimedia.org/r/470683

Change 470683 merged by GTirloni:
[operations/puppet@production] tools-services: Add updatetools_enabled key

https://gerrit.wikimedia.org/r/470683

GTirloni renamed this task from tools-service: Document current services and try them on Stretch to tools-service: Migrate to Stretch. · Nov 23 2018, 12:01 PM
GTirloni renamed this task from tools-service: Migrate to Stretch to tools-services: Migrate to Stretch.
GTirloni updated the task description.

@Bstorm @aborrero At some point I lost track of this work and what has already been done. Is there anything left to do in this task now that the new grid is in place?