
Deployment-prep hosts with puppet errors (tracking)
Closed, Resolved (Public)

Description

This is a Tracking-Neverending task for any puppet errors we might encounter on the beta cluster infrastructure. Puppet errors typically occur when Puppet changes are made that do not take beta into account (typically, a class being renamed but not updated in wikitech, hiera parameters missing ...).

Related Objects

Status     Assigned      Task
Resolved   EddieGP
Resolved   ArielGlenn
Resolved   Krenair
Resolved   None
Resolved   None
Resolved   mobrovac
Resolved   mmodell
Resolved   mmodell
Resolved   hashar
Declined   None
Invalid    mmodell
Resolved   Eevans
Duplicate  None
Resolved   hashar
Resolved   Krenair
Duplicate  None
Declined   None
Resolved   bd808
Resolved   thcipriani
Resolved   None
Resolved   None
Resolved   None
Resolved   None
Resolved   Ottomata
Resolved   Ladsgroup
Resolved   Eevans
Resolved   AlexMonk-WMF
Resolved   akosiaris
Resolved   hashar
Resolved   hashar
Resolved   Ottomata
Resolved   Krenair
Resolved   Krenair
Declined   hashar
Resolved   mmodell
Resolved   None
Resolved   Ottomata
Resolved   Krenair
Resolved   Ottomata
Resolved   fgiunchedi
Resolved   Krenair
Resolved   mobrovac
Resolved   ayounsi
Duplicate  None
Resolved   Krenair
Open       None
Resolved   fgiunchedi
Resolved   mobrovac
Resolved   Ladsgroup
Resolved   EddieGP
Resolved   MarcoAurelio
Resolved   Krenair
Duplicate  None
Resolved   None
Resolved   elukey
Resolved   None

Event Timeline

hashar added a subscriber: hashar.

Puppet failures related to Ganglia would be due to T134808, which has fixes cherry-picked on the beta puppet master, but they do not cover every case.

hashar triaged this task as Normal priority. Jun 30 2016, 10:33 AM
hashar updated the task description. Sep 12 2016, 8:48 AM
demon added a subscriber: demon. Jan 5 2018, 11:23 PM

Is this really best as a tracking task or should we add it to the deployment-prep workboard column? The task by its nature is always gonna be open (or reopened).

It's fine with me if you want to move them all to a particular workboard column instead of a tracking task

-snapshot01 is T184270 (package it wants is missing from stretch, moritz to fix when higher priority things are done)

Andrew added a subscriber: Andrew. Mar 23 2018, 8:02 PM

As of today there are 16 shinken alerts (most puppet but at least one disk warning) on this project, and three VMs that are shut down but not deleted. All of this is viewable here: http://shinken.wmflabs.org/problems?search=deployment

deployment-videoscaler01 seems to no longer exist?

$ ssh -a deployment-videoscaler01
channel 0: open failed: connect failed: No route to host / stdio forwarding failed
ssh_exchange_identification: Connection closed by remote host

@Andrew et al. Some docs on Wikitech about the usual puppet errors and how to fix them would IMHO help. I feel some of us who have access to deployment-prep could help if we had some guidance. Also, IRC assistance would be great, if at all possible. Thanks.

hashar removed a subscriber: hashar. Mar 24 2018, 10:57 PM
EddieGP closed this task as Resolved. Mar 31 2018, 12:54 PM
EddieGP claimed this task.
EddieGP added a subscriber: EddieGP.
In T132259#3879429, demon wrote:

Is this really best as a tracking task or should we add it to the deployment-prep workboard column? The task by its nature is always gonna be open (or reopened).

In T132259#3879432, Krenair wrote:

It's fine with me if you want to move them all to a particular workboard column instead of a tracking task

I've now created a "Puppet errors" column on the Beta-Cluster-Infrastructure workboard and moved all open subtasks to that column.