puppet failure on deployment-phab01: Service[ssh-phab] refuses to start
Closed, Resolved · Public

Description

Notice: /Stage[main]/Phabricator::Vcs/Base::Service_unit[ssh-phab]/Service[ssh-phab]/ensure: ensure changed 'stopped' to 'running'

From journalctl:

May 05 14:48:27 deployment-phab01 sshd[30653]: error: Bind to port 22 on 10.68.18.216 failed: Address already in use.
May 05 14:48:27 deployment-phab01 sshd[30653]: fatal: Cannot bind any address.

It seems ssh-phab is meant to listen on a secondary IP address and might rely on LVS. Either way, that does not work on labs.

The task was originally filed for some puppet failures:

Both deployment-phab01 and deployment-phab02 fail with:

Error: Could not retrieve catalog from remote server: Error 400 on SERVER:
[{"environment"=>"www", "owner"=>"root", "group"=>"www-data", "phab_settings"=>{"mysql.user"=>"root", "mysql.pass"=>"labspass"}}, {"environment"=>"phd", "owner"=>"root", "group"=>"phd", "phab_settings"=>{"mysql.user"=>"root", "mysql.pass"=>"labspass"}}] is not a Hash.
It looks to be a Array at /etc/puppet/modules/phabricator/manifests/init.pp:68 on node deployment-phab02.deployment-prep.eqiad.wmflabs

The reason is that modules/role/manifests/phabricator/labs.pp passes an array instead of a hash of (name => config hash).
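To make the shape mismatch concrete, here is a minimal Python sketch (not the Puppet fix from the Gerrit change, which is not reproduced in this task). It assumes that the "name" in (name => config hash) is the `environment` value seen in the error message; `to_named_hash` is a hypothetical helper, and the `phab_settings` entries are left out for brevity.

```python
# Shape taken from the error message: an array (list) of per-environment configs.
as_array = [
    {"environment": "www", "owner": "root", "group": "www-data"},
    {"environment": "phd", "owner": "root", "group": "phd"},
]


def to_named_hash(entries, key="environment"):
    """Hypothetical helper: turn a list of configs into {name: config}.

    Assumption: the value stored under `key` is the name to index by.
    """
    return {
        entry[key]: {k: v for k, v in entry.items() if k != key}
        for entry in entries
    }


# A hash keyed by name, which is the shape the class is said to expect.
as_hash = to_named_hash(as_array)
print(as_hash)
```

The point is only that the consumer iterates over name/config pairs, so a bare list fails the "is not a Hash" check even though it carries the same settings.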

Used to fail with:

Error: /Stage[main]/Phabricator::Vcs/File[/etc/systemd/system/ssh-phab.service]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/phabricator/sshd-phab.service
Restricted Application added a subscriber: Aklapper. Oct 10 2016, 9:44 PM
hashar edited subscribers, added: mmodell, hashar; removed: 20after4. Oct 11 2016, 7:01 AM

@mmodell what are those deployment-phab01 and deployment-phab02 instances? The names suggest they host Phabricator, and I would rather not have them in the deployment-prep project.

@hashar: They are not for hosting phabricator per se, but rather for testing scap deployment of phabricator.

hashar renamed this task from puppet failure on deployment-phab0[12] due to missing expected puppet:///modules/phabricator/sshd-phab.service to puppet failure on deployment-phab01 ... is not a Hash. It looks to be a Array at /etc/puppet/modules/phabricator/manifests/init.pp:68. Dec 12 2016, 9:33 AM
hashar updated the task description.

Change 326401 had a related patch set uploaded (by Hashar):
phabricator: fix passing config on labs

https://gerrit.wikimedia.org/r/326401

We should probably switch these hosts to use the main phabricator class, as it now works on labs and is better maintained than the labs class.

deployment-phab01 is working. I just added requisite stuff to the hiera config and deployed via scap from deployment-tin.

hashar removed hashar as the assignee of this task. Dec 14 2016, 9:25 PM

deployment-phab01 got fixed :]

deployment-phab02 fails with a different error now which is related to not being able to upgrade mariadb :(

I had to upgrade mariadb manually on phab01

Perhaps you would be able to help with T153319

Mentioned in SAL (#wikimedia-releng) [2016-12-15T16:08:19Z] <hashar> deployment-phab02 : apt-get upgrade T147818

The mariadb package is broken:

# apt-get install mariadb-client
The following packages have unmet dependencies:
 mariadb-client : Depends: mariadb-client-10.0 (>= 10.0.28-0+deb8u1) but it is not going to be installed

In puppet we install mariadb-client, which is current:

# apt-cache madison mariadb-client*

libmariadb-client-lgpl-dev | 2.0.0-1 | http://httpredir.debian.org/debian/ jessie/main amd64 Packages
mariadb-client | 10.0.28-0+deb8u1 | http://security.debian.org/ jessie/updates/main amd64 Packages
mariadb-client | 10.0.27-0+deb8u1 | http://httpredir.debian.org/debian/ jessie/main amd64 Packages
mariadb-client-core-10.0 | 10.0.28-0+deb8u1 | http://security.debian.org/ jessie/updates/main amd64 Packages
mariadb-client-core-10.0 | 10.0.27-0+deb8u1 | http://httpredir.debian.org/debian/ jessie/main amd64 Packages
libmariadb-client-lgpl-dev-compat | 2.0.0-1 | http://httpredir.debian.org/debian/ jessie/main amd64 Packages
mariadb-client-10.0 | 10.0.28-0+deb8u1 | http://security.debian.org/ jessie/updates/main amd64 Packages
mariadb-client-10.0 | 10.0.27-0+deb8u1 | http://httpredir.debian.org/debian/ jessie/main amd64 Packages

No idea why apt-get install ends up with:

mariadb-client : Depends: mariadb-client-10.0 (>= 10.0.28-0+deb8u1) but it is not going to be installed

apt-cache policy shows both have a candidate version of 10.0.28-0+deb8u1.

Change 326401 merged by Dzahn:
phabricator: fix passing config on labs

https://gerrit.wikimedia.org/r/326401

It won't install mariadb-client-10.0 because that package conflicts with a package that is installed already - mysql-client-core-5.5

Paladox added a comment. Edited Dec 17 2016, 7:09 PM

You could apt-get --purge remove mysql-client-core-5.5 as it is not the server.

Then you can install mariadb-client.

Or just save a backup of the SQL database, then uninstall the whole of MySQL (remember the password though), then install mariadb (creating the password you had previously for MySQL).

Anyways labs phabricator class is being removed in https://gerrit.wikimedia.org/r/327690 so these instances need migrating to the main phabricator class.

I have unbroken deployment-phab01 as part of T153319#3234669. It seems that setting phabricator_cluster_search: {} in hiera was sufficient.

And the original cause of this task is fixed apparently.

Puppet passes, but the ssh-phab service refuses to start:

Notice: /Stage[main]/Phabricator::Vcs/Base::Service_unit[ssh-phab]/Service[ssh-phab]/ensure: ensure changed 'stopped' to 'running'
hashar renamed this task from puppet failure on deployment-phab01 ... is not a Hash. It looks to be a Array at /etc/puppet/modules/phabricator/manifests/init.pp:68 to puppet failure on deployment-phab01: Service[ssh-phab] refuses to start. May 5 2017, 2:49 PM
hashar updated the task description.
hashar updated the task description.

Probably because it needs a different IP, as it can't bind to the same port 22 on the same IP.
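The "Address already in use" failure can be reproduced outside sshd. The following is a small Python sketch, assuming a loopback address and a kernel-assigned port instead of port 22 on the instance IP; `try_second_bind` is a hypothetical name used only for illustration. It binds one listening socket and then tries to bind a second socket to the exact same address and port.

```python
import errno
import socket


def try_second_bind(host="127.0.0.1"):
    """Bind one listener, then try to bind a second socket to the same address.

    Returns the errno of the second bind, or None if it unexpectedly succeeded.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))  # port 0: let the kernel pick a free port
        first.listen(1)
        addr = first.getsockname()  # the (ip, port) pair now in use
        try:
            second.bind(addr)  # same IP and same port as the first listener
            return None
        except OSError as exc:
            return exc.errno
    finally:
        second.close()
        first.close()


result = try_second_bind()
print(result == errno.EADDRINUSE)
```

This is the same situation as the journalctl lines in the description: a second daemon asking for an IP and port pair that already has a listener. Giving the second listener its own IP address (or its own port) avoids the clash.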

I think it's ok for ssh-phab to fail in beta.

mmodell closed this task as Resolved. May 8 2017, 11:11 PM
mmodell claimed this task.
mmodell reassigned this task from mmodell to hashar.