Create a poolcounter instance in deployment-prep
Closed, Resolved (Public)

Description

To test changes to the poolcounter code, we need a poolcounter instance in deployment-prep.

Event Timeline

Joe claimed this task.
Joe raised the priority of this task from to High.
Joe updated the task description.
Joe added subscribers: gerritbot, Joe, tstarling and 4 others.

I cannot get the new instance to communicate with the deployment-prep puppetmaster. @Andrew, any help would be appreciated.

Joe added a subscriber: Andrew.
root@deployment-poolcounter01:/var/lib/puppet# ping deployment-puppetmaster
PING deployment-puppetmaster.deployment-prep.eqiad.wmflabs (10.68.16.63) 56(84) bytes of data.
64 bytes from deployment-puppetmaster.deployment-prep.eqiad.wmflabs (10.68.16.63): icmp_req=1 ttl=64 time=0.353 ms
64 bytes from deployment-puppetmaster.deployment-prep.eqiad.wmflabs (10.68.16.63): icmp_req=2 ttl=64 time=0.369 ms
64 bytes from deployment-puppetmaster.deployment-prep.eqiad.wmflabs (10.68.16.63): icmp_req=3 ttl=64 time=1.68 ms
64 bytes from deployment-puppetmaster.deployment-prep.eqiad.wmflabs (10.68.16.63): icmp_req=4 ttl=64 time=0.516 ms
64 bytes from deployment-puppetmaster.deployment-prep.eqiad.wmflabs (10.68.16.63): icmp_req=5 ttl=64 time=0.309 ms
^C
--- deployment-puppetmaster.deployment-prep.eqiad.wmflabs ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3999ms
rtt min/avg/max/mdev = 0.309/0.646/1.686/0.525 ms
root@deployment-poolcounter01:/var/lib/puppet# telnet deployment-puppetmaster 8140
Trying 10.68.16.63...
telnet: Unable to connect to remote host: Connection timed out

I see this problem and can reproduce it on another instance. No idea as to the cause yet.
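The symptom in the transcript (ICMP replies arrive, but the TCP connection to port 8140 times out) can be checked from any instance without telnet. The following is only a minimal sketch: the helper name `tcp_port_open` and the 5-second default timeout are assumptions made for illustration, not anything used in this task.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests the TCP handshake
    and says nothing about whether the puppetmaster service is healthy.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name-resolution failures.
        return False
```

Run on the affected instance, `tcp_port_open("deployment-puppetmaster", 8140)` would be expected to return False while the problem is present, matching the telnet timeout above.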

This appears to be yet another issue with the nova rolling-upgrade process.

The new instance, deployment-poolcounter01, was running on labvirt1004, one of the nodes I had upgraded to Kilo. The puppetmaster was on labvirt1007, which was still running Juno. I have just upgraded labvirt1007 to Kilo, and the telnet command started to work.

The network controller is also running Kilo.

So, presumably, something in the handshake between nova-network on Kilo and nova-compute on Juno is buggy. I'll upgrade the remaining virt nodes shortly, and then this issue should stop appearing.

Certificate signed, and puppet ran successfully on deployment-poolcounter01.deployment-prep.eqiad.wmflabs.