
Rebuilding instances via Horizon gets stuck in an endless loop of collecting puppet agent stats
Open, Needs Triage, Public

Description

When I rebuild an instance via Horizon, it often seems to get stuck in a permanent loop of collecting puppet agent stats (see the log output below). Attempting to ssh in while this is happening fails with "Connection closed by UNKNOWN port 65535". I usually give up after a while, delete the instance, and create a new one.

An example from a recent attempt to rebuild the small-druid-test.recommendation.api instance:

...
2022-06-22T10:12:13.201931+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T10:13:01.694910+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T10:13:01.804056+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T10:14:13.087827+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T10:14:13.193327+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T10:14:55.891826+00:00 small-druid-test systemd[1]: Started Update Debian version stat exported by node_exporter.
2022-06-22T10:14:55.911512+00:00 small-druid-test systemd[1]: prometheus-debian-version-textfile.service: Succeeded.
2022-06-22T10:15:13.087989+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T10:15:13.091305+00:00 small-druid-test systemd[1]: Started Regular job to collect active shell session information.
2022-06-22T10:15:13.110898+00:00 small-druid-test systemd[1]: prometheus_ssh_open_sessions.service: Succeeded.
2022-06-22T10:15:13.190103+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T10:16:01.772286+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T10:16:01.875666+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T10:17:13.088205+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T10:17:13.207519+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T10:18:13.088090+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T10:18:13.199615+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T10:19:01.847062+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T10:19:01.952856+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T10:19:58.350693+00:00 small-druid-test systemd[1]: Started Update Debian version stat exported by node_exporter.
2022-06-22T10:19:58.370794+00:00 small-druid-test systemd[1]: prometheus-debian-version-textfile.service: Succeeded.
2022-06-22T10:20:13.088230+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T10:20:13.091634+00:00 small-druid-test systemd[1]: Started Regular job to collect active shell session information.
2022-06-22T10:20:13.113213+00:00 small-druid-test systemd[1]: prometheus_ssh_open_sessions.service: Succeeded.
2022-06-22T10:20:13.192828+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
... [many similar lines cut out]
2022-06-22T15:09:13.089722+00:00 small-druid-test systemd[1]: Starting Daily apt download activities...
2022-06-22T15:09:13.091516+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:09:13.229706+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:09:13.803316+00:00 small-druid-test systemd[1]: apt-daily.service: Succeeded.
2022-06-22T15:09:13.803877+00:00 small-druid-test systemd[1]: Finished Daily apt download activities.
2022-06-22T15:10:09.203028+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:10:09.206792+00:00 small-druid-test systemd[1]: Started Regular job to collect active shell session information.
2022-06-22T15:10:09.226572+00:00 small-druid-test systemd[1]: prometheus_ssh_open_sessions.service: Succeeded.
2022-06-22T15:10:09.312891+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:11:13.088632+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:11:13.193478+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:12:05.893117+00:00 small-druid-test systemd[1]: Started Update Debian version stat exported by node_exporter.
2022-06-22T15:12:05.896246+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:12:05.911502+00:00 small-druid-test systemd[1]: prometheus-debian-version-textfile.service: Succeeded.
2022-06-22T15:12:06.006119+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:13:09.278093+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:13:09.387053+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:14:13.090900+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:14:13.203000+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:15:13.089446+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:15:13.094404+00:00 small-druid-test systemd[1]: Started Regular job to collect active shell session information.
2022-06-22T15:15:13.116515+00:00 small-druid-test systemd[1]: prometheus_ssh_open_sessions.service: Succeeded.
2022-06-22T15:15:13.193358+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:16:09.355908+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:16:09.482834+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:17:13.088418+00:00 small-druid-test systemd[1]: Started Update Debian version stat exported by node_exporter.
2022-06-22T15:17:13.094169+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:17:13.106494+00:00 small-druid-test systemd[1]: prometheus-debian-version-textfile.service: Succeeded.
2022-06-22T15:17:13.198517+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:18:13.088610+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:18:13.201817+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:19:09.433719+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:19:09.543068+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:20:13.091072+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:20:13.094582+00:00 small-druid-test systemd[1]: Started Regular job to collect active shell session information.
2022-06-22T15:20:13.114254+00:00 small-druid-test systemd[1]: prometheus_ssh_open_sessions.service: Succeeded.
2022-06-22T15:20:13.197998+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:21:13.089673+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:21:13.191167+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:22:09.506071+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:22:09.618133+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:22:15.892891+00:00 small-druid-test systemd[1]: Started Update Debian version stat exported by node_exporter.
2022-06-22T15:22:15.917714+00:00 small-druid-test systemd[1]: prometheus-debian-version-textfile.service: Succeeded.
2022-06-22T15:23:13.088317+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:23:13.195912+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:24:13.087629+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:24:13.188778+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:25:09.578410+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:25:09.582325+00:00 small-druid-test systemd[1]: Started Regular job to collect active shell session information.
2022-06-22T15:25:09.603950+00:00 small-druid-test systemd[1]: prometheus_ssh_open_sessions.service: Succeeded.
2022-06-22T15:25:09.705790+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:26:13.087638+00:00 small-druid-test systemd[1]: Started Regular job to collect puppet agent stats.
2022-06-22T15:26:13.197102+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
2022-06-22T15:26:27.455673+00:00 small-druid-test systemd[1]: Starting SSSD PAM Service responder...
2022-06-22T15:26:27.461610+00:00 small-druid-test systemd[1]: Started SSSD PAM Service responder.
2022-06-22T15:26:27.485988+00:00 small-druid-test sssd_pam[56032]: Starting up
... [would still be running but I ended it]
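
For triage, the excerpt above can be condensed by tallying which systemd units report a result and how often, which makes it easier to spot anything other than the periodic jobs. The following is a minimal sketch, assuming the log has been saved to a local file; the function name summarize, the file name rebuild.log, and the regular expression are illustrative assumptions and not part of any existing tooling.

```python
import re
from collections import Counter

# Matches result lines such as:
# 2022-06-22T10:12:13.201931+00:00 small-druid-test systemd[1]: prometheus_puppet_agent_stats.service: Succeeded.
# "Started ..." / "Starting ..." lines are intentionally not matched.
UNIT_RESULT = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) systemd\[1\]: "
    r"(?P<unit>\S+\.service): (?P<result>.+)$"
)


def summarize(lines):
    """Count how often each systemd unit reported each result.

    Returns a Counter keyed by (unit, result).
    """
    counts = Counter()
    for line in lines:
        match = UNIT_RESULT.match(line.strip())
        if match:
            counts[(match["unit"], match["result"])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical file name; point this at wherever the log was saved.
    with open("rebuild.log", encoding="utf-8") as handle:
        for (unit, result), count in summarize(handle).most_common():
            print(f"{count:6d}  {unit}  {result}")
```

Any unit that appears with a result other than "Succeeded.", or a unit that is expected after a rebuild but never appears at all, would be a natural starting point for debugging.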

Event Timeline

I should add that I'm going to delete this instance because I need its resources, and a hard reboot did not solve the issue either. I assume the logs will be deleted along with it. In my experience, though, this has happened several times with different VMs, so if whoever looks at this can't reproduce it, let me know and I'll try again on an instance that can stay unreachable for however long debugging takes.