
puppet failures due to "Could not find class" or "Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type"
Closed, Resolved (Public)

Description

The puppet run on deployment-prep instances flaps quite often. Shinken sends mail notifications about it, and the shinken-wm IRC bot is quite spammy.

The failure is always resolved on the next run. Example:

[11:05:55] <shinken-wm> PROBLEM - Puppet run on deployment-redis01 is CRITICAL: CRITICAL: 44.44% of data above the critical threshold [0.0]
[11:11:44] <shinken-wm> RECOVERY - Puppet run on deployment-ms-fe01 is OK: OK: Less than 1.00% above the threshold [0.0]

I suspect a race condition: the puppet autorebaser updates operations/puppet.git and labs/private.git while the puppet master is trying to read the same files, so a catalog compiled mid-rebase cannot find classes or resource types that are temporarily missing.
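One way to see how the failures are distributed is to tally the missing class or resource type per host; failures scattered across many unrelated classes would be consistent with the suspected race. The following is a minimal Python sketch, assuming the agent output is saved as text in the form pasted in this task. The names `summarize` and `tally` and the regular expressions are illustrative only and are not part of any existing tooling.

```python
import re
from collections import Counter

# Colour codes as they appear in the paste: ESC rendered as "?" (an assumption
# about how the output was captured), or a real ESC byte.
ANSI = re.compile(r"(?:\x1b|\?)\[[0-9;]*m")
# A host header such as "deployment-db1.deployment-prep.eqiad.wmflabs:",
# optionally preceded by a paste line number.
HOST = re.compile(r"^\d*\s*([\w.-]+\.wmflabs):$")
# The three failure shapes seen in this task.
MISSING = re.compile(
    r"(Could not find (?:declared )?class"
    r"|Invalid resource type"
    r"|Could not find parent resource type)\s+'?([\w:]+)"
)


def summarize(log_text):
    """Map each host to the (failure kind, class or type name) it reported."""
    failures = {}
    host = None
    for raw in log_text.splitlines():
        line = ANSI.sub("", raw).strip()
        header = HOST.match(line)
        if header:
            host = header.group(1)
            continue
        hit = MISSING.search(line)
        if hit and host is not None:
            # Drop the leading "::" so "::trebuchet" and "trebuchet" count as one.
            failures[host] = (hit.group(1), hit.group(2).lstrip(":"))
    return failures


def tally(failures):
    """Count how many hosts failed on each class or resource type."""
    return Counter(name for _kind, name in failures.values())
```

Calling `tally(summarize(text)).most_common()` on the saved output lists the names that failed on the most hosts first.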

deployment-urldownloader.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::url_downloader for deployment-urldownloader.deployment-prep.eqiad.wmflabs on node deployment-urldownloader.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-sentry2.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class ::standard::mail::sender for deployment-sentry2.deployment-prep.eqiad.wmflabs on node deployment-sentry2.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-logstash2.deployment-prep.eqiad.wmflabs:
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class apt::unattendedupgrades for deployment-logstash2.deployment-prep.eqiad.wmflabs on node deployment-logstash2.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-sca01.deployment-prep.eqiad.wmflabs:
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class base::labs for deployment-sca01.deployment-prep.eqiad.wmflabs on node deployment-sca01.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-elastic06.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type motd::script at /etc/puppet/modules/system/manifests/role.pp:33 on node deployment-elastic06.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-db2.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class ::salt::minion at /etc/puppet/modules/role/manifests/salt/minions.pp:32 on node deployment-db2.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-cache-upload04.deployment-prep.eqiad.wmflabs:
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type varnish::wikimedia_vcl at /etc/puppet/modules/varnish/manifests/instance.pp:160 on node deployment-cache-upload04.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-tmh01.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class ::trebuchet for deployment-tmh01.deployment-prep.eqiad.wmflabs on node deployment-tmh01.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-ms-be02.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type apt::conf at /etc/puppet/modules/apt/manifests/unattendedupgrades.pp:11 on node deployment-ms-be02.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-imagescaler01.deployment-prep.eqiad.wmflabs:
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class base::kernel for deployment-imagescaler01.deployment-prep.eqiad.wmflabs on node deployment-imagescaler01.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-pdf02.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/physicalcorecount.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/ec2id.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/default_gateway.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/projectgid.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/lldp.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/pe_version.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/puppet_vardir.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/root_home.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/facter_dot_d.rb
    Info: Loading facts in /etc/puppet/modules/apt/lib/facter/apt.rb
    Info: Loading facts in /etc/puppet/modules/puppet_statsd/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /etc/puppet/modules/vm/lib/facter/meminbytes.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::ocg for deployment-pdf02.deployment-prep.eqiad.wmflabs on node deployment-pdf02.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-mediawiki03.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class ::apache::mod::proxy_fcgi for deployment-mediawiki03.deployment-prep.eqiad.wmflabs on node deployment-mediawiki03.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-db1.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class ssh::client for deployment-db1.deployment-prep.eqiad.wmflabs on node deployment-db1.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
mira.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class base::syslogs at /etc/puppet/modules/role/manifests/labs/instance.pp:21 on node mira.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-ms-fe01.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type sudo::group at /etc/puppet/modules/role/manifests/labs/instance.pp:10 on node deployment-ms-fe01.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-elastic05.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class labstore::traffic_shaping for deployment-elastic05.deployment-prep.eqiad.wmflabs on node deployment-elastic05.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-zotero01.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class base::syslogs at /etc/puppet/modules/role/manifests/labs/instance.pp:21 on node deployment-zotero01.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-kafka02.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class security::access for deployment-kafka02.deployment-prep.eqiad.wmflabs on node deployment-kafka02.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-ores-web.deployment-prep.eqiad.wmflabs:
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-ores-web.deployment-prep.eqiad.wmflabs on node deployment-ores-web.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-puppetmaster.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/physicalcorecount.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/initsystem.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/lldp.rb
    Info: Loading facts in /etc/puppet/modules/ganeti/lib/facter/ganeti.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/pe_version.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/root_home.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/puppet_vardir.rb
    Info: Loading facts in /etc/puppet/modules/apt/lib/facter/apt.rb
    Info: Loading facts in /etc/puppet/modules/puppet_statsd/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class labstore::traffic_shaping for deployment-puppetmaster.deployment-prep.eqiad.wmflabs on node deployment-puppetmaster.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-poolcounter01.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class labstore::traffic_shaping for deployment-poolcounter01.deployment-prep.eqiad.wmflabs on node deployment-poolcounter01.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-restbase02.deployment-prep.eqiad.wmflabs:
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-restbase02.deployment-prep.eqiad.wmflabs on node deployment-restbase02.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-mathoid.deployment-prep.eqiad.wmflabs:
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-mathoid.deployment-prep.eqiad.wmflabs on node deployment-mathoid.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-elastic08.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class security::access for deployment-elastic08.deployment-prep.eqiad.wmflabs on node deployment-elastic08.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-memc03.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-memc03.deployment-prep.eqiad.wmflabs on node deployment-memc03.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-cxserver03.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-cxserver03.deployment-prep.eqiad.wmflabs on node deployment-cxserver03.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-eventlogging03.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find parent resource type 'role::eventlogging' of type hostclass in production at /etc/puppet/manifests/role/eventlogging.pp:217 on node deployment-eventlogging03.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-memc02.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-memc02.deployment-prep.eqiad.wmflabs on node deployment-memc02.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-memc04.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find data item mediawiki::redis_servers::codfw in any Hiera data file and no default supplied at /etc/puppet/manifests/role/memcached.pp:94 on node deployment-memc04.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-redis02.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-redis02.deployment-prep.eqiad.wmflabs on node deployment-redis02.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-kafka04.deployment-prep.eqiad.wmflabs:
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-kafka04.deployment-prep.eqiad.wmflabs on node deployment-kafka04.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-mx.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /etc/puppet/modules/ganeti/lib/facter/ganeti.rb
    Info: Loading facts in /etc/puppet/modules/puppet_statsd/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /etc/puppet/modules/apt/lib/facter/apt.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/pe_version.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/root_home.rb
    Info: Loading facts in /etc/puppet/modules/stdlib/lib/facter/puppet_vardir.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/initsystem.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/ec2id.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/physicalcorecount.rb
    Info: Loading facts in /etc/puppet/modules/base/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-mx.deployment-prep.eqiad.wmflabs on node deployment-mx.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-pdf01.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-pdf01.deployment-prep.eqiad.wmflabs on node deployment-pdf01.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-mediawiki02.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class ::conftool::scripts for deployment-mediawiki02.deployment-prep.eqiad.wmflabs on node deployment-mediawiki02.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-zookeeper01.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-zookeeper01.deployment-prep.eqiad.wmflabs on node deployment-zookeeper01.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run
deployment-stream.deployment-prep.eqiad.wmflabs:
    Info: Retrieving plugin
    Info: Loading facts in /var/lib/puppet/lib/facter/initsystem.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_config_dir.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/labsprojectfrommetadata.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/physicalcorecount.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/apt.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/lldp.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/ganeti.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
    Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-stream.deployment-prep.eqiad.wmflabs on node deployment-stream.deployment-prep.eqiad.wmflabs
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run

deployment-mediawiki03.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type diamond::collector at /etc/puppet/modules/nutcracker/manifests/monitoring.pp:18 on node deployment-mediawiki03.deployment-prep.eqiad.wmflabs
deployment-restbase02.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type nrpe::check at /etc/puppet/modules/nrpe/manifests/monitor_service.pp:39 on node deployment-restbase02.deployment-prep.eqiad.wmflabs
deployment-ms-fe01.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type sysctl::conffile at /etc/puppet/modules/sysctl/manifests/parameters.pp:41 on node deployment-ms-fe01.deployment-prep.eqiad.wmflabs
mira.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type ferm::rule at /etc/puppet/modules/base/manifests/firewall.pp:43 on node mira.deployment-prep.eqiad.wmflabs

deployment-eventlogging03.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Variable failed with error RuntimeError: class role::eventlogging could not be found at /etc/puppet/manifests/role/eventlogging.pp:278 on node deployment-eventlogging03.deployment-prep.eqiad.wmflabs

deployment-mx.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class nrpe for deployment-mx.deployment-prep.eqiad.wmflabs on node deployment-mx.deployment-prep.eqiad.wmflabs
deployment-db2.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class sudo for deployment-db2.deployment-prep.eqiad.wmflabs on node deployment-db2.deployment-prep.eqiad.wmflabs
deployment-kafka04.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class base::labs for deployment-kafka04.deployment-prep.eqiad.wmflabs on node deployment-kafka04.deployment-prep.eqiad.wmflabs
deployment-cache-upload04.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class labs_lvm for deployment-cache-upload04.deployment-prep.eqiad.wmflabs on node deployment-cache-upload04.deployment-prep.eqiad.wmflabs
deployment-stream.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class base::labs for deployment-stream.deployment-prep.eqiad.wmflabs on node deployment-stream.deployment-prep.eqiad.wmflabs
deployment-parsoid05.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class labs_lvm for deployment-parsoid05.deployment-prep.eqiad.wmflabs on node deployment-parsoid05.deployment-prep.eqiad.wmflabs
deployment-cache-text04.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class conftool::scripts for deployment-cache-text04.deployment-prep.eqiad.wmflabs on node deployment-cache-text04.deployment-prep.eqiad.wmflabs
deployment-mediawiki02.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-mediawiki02.deployment-prep.eqiad.wmflabs on node deployment-mediawiki02.deployment-prep.eqiad.wmflabs
deployment-ores-web.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-ores-web.deployment-prep.eqiad.wmflabs on node deployment-ores-web.deployment-prep.eqiad.wmflabs
deployment-cxserver03.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-cxserver03.deployment-prep.eqiad.wmflabs on node deployment-cxserver03.deployment-prep.eqiad.wmflabs
deployment-elastic08.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::puppet::self for deployment-elastic08.deployment-prep.eqiad.wmflabs on node deployment-elastic08.deployment-prep.eqiad.wmflabs

deployment-ms-be01.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Parameter not defined! at /etc/puppet/modules/monitoring/manifests/service.pp:22 on node deployment-ms-be01.deployment-prep.eqiad.wmflabs
deployment-poolcounter01.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Parameter not defined! at /etc/puppet/modules/monitoring/manifests/service.pp:22 on node deployment-poolcounter01.deployment-prep.eqiad.wmflabs
deployment-db1.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Parameter not defined! at /etc/puppet/modules/monitoring/manifests/service.pp:22 on node deployment-db1.deployment-prep.eqiad.wmflabs
deployment-urldownloader.deployment-prep.eqiad.wmflabs:
    Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Parameter not defined! at /etc/puppet/modules/monitoring/manifests/service.pp:22 on node deployment-urldownloader.deployment-prep.eqiad.wmflabs
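
For triage, pastes like these can be tallied by error type to see which failures dominate. What follows is only a rough Python sketch, assuming the paste format seen in this task: a host line ending in `.wmflabs:`, an optional leading paste line number, and colour codes that may have been mangled to `?[...m`. The category labels and the function names `classify` and `summarize` are made up for illustration and are not part of any existing tool.

```python
import re
from collections import Counter

# Colour codes appear either as a real ESC byte or mangled to a literal "?".
ANSI_RE = re.compile(r"(?:\x1b|\?)\[[0-9;]*m")
# Host header, optionally prefixed by a paste line number (e.g. "357deployment-...:").
HOST_RE = re.compile(r"^\d*\s*([\w.-]+\.wmflabs):\s*$")
# The server-side message sits between "Error 400 on SERVER:" and "on node <host>".
ERROR_RE = re.compile(r"Error 400 on SERVER: (.*?) on node \S+$")


def classify(message):
    """Map a puppetmaster error message to a coarse, hypothetical category."""
    if message.startswith("Could not find class"):
        return "missing class"
    if "Invalid resource type" in message:
        return "invalid resource type"
    if "Hiera data file" in message:
        return "missing hiera data"
    return "other"


def summarize(paste):
    """Return (Counter of categories, dict of category -> list of hosts)."""
    counts = Counter()
    hosts = {}
    current = None
    for raw in paste.splitlines():
        line = ANSI_RE.sub("", raw).strip()
        host = HOST_RE.match(line)
        if host:
            current = host.group(1)
            continue
        error = ERROR_RE.search(line)
        if error and current:
            kind = classify(error.group(1))
            counts[kind] += 1
            hosts.setdefault(kind, []).append(current)
    return counts, hosts
```

Calling `summarize()` on the text of a paste returns the per-category counts and the hosts in each category. That makes it easier to separate the "Could not find class" and "Invalid resource type" failures from the ones that look legitimate.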

Event Timeline

Krenair created this task. Apr 6 2016, 4:03 PM
Andrew added a subscriber: Andrew. Apr 6 2016, 7:42 PM

I tidied up apt a bit on that instance (which shouldn't have been related) and now I see...

root@deployment-cache-parsoid05:~# puppet agent -tv
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::cache::parsoid for deployment-cache-parsoid05.deployment-prep.eqiad.wmflabs on node deployment-cache-parsoid05.deployment-prep.eqiad.wmflabs
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

That looks legit! role::cache::parsoid is checked in the instance config page and that role seems to not exist.

Andrew added a comment. Apr 6 2016, 7:43 PM

Of course the more upsetting error that Krenair saw may resurface at any moment :(

The role::cache::parsoid error likely stems from rOPUP6d215ed1573b2eee1295506bd9799993ebb3d014 where it looks like the parsoid cache servers in prod were moved to be role::cache::misc machines.

The broader issue here is that beta cluster is continuously broken by changes in puppet, as evidenced by the 37 machines shown in @Krenair's paste that don't have clean puppet runs.

This task needs to be broken down into more manageable chunks.

It's more like an ongoing battle than a task we can perform once and forget about, so maybe turn this into a tracker of some sort?

greg added a subscriber: greg. Apr 6 2016, 9:04 PM

"Things broken by puppet changes not tested in Beta Cluster"? :)

Krenair added a comment. Edited · Apr 9 2016, 4:36 PM

A lot of these are not so easily explained, however:

deployment-urldownloader no longer fails
deployment-sentry2 now fails with a different error:

Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find data item redis::shards in any Hiera data file and no default supplied at /etc/puppet/manifests/role/jobqueue_redis.pp:8 on node deployment-sentry2.deployment-prep.eqiad.wmflabs

deployment-logstash2 no longer fails
deployment-sca01 now fails with different errors:

Error: Could not set home on user[citoid]: Execution of '/usr/sbin/usermod -d /nonexistent citoid' returned 8: usermod: user citoid is currently used by process 455
Error: /Stage[main]/Citoid/Service::Node[citoid]/User[citoid]/home: change from /home/citoid to /nonexistent failed: Could not set home on user[citoid]: Execution of '/usr/sbin/usermod -d /nonexistent citoid' returned 8: usermod: user citoid is currently used by process 455
Notice: /Stage[main]/Citoid/Service::Node[citoid]/File[/var/log/citoid]: Dependency User[citoid] has failures: true
Warning: /Stage[main]/Citoid/Service::Node[citoid]/File[/var/log/citoid]: Skipping because of failed dependencies
Error: Could not set home on user[graphoid]: Execution of '/usr/sbin/usermod -d /nonexistent graphoid' returned 8: usermod: user graphoid is currently used by process 457
Error: /Stage[main]/Graphoid/Service::Node[graphoid]/User[graphoid]/home: change from /home/graphoid to /nonexistent failed: Could not set home on user[graphoid]: Execution of '/usr/sbin/usermod -d /nonexistent graphoid' returned 8: usermod: user graphoid is currently used by process 457
Notice: /Stage[main]/Graphoid/Service::Node[graphoid]/File[/var/log/graphoid]: Dependency User[graphoid] has failures: true
Warning: /Stage[main]/Graphoid/Service::Node[graphoid]/File[/var/log/graphoid]: Skipping because of failed dependencies
Notice: /Stage[main]/Citoid/Service::Node[citoid]/Base::Service_unit[citoid]/Service[citoid]: Dependency User[citoid] has failures: true
Warning: /Stage[main]/Citoid/Service::Node[citoid]/Base::Service_unit[citoid]/Service[citoid]: Skipping because of failed dependencies
Notice: /Stage[main]/Graphoid/Service::Node[graphoid]/Base::Service_unit[graphoid]/Service[graphoid]: Dependency User[graphoid] has failures: true
Warning: /Stage[main]/Graphoid/Service::Node[graphoid]/Base::Service_unit[graphoid]/Service[graphoid]: Skipping because of failed dependencies
Error: Execution of '/usr/bin/deploy-local --repo ores/deploy -D log_json:False' returned 70: http://deployment-tin.deployment-prep.eqiad.wmflabs/ores/deploy/.git

Error: /Stage[main]/Ores::Scapdeploy/Scap::Target[ores/deploy]/Package[ores/deploy]/ensure: change from absent to present failed: Execution of '/usr/bin/deploy-local --repo ores/deploy -D log_json:False' returned 70: http://deployment-tin.deployment-prep.eqiad.wmflabs/ores/deploy/.git

Notice: /Stage[main]/Ores::Scapdeploy/Ores::Config[main]/File[/srv/ores/deploy/config/99-main.yaml]: Dependency Package[ores/deploy] has failures: true
Warning: /Stage[main]/Ores::Scapdeploy/Ores::Config[main]/File[/srv/ores/deploy/config/99-main.yaml]: Skipping because of failed dependencies

I'm going to make separate tasks for the legitimate-seeming puppet failures; this one was really for the crazy ones.

Today I set up a tracking task at T132259 with blockers for each of the issues I could find in deployment-prep, and either assigned them to the person responsible or added a relevant project.

Krenair updated the task description. Apr 10 2016, 7:55 PM
Krenair added a project: Puppet.

I think it's something to do with running puppet over salt:
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class ::base for sm1.servermon.eqiad.wmflabs on node sm1.servermon.eqiad.wmflabs

Krenair renamed this task from deployment-prep puppet failures due to "Could not find class" or "Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type" to puppet failures due to "Could not find class" or "Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type". Apr 22 2016, 4:41 AM
mmodell added a subscriber: mmodell. Edited · Apr 22 2016, 4:57 AM

I'd like to open the broader discussion of accountability for breaking beta. Changes to production are obviously not being tested on beta first even though that's ostensibly the policy.

/me shrugs.

Anyone have a suggestion for the best place to have that discussion?

Sep 11 20:40:24 deployment-mathoid puppet-agent[19129]: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type sysctl::parameters at /etc/puppet/modules/base/manifests/sysctl.pp:35 on node deployment-mathoid.deployment-prep.eqiad.wmflabs
Sep 12 00:01:48 deployment-ms-fe01 puppet-agent[16051]: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class ldap::client::nss at /etc/puppet/modules/ldap/manifests/client/includes.pp:25 on node deployment-ms-fe01.deployment-prep.eqiad.wmflabs
Sep 12 02:10:46 deployment-mathoid puppet-agent[30097]: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Invalid resource type apt::conf at /etc/puppet/modules/apt/manifests/init.pp:114 on node deployment-mathoid.deployment-prep.eqiad.wmflabs
Sep 12 02:20:27 deployment-mediawiki02 puppet-agent[2478]: Could not retrieve catalog from remote server: Error 400 on SERVER: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class puppet_statsd at /etc/puppet/modules/base/manifests/puppet.pp:56 on node deployment-mediawiki02.deployment-prep.eqiad.wmflabs
Sep 12 03:10:58 deployment-cache-text04 puppet-agent[23959]: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class grub::defaults for deployment-cache-text04.deployment-prep.eqiad.wmflabs on node deployment-cache-text04.deployment-prep.eqiad.wmflabs
Joe added a subscriber: Joe. Sep 12 2016, 5:59 AM

I'd like to open the broader discussion of accountability for breaking beta. Changes to production are obviously not being tested on beta first even though that's ostensibly the policy.

/me shrugs.

Anyone have a suggestion for the best place to have that discussion?

Who ever said that production infrastructure (puppet, at least) changes should first be tested in beta, which has a completely different infrastructure/puppet code?

AFAIK no one ever said that was the policy. And if someone did, that would be pointless and a huge waste of time for anyone working on production, given how hard it is to test quite a few things in beta.

We should be accountable for how easily we make beta diverge from production: the point is beta should /NOT/ break because of a change in production that is non-breaking.

Also, I think we should get rid of the beta puppetmaster and use a specialized environment on the labs puppetmaster to make things easier to manage/distinguish.

AlexMonk-WMF added a comment. Edited · Sep 12 2016, 6:43 AM

I'd like to open the broader discussion of accountability for breaking beta. Changes to production are obviously not being tested on beta first even though that's ostensibly the policy.

/me shrugs.

Anyone have a suggestion for the best place to have that discussion?

Who ever said that production infrastructure (puppet, at least) changes should first be tested in beta, which has a completely different infrastructure/puppet code?

It's not completely different. It's very similar. There is a relatively small amount of divergence that is maintained: some to support the basic differences between production and labs systems, some to support new things being tested in beta before production as they should be (not always entirely at the puppet level), and some, I'll admit, for historical reasons that we haven't had time to deal with yet, like the swift-in-beta setup Filippo and I did earlier this year.

that would be pointless and a huge waste of time for anyone working on production, given how hard it is to test quite a few things in beta.

It would not be pointless for most things, it would not be a waste of time, and it is not hard to test things in beta.

We should be accountable for how easily we make beta diverge from production: the point is beta should /NOT/ break because of a change in production that is non-breaking.

So basically, any change that works in production is expected to work in beta without regard to the fact that we operate under different constraints?
And when a change breaks beta, it's our job as beta project admins to clean up after the change author/merger's mess, not theirs?

In T131946#2627467, @Joe wrote:
Also, I think we should get rid of the beta puppetmaster and use a specialized environment on the labs puppetmaster to make things easier to manage/distinguish.

How would that help things exactly? What would be easier about it? It sounds like that could even make it harder to keep beta functional.

Joe added a comment. Edited · Sep 12 2016, 7:04 AM

I'd like to open the broader discussion of accountability for breaking beta. Changes to production are obviously not being tested on beta first even though that's ostensibly the policy.

/me shrugs.

Anyone have a suggestion for the best place to have that discussion?

Who ever said that production infrastructure (puppet, at least) changes should first be tested in beta, which has a completely different infrastructure/puppet code?

It's not completely different. It's very similar. There is a relatively small amount of divergence that is maintained: some to support the basic differences between production and labs systems, some to support new things being tested in beta before production as they should be (not always entirely at the puppet level), and some, I'll admit, for historical reasons that we haven't had time to deal with yet, like the swift-in-beta setup Filippo and I did earlier this year.

LVS, varnish, conftool, puppet, deployments, account management are still significantly diverging between beta and prod, just off the top of my head. I can go on checking all the differences in our puppet codebase too, but the differences are wide enough that testing infrastructural changes in beta is pointless in a lot of cases.

that would be pointless and a huge waste of time for anyone working on production, given how hard it is to test quite a few things in beta.

It would not be pointless for most things, it would not be a waste of time, and it is not hard to test things in beta.

I beg to differ. Pick all the changes I did to ops/puppet in the last month or so and tell me which ones:

  • would have been applicable to beta
  • would have benefited from being tested in beta /first/
  • would have been reasonably expected to break beta.

It's about 60 changes: https://gerrit.wikimedia.org/r/#/q/author:glavagetto%2540wikimedia.org+status:merged+project:operations/puppet Apart from a couple of one-liners, the only relevant ones are the ones for redis (but again, we don't have jobrunners in beta, or redis-based sessions, do we?). That's well below 15% of those changes, so yes, testing everything in beta before merging in production is both pointless and a waste of time.

Also, next time, instead of just saying "no" to what I said, actually arguing your point would be appreciated, thanks.

We should be accountable for how easily we make beta diverge from production: the point is beta should /NOT/ break because of a change in production that is non-breaking.

So basically, any change that works in production is expected to work in beta without regard to the fact that we operate under different constraints?
And when a change breaks beta, it's our job as beta project admins to clean up after the change author/merger's mess, not theirs?

Read what I wrote and what you extracted from my sentence: how do the two things relate? I said that if we were able to reduce the differences between beta and prod at an infrastructural level, changes in the prod structure should not break beta, unless we forget, e.g., to fill in the hiera variables there.

What I was saying is that we're curing the symptom, not the original issue.

Also, I think we should get rid of the beta puppetmaster and use a specialized environment on the labs puppetmaster to make things easier to manage/distinguish.

How would that help things exactly? What would be easier about it? It sounds like that could even make it harder to keep beta functional.

It would make it harder to keep around tens of cherry-picks (that in some cases have been rejected by ops for good reasons), having a specialized repo people have to commit beta "hacks" to would make those more visible even to people that don't ssh to the beta puppetmaster daily, and put some pressure on everyone to keep those differences to a minimum.

I am just stating that I never signed up for testing all puppet changes in beta first (because, as I argued above, it's mostly pointless and a waste of time). I have never said we should not care about breaking beta; it is a bit lower priority than not breaking production, though.

Probably, having some mechanism to warn a committer "hey, beta puppet broke after your commit, please check" could help make people more aware of the problems they cause and go fix them.

Finally, given the tone and attitude of your response, I am not interested in participating in this conversation any longer, unless it changes.

This comment was removed by hashar.
hashar removed a subscriber: cscott. Sep 12 2016, 8:38 AM

Sorry I have commented on the wrong task. My removed comment was about T145343 (out of disk on deployment-pdf01)

There is a bit of a misunderstanding about how beta differs from production. It is actually much closer than people think!

LVS, varnish, conftool, puppet, deployments, account management are still significantly diverging between beta and prod, just off the top of my head. I can go on checking all the differences in our puppet codebase too, but the differences are wide enough that testing infrastructural changes in beta is pointless in a lot of cases.

We cannot use LVS for the beta cluster since the OpenStack network stack we currently use would not support it. What we came up with is that the Varnish backends speak directly to the MW app servers. There might be a workaround for LVS though, see T97333. There is no IPv6 either (T37947).

I am entirely sure that the mobile team has used beta to test out their Varnish VCL modifications. Analytics use it for Kafka and their event gathering stack, and I have caught a few issues on beta before they would have hit production. It definitely serves a purpose.

For deployments, the migration of services to scap3 was done on the beta cluster, used as a playground before switching production to it. MediaWiki is automatically deployed via a Jenkins job that runs scap just like on prod, including the keyholder system. We use beta to validate newer versions of scap before bumping the package in prod.

A lot of work has been done to reduce the puppet differences between beta and prod. There was a sprint early in 2016, iirc, which nicely dropped a lot of ::beta puppet classes in favor of using hiera to handle the varying bits of config.

the only relevant ones are the ones for redis (but again, we don't have jobrunners in beta, or redis-based sessions, do we).

If one made changes to redis/nutcracker/jobrunner, those can equally be tested on the beta cluster. When we switched to the mediawiki/services/jobrunner service (which is a PHP loop over the jobqueue that hits HHVM /rpc/RunJobs.php), that was done first on beta. The setup closely matches the one in production (same code, hhvm for rpc, redis etc), the only delta being the number of jobs processed per queue since there is a single instance. That is provisioned via hiera.

The sessions on beta have been using whatever session storage we use in production for years. Currently beta uses the exact same configuration as production: MediaWiki is pointed to a local Nutcracker that relays to Redis. That has been of great help to test out the huge AuthManager overhaul.

Just a few counterarguments to the claim that beta is largely different from production, which is a common misunderstanding.


As for changes landing in production and breaking beta, that is unavoidable currently. Typically production migrates first with beta cluster admins trying to catch up.

We could surely have some migrations handled on beta before production. A couple of examples: MediaWiki app servers migrating to Jessie, which is going to be done on beta via T143536, or the work on Varnish 4 (beta is still on Varnish 3).


As for puppet cherry picks, there is a task to rethink the cherry pick process: T135427, though it derailed to get the existing one reviewed and merged. There are currently eight cherry picks, most having a task associated and being preparation work for production.

Collecting puppet breakages and exposing them would be quite nice. Not sure how we can track that such and such puppet change ended up breaking a given instance. It would be interesting to figure out a solution for that.
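
One possible heuristic for that tracking, as a rough sketch only: given the merge times seen by the puppetmaster and the time an instance first started failing, blame the most recent merge at or before that first failure. The function name, the input format and the sample shas below are all hypothetical; nothing here reflects an existing tool.

```python
from bisect import bisect_right
from datetime import datetime


def suspect_commit(merges, first_failure):
    """Return the sha of the last merge at or before first_failure, or None.

    merges: list of (datetime, sha) tuples sorted by time. This input format
    is an assumption, e.g. something parsed from the puppetmaster's git log.
    """
    times = [merged_at for merged_at, _ in merges]
    # Index of the first merge strictly after the failure time.
    idx = bisect_right(times, first_failure)
    if idx == 0:
        # The failure predates every known merge: nothing to blame.
        return None
    return merges[idx - 1][1]


# Made-up sample data, for illustration only.
merges = [
    (datetime(2016, 9, 12, 8, 0), "aaa111"),
    (datetime(2016, 9, 12, 9, 30), "bbb222"),
    (datetime(2016, 9, 12, 11, 0), "ccc333"),
]
print(suspect_commit(merges, datetime(2016, 9, 12, 10, 15)))
```

This is only a heuristic: a transient failure such as the ones described in this task would be wrongly attributed to whichever change happened to be merged last.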

I agree the patches cherry picked should be better exposed. But that is probably better discussed on T135427.

Cleaned up the private repo from old tags with commands such as:

git tag|grep snapshot-2015|xargs git tag --delete

Change 312748 had a related patch set uploaded (by Hashar):
puppetmaster: git-sync-upstream early abort

https://gerrit.wikimedia.org/r/312748

Change 312748 merged by Filippo Giunchedi:
puppetmaster: git-sync-upstream early abort

https://gerrit.wikimedia.org/r/312748

My git-sync-upstream patch might have helped some corner cases. The deployment-prep puppetmaster has been upgraded to puppet 3.8.5 on Monday and seems the spam is mostly gone now!

Can be checked via https://lists.wikimedia.org/pipermail/betacluster-alerts/2016-October/date.html

It would make it harder to keep around tens of cherry-picks (that in some cases have been rejected by ops for good reasons), having a specialized repo people have to commit beta "hacks" to would make those more visible even to people that don't ssh to the beta puppetmaster daily, and put some pressure on everyone to keep those differences to a minimum.

I agree that having them more visible would be a good thing, and it might make things easier to manage - rebasing on the puppetmaster is a really messy way to do things.

My git-sync-upstream patch might have helped some corner cases. The deployment-prep puppetmaster has been upgraded to puppet 3.8.5 on Monday and seems the spam is mostly gone now!

Can be checked via https://lists.wikimedia.org/pipermail/betacluster-alerts/2016-October/date.html

Nice!

Another thing that might improve the noisy alerts is to raise the threshold for notifications - one failed puppet run shouldn't trigger an alert. I'm not sure what the threshold is currently but maybe we should raise it just a bit.
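
The idea of not alerting on a single failed run can be sketched as a "consecutive failures" rule. This is an illustration in plain Python, not Shinken configuration; the function name and the default threshold of 2 are assumptions, and the existing check appears to be percentage-based (judging by the alert text in the description), so the real change would look different.

```python
def should_alert(run_failed, threshold=2):
    """Return True only if the most recent `threshold` runs all failed.

    run_failed: list of booleans, oldest first, True meaning the puppet run
    failed. Both the input format and the default threshold are hypothetical.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    if len(run_failed) < threshold:
        # Not enough history yet to decide; stay quiet.
        return False
    return all(run_failed[-threshold:])


# A single flapping failure that recovers on the next run stays silent,
# while two failures in a row would notify.
print(should_alert([False, True, False]))
print(should_alert([False, True, True]))
```

With a threshold of 2, the pattern reported in this task (a failure that is resolved on the next run) would no longer notify, at the cost of real breakage being reported one run later.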

Has anyone seen this problem recently?

Krenair closed this task as Resolved. Jul 13 2018, 8:32 PM

I haven't.