Page MenuHomePhabricator

labpuppetmaster - Slowness and puppet-enc errors
Closed, ResolvedPublic

Description

root@quarry-worker-02:~# puppet agent -t
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: Error 500 on SERVER: Server Error: Failed to find quarry-worker-02.quarry.eqiad.wmflabs via exec: Execution of '/usr/local/bin/puppet-enc quarry-worker-02.quarry.eqiad.wmflabs' returned 1: 
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Failed when searching for node quarry-worker-02.quarry.eqiad.wmflabs: Failed to find quarry-worker-02.quarry.eqiad.wmflabs via exec: Execution of '/usr/local/bin/puppet-enc quarry-worker-02.quarry.eqiad.wmflabs' returned 1: 
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

Event Timeline

GTirloni created this task.Feb 21 2019, 9:07 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptFeb 21 2019, 9:07 AM

labpuppetmaster1001 stuck processes:

46762 ?        Sl     0:01 Passenger RackApp: /usr/share/puppet/rack/puppet-master
 1002 ?        Ss     0:00  \_ /usr/bin/python3 /usr/local/bin/puppet-enc kdc.analytics.eqiad.wmflabs
46795 ?        Sl     0:01 Passenger RackApp: /usr/share/puppet/rack/puppet-master
  786 ?        Ss     0:00  \_ /usr/bin/python3 /usr/local/bin/puppet-enc hadoop-worker-1.analytics.eqiad.wmflabs
46868 ?        Sl     0:01 Passenger RackApp: /usr/share/puppet/rack/puppet-master
46886 ?        Sl     0:00 Passenger RackApp: /usr/share/puppet/rack/puppet-master
47017 ?        Sl     0:00 Passenger RackApp: /usr/share/puppet/rack/puppet-master
47046 ?        Sl     0:00 Passenger RackApp: /usr/share/puppet/rack/puppet-master
  774 ?        Ss     0:00  \_ /usr/bin/python3 /usr/local/bin/puppet-enc ores-web-01.ores.eqiad.wmflabs
47088 ?        Sl     0:01 Passenger RackApp: /usr/share/puppet/rack/puppet-master
  778 ?        Ss     0:00  \_ /usr/bin/python3 /usr/local/bin/puppet-enc mwoffliner2.mwoffliner.eqiad.wmflabs
47099 ?        Sl     0:01 Passenger RackApp: /usr/share/puppet/rack/puppet-master
  992 ?        Ss     0:00  \_ /usr/bin/python3 /usr/local/bin/puppet-enc quarry-worker-02.quarry.eqiad.wmflabs
47115 ?        Sl     0:00 Passenger RackApp: /usr/share/puppet/rack/puppet-master
  791 ?        Ss     0:00  \_ /usr/bin/python3 /usr/local/bin/puppet-enc canary-an1002-01.testlabs.eqiad.wmflabs
47322 ?        Sl     0:01 Passenger RackApp: /usr/share/puppet/rack/puppet-master
47597 ?        Sl     0:00 Passenger RackApp: /usr/share/puppet/rack/puppet-master
47611 ?        Sl     0:01 Passenger RackApp: /usr/share/puppet/rack/puppet-master
  781 ?        Ss     0:00  \_ /usr/bin/python3 /usr/local/bin/puppet-enc af-netbox.automation-framework.eqiad.wmflabs
47642 ?        Sl     0:00 Passenger RackApp: /usr/share/puppet/rack/puppet-master
47660 ?        Sl     0:01 Passenger RackApp: /usr/share/puppet/rack/puppet-master

/var/log/syslog:

Feb 21 09:01:15 labpuppetmaster1001 puppet-master[47115]: Server Error: Failed to find phragile-pro.phragile.eqiad.wmflabs via exec: Execution of '/usr/local/bin/puppet-enc phragile-pro.phragile.eqiad.wmflabs' returned 1:

/var/log/nginx/error.log:

2019/02/21 08:55:49 [error] 44081#44081: *2008 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.68.20.74, server: , request: "GET /v1/tools/node/tools-proxy-04.tools.eqiad.wmflabs HTTP/1.1", upstream: "http://127.0.0.1:8101/v1/tools/node/tools-proxy-04.tools.eqiad.wmflabs", host: "labs-puppetmaster.wikimedia.org:8100"
2019/02/21 08:55:49 [error] 44061#44061: *2010 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.16.3.17, server: , request: "GET /v1/integration/node/integration-r-lang-01.integration.eqiad.wmflabs HTTP/1.1", upstream: "http://127.0.0.1:8101/v1/integration/node/integration-r-lang-01.integration.eqiad.wmflabs", host: "labs-puppetmaster.wikimedia.org:8100"
# systemctl status uwsgi-labspuppetbackend
● uwsgi-labspuppetbackend.service - uwsgi-labspuppetbackend uwsgi app
   Loaded: loaded (/lib/systemd/system/uwsgi-labspuppetbackend.service; enabled)
   Active: active (running) since Thu 2018-12-20 20:51:36 UTC; 2 months 1 days ago
 Main PID: 1959 (uwsgi)
   CGroup: /system.slice/uwsgi-labspuppetbackend.service
           ├─1959 /usr/bin/uwsgi --die-on-term --ini /etc/uwsgi/apps-enabled/labspuppetbackend.ini
           └─1961 /usr/bin/uwsgi --die-on-term --ini /etc/uwsgi/apps-enabled/labspuppetbackend.ini
root@labpuppetmaster1001:~# lsof -p 1961
COMMAND  PID     USER   FD   TYPE             DEVICE SIZE/OFF      NODE NAME
uwsgi   1961 www-data  cwd    DIR                9,0     4096         2 /
uwsgi   1961 www-data  rtd    DIR                9,0     4096         2 /
uwsgi   1961 www-data  txt    REG                9,0   801664   2360918 /usr/bin/uwsgi-core
uwsgi   1961 www-data  mem    REG                9,0   199328   2372775 /usr/lib/python3/dist-packages/_yaml.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  DEL    REG                9,0            2382222 /usr/lib/python3.4/lib-dynload/termios.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  mem    REG                9,0    84856    134221 /lib/x86_64-linux-gnu/libresolv-2.19.so
uwsgi   1961 www-data  mem    REG                9,0    22952    134215 /lib/x86_64-linux-gnu/libnss_dns-2.19.so
uwsgi   1961 www-data  mem    REG                9,0    42416   2381096 /usr/lib/python3/dist-packages/simplejson/_speedups.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  mem    REG                9,0   233288   2370862 /usr/lib/x86_64-linux-gnu/libmpdec.so.2.4.1
uwsgi   1961 www-data  DEL    REG                9,0            2382225 /usr/lib/python3.4/lib-dynload/_decimal.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  mem    REG                9,0    35152   2360694 /usr/lib/x86_64-linux-gnu/libffi.so.6.0.2
uwsgi   1961 www-data  DEL    REG                9,0            2382238 /usr/lib/python3.4/lib-dynload/_ctypes.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  mem    REG                9,0    10592   2758027 /usr/lib/python3/dist-packages/markupsafe/_speedups.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  mem    REG                9,0    66856    131079 /lib/x86_64-linux-gnu/libbz2.so.1.0.4
uwsgi   1961 www-data  DEL    REG                9,0            2382216 /usr/lib/python3.4/lib-dynload/_bz2.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  DEL    REG                9,0            2371127 /usr/lib/python3.4/lib-dynload/_ssl.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  DEL    REG                9,0            2371125 /usr/lib/python3.4/lib-dynload/_hashlib.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  DEL    REG                9,0            2371126 /usr/lib/python3.4/lib-dynload/_opcode.cpython-34m-x86_64-linux-gnu.so
uwsgi   1961 www-data  DEL    REG                0,5          243807678 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807677 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807676 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807675 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807674 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807673 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807672 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807671 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807670 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807669 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807668 /dev/zero
uwsgi   1961 www-data  mem    REG                9,0   165856    131089 /lib/x86_64-linux-gnu/libexpat.so.1.6.0
uwsgi   1961 www-data  mem    REG                9,0    31784    134222 /lib/x86_64-linux-gnu/librt-2.19.so
uwsgi   1961 www-data  DEL    REG                9,0            2359751 /usr/lib/x86_64-linux-gnu/libpython3.4m.so.1.0
uwsgi   1961 www-data  mem    REG                9,0    10680    134225 /lib/x86_64-linux-gnu/libutil-2.19.so
uwsgi   1961 www-data  mem    REG                9,0   187616    394994 /usr/lib/uwsgi/plugins/python34_plugin.so
uwsgi   1961 www-data  mem    REG                9,0    47712    134216 /lib/x86_64-linux-gnu/libnss_files-2.19.so
uwsgi   1961 www-data  mem    REG                9,0    43592    134218 /lib/x86_64-linux-gnu/libnss_nis-2.19.so
uwsgi   1961 www-data  mem    REG                9,0    89104    134213 /lib/x86_64-linux-gnu/libnsl-2.19.so
uwsgi   1961 www-data  mem    REG                9,0    31632    134214 /lib/x86_64-linux-gnu/libnss_compat-2.19.so
uwsgi   1961 www-data  mem    REG                9,0   141752    131171 /lib/x86_64-linux-gnu/liblzma.so.5.0.0
uwsgi   1961 www-data  mem    REG                9,0    18640    131075 /lib/x86_64-linux-gnu/libattr.so.1.1.0
uwsgi   1961 www-data  mem    REG                9,0  1738176    134207 /lib/x86_64-linux-gnu/libc-2.19.so
uwsgi   1961 www-data  mem    REG                9,0  1465816   2359742 /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.1
uwsgi   1961 www-data  mem    REG                9,0  2510848   2360656 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
uwsgi   1961 www-data  mem    REG                9,0   442552   2359402 /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
uwsgi   1961 www-data  mem    REG                9,0    55952   2359793 /usr/lib/x86_64-linux-gnu/libjansson.so.4.7.0
uwsgi   1961 www-data  mem    REG                9,0   125160   2369947 /usr/lib/x86_64-linux-gnu/libyaml-0.so.2.0.4
uwsgi   1961 www-data  mem    REG                9,0    18904    131375 /lib/x86_64-linux-gnu/libuuid.so.1.3.0
uwsgi   1961 www-data  mem    REG                9,0    19016    131121 /lib/x86_64-linux-gnu/libcap.so.2.24
uwsgi   1961 www-data  mem    REG                9,0   448440    131158 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
uwsgi   1961 www-data  mem    REG                9,0   109144    131381 /lib/x86_64-linux-gnu/libz.so.1.2.8
uwsgi   1961 www-data  mem    REG                9,0    14664    134210 /lib/x86_64-linux-gnu/libdl-2.19.so
uwsgi   1961 www-data  mem    REG                9,0  1051056    134211 /lib/x86_64-linux-gnu/libm-2.19.so
uwsgi   1961 www-data  mem    REG                9,0   137384    134203 /lib/x86_64-linux-gnu/libpthread-2.19.so
uwsgi   1961 www-data  mem    REG                9,0   140928    134204 /lib/x86_64-linux-gnu/ld-2.19.so
uwsgi   1961 www-data  DEL    REG                0,5          243807667 /dev/zero
uwsgi   1961 www-data  mem    REG                9,0    26258   2363348 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
uwsgi   1961 www-data  mem    REG                9,0  1607712   2362245 /usr/lib/locale/locale-archive
uwsgi   1961 www-data  DEL    REG                0,5          243807665 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807664 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807663 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807662 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807661 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807660 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807659 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807658 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807657 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807656 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807655 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807654 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807653 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807652 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807651 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807650 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807649 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807648 /dev/zero
uwsgi   1961 www-data  DEL    REG                0,5          243807642 /dev/zero
uwsgi   1961 www-data    0r   CHR                1,3      0t0      1028 /dev/null
uwsgi   1961 www-data    1u  unix 0xffff8bd182b34000      0t0 243843117 socket
uwsgi   1961 www-data    2u  unix 0xffff8bd182b34000      0t0 243843117 socket
uwsgi   1961 www-data    3u  IPv4          243807666      0t0       TCP *:8101 (LISTEN)
uwsgi   1961 www-data    4u  0000               0,11        0     11970 anon_inode
uwsgi   1961 www-data    5u  unix 0xffff8bce6c956c00      0t0 243807680 socket
uwsgi   1961 www-data    6r   CHR                1,9      0t0      1033 /dev/urandom
uwsgi   1961 www-data    7u  sock                0,8      0t0 243839078 can't identify protocol
uwsgi   1961 www-data    9u  unix 0xffff8bcc64a84400      0t0 243839080 socket
(gdb) bt
#0  0x00007ff1f5ea1a20 in __write_nocancel () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x000000000042622a in uwsgi_log_verbose (fmt=0x484ba8 "SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request %.*s (ip %.*s) !!!\n") at core/logging.c:118
#2  <signal handler called>
#3  0x00007ff1f3fc82a7 in writev () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00000000004274ec in uwsgi_logit_simple (wsgi_req=0x7ff1f224f078) at core/logging.c:735
#5  0x000000000041a8ef in uwsgi_close_request (wsgi_req=0x7ff1f224f078) at core/utils.c:1093
#6  0x000000000045f7cd in simple_loop_run (arg1=0x0) at core/loop.c:149
#7  0x0000000000463c78 in uwsgi_ignition () at core/uwsgi.c:3464
#8  0x0000000000466fbd in uwsgi_worker_run () at core/uwsgi.c:3395
#9  0x00000000004674de in uwsgi_run () at core/uwsgi.c:3306
#10 0x000000000041699e in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at core/uwsgi.c:1964

Restarted uwsgi-labspuppetbackend.service

root@quarry-worker-01:~# time puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for quarry-worker-01.quarry.eqiad.wmflabs
Notice: /Stage[main]/Base::Environment/Tidy[/var/tmp/core]: Tidying 0 files
Info: Applying configuration version '1550740209'
Notice: Applied catalog in 4.45 seconds

real	0m17.112s
user	0m6.940s
sys	0m1.092s

@Andrew this might be related to the slowness you saw yesterday.

GTirloni closed this task as Resolved.Feb 21 2019, 9:12 AM
GTirloni triaged this task as Unbreak Now! priority.
Restricted Application added a subscriber: TerraCodes. · View Herald TranscriptFeb 21 2019, 9:12 AM
This comment was removed by aborrero.

Removing my last comment. I overlooked the FQDN.

Whatever was slow yesterday is still slow today. The places I'm seeing trouble are VMs using the tools puppetmaster (although they still use the enc on labspuppetmaster)