On 2016-02-16 at 21:11:25, Nodepool started raising alarms while attempting to add a slave to Jenkins:
```
JenkinsException: Error in request.Possibly authentication failed [500]
```
The pool of nodes was quickly exhausted and no builds could run anymore.
```
lang=irc
[22:38:56] <paladox> it seems that https://integration.wikimedia.org/zuul/ has frozen again.
[22:52:18] <paladox> It seems there is a big queue at https://integration.wikimedia.org/zuul/ because rake-jessie is not working. hashar.
[22:53:19] <legoktm> are we out of nodepool slaves?
[22:54:25] <legoktm> Feb 16 22:52:09 labnodepool1001 nodepoold[1596]: JenkinsException: Error in request.Possibly authentication failed [500]
[22:56:25] <+greg-g> hashar: ^^
[22:56:29] <legoktm> paladox: it's building more slaves as we speak, just have to wait a bit
[22:56:30] <hashar> !log contint: Nodepool instances pool exhausted
[22:56:42] <legoktm> I can see it building more slaves right now
[22:56:45] <+hashar> must be some labs issue
[22:57:11] <legoktm> hashar: journald has a bunch of exceptions, I think jenkins was returning 500 errors to nodepool?
[22:57:47] <+hashar> looking at /var/log/nodepool/nodepool.log on labnodepool1001.eqiad.wmnet
[22:58:11] <+hashar> yeah apparently Nodepool could not authenticate with Jenkins
[22:58:32] <+hashar> first event on 21:18 UTC
[23:01:16] <+hashar> so why the hell does nodepool cant authenticate with Jenkins
[23:02:49] <+hashar> !log Nodepool can not authenticate with Jenkins anymore. Thus it can not add slaves it spawned.
[23:11:27] <+hashar> I am gonna nuke Jenkins
[23:14:37] <+hashar> !log Jenkins: Could not create rootDir /var/lib/jenkins/config-history/nodes/ci-jessie-wikimedia-34969/2016-02-16_22-40-23
[23:14:46] <+hashar> CAUSE THERE IS ONLY 32K INODES PER DIR !!!!!!!!!!!!!
[23:15:07] <+hashar> found via https://integration.wikimedia.org/ci/log/Warnings/
[23:17:13] <+hashar> !log Jenkins accepting slave creations again. Root cause is /var/lib/jenkins/config-history/nodes/ has reached the 32k inode limit.
[23:17:40] <+hashar> 2016-02-16 23:16:40,691 INFO nodepool.NodeLauncher: Node id: 35052 added to jenkins
[23:18:16] <+hashar> !log jenkins@gallium find /var/lib/jenkins/config-history/nodes -maxdepth 1 -type d -name 'ci-jessie*' -exec rm -vfR {} \;
```
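The Jenkins master has a plugin that keeps a history of configuration changes, and that includes slaves. Each slave added by Nodepool gets its own subdirectory under `/var/lib/jenkins/config-history/nodes/`. Once that directory held more than 32k entries, it hit the 32k per-directory limit and the file system refused to create new ones. Jenkins could then no longer save the configuration of a new slave, so adding it failed with a 500 error.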
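This kind of exhaustion can be spotted early by watching how many entries the directory holds. The following is a minimal sketch in Python, not something Jenkins or Nodepool ships. The limit of 32,000 is an assumption taken from the "32k" figure in the log (the real value depends on the file system), the 80% warning ratio is arbitrary, and the function names are hypothetical.

```python
import os

# Assumption: per-directory limit of about 32k entries, as reported in the log.
ASSUMED_LIMIT = 32000


def count_entries(path):
    """Return the number of entries directly inside `path` (not recursive)."""
    with os.scandir(path) as entries:
        return sum(1 for _ in entries)


def check_directory(path, limit=ASSUMED_LIMIT, warn_ratio=0.8):
    """Return a status line, flagged WARNING once the directory holds
    at least `warn_ratio` of the assumed limit."""
    count = count_entries(path)
    status = "WARNING" if count >= limit * warn_ratio else "OK"
    return "{}: {} holds {} of {} entries".format(status, path, count, limit)
```

Calling `check_directory("/var/lib/jenkins/config-history/nodes")` from a periodic job would give a warning well before slave creation starts failing.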
**Actions**
[ ] Update debug doc to hint at https://integration.wikimedia.org/ci/log/Warnings/
[ ] Nodepool should poll Jenkins instead of discarding the instance, building a new one, and failing again
[ ] Get rid of Jenkins configuration plugin for slaves or garbage collect old configurations - T126552
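The garbage collection option could look roughly like the sketch below, written in Python. It is an illustration only: the function name `prune_old_dirs`, the age-based policy, and the use of the directory modification time are all assumptions, not behaviour of the Jenkins plugin. It defaults to a dry run so nothing is deleted until the matched list has been reviewed.

```python
import os
import shutil
import time


def prune_old_dirs(root, prefix, max_age_days, dry_run=True, now=None):
    """Find subdirectories of `root` whose name starts with `prefix` and
    whose mtime is older than `max_age_days`. Delete them unless `dry_run`
    is set. Returns the sorted list of matched paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    # Snapshot the listing first so deletions do not disturb the iteration.
    with os.scandir(root) as it:
        entries = list(it)
    matched = []
    for entry in entries:
        if not entry.name.startswith(prefix):
            continue
        if not entry.is_dir(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            if not dry_run:
                shutil.rmtree(entry.path)
            matched.append(entry.path)
    return sorted(matched)
```

For example, `prune_old_dirs("/var/lib/jenkins/config-history/nodes", "ci-jessie", 7)` would list the history directories of `ci-jessie*` slaves untouched for a week, and passing `dry_run=False` would remove them. The seven day retention is a placeholder value.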