
ElasticSearch 2.3.5 + plugins 2.3.5 raise jar hell
Closed, Resolved (Public)

Description

This is merely a support question.

I have set up a labs instance, logstash.integration.eqiad.wmflabs, using Jessie. It comes with Java 7 and Elasticsearch 2.3.5. Plugins are loaded from a git clone of our plugins repository. On start, Elasticsearch fails with a jar hell error:

failed to load bundle [
 file:/srv/deployment/elasticsearch/plugins/analysis-icu/lucene-analyzers-icu-5.5.0.jar,
 file:/srv/deployment/elasticsearch/plugins/analysis-icu/icu4j-54.1.jar,
 file:/srv/deployment/elasticsearch/plugins/analysis-icu/analysis-icu-2.3.5.jar
] due to jar hell
elasticsearch[5744]: Likely root cause: java.util.zip.ZipException: error in opening zip file
elasticsearch[5744]: at java.util.zip.ZipFile.open(Native Method)

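SOLUTION: the .jar files in the clone were git-fat placeholders rather than real archives. Fetching the real files fixes it: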

cd /srv/deployment/elasticsearch/plugins
git fat init && git fat pull

Assorted traces and details:

systemd[1]: Starting Elasticsearch...
systemd[1]: Started Elasticsearch.
elasticsearch[5744]: [2016-11-19 20:51:26,346][WARN ][bootstrap                ] Unable to lock JVM Memory: error=12,reason=Cannot allocate memory
elasticsearch[5744]: [2016-11-19 20:51:26,347][WARN ][bootstrap                ] This can result in part of the JVM being swapped out.
elasticsearch[5744]: [2016-11-19 20:51:26,347][WARN ][bootstrap                ] Increase RLIMIT_MEMLOCK, soft limit: 65536, hard limit: 65536 
elasticsearch[5744]: [2016-11-19 20:51:26,347][WARN ][bootstrap                ] These can be adjusted by modifying /etc/security/limits.conf, for example:
elasticsearch[5744]: # allow user 'elasticsearch' mlockall 
elasticsearch[5744]: elasticsearch soft memlock unlimited
elasticsearch[5744]: elasticsearch hard memlock unlimited
elasticsearch[5744]: [2016-11-19 20:51:26,347][WARN ][bootstrap                ] If you are logged in interactively, you will have to re-login for the new limits to take effect.
elasticsearch[5744]: [2016-11-19 20:51:26,628][INFO ][node                     ] [logstash] version[2.3.5], pid[5744], build[90f439f/2016-07-27T10:36:52Z]
elasticsearch[5744]: [2016-11-19 20:51:26,628][INFO ][node                     ] [logstash] initializing ...
elasticsearch[5744]: Exception in thread "main" java.lang.IllegalStateException: failed to load bundle [file:/srv/deployment/elasticsearch/plugins/analysis-icu/lucene-analyzers-icu-5.5.0.jar, file:/srv/deployment/elasticsearch/plugins/analysis-icu/icu4j-54.1.jar, file:/srv/deployment/elasticsearch/plugins/analysis-icu/analysis-icu-2.3.5.jar] due to jar hell
elasticsearch[5744]: Likely root cause: java.util.zip.ZipException: error in opening zip file
elasticsearch[5744]: at java.util.zip.ZipFile.open(Native Method)
elasticsearch[5744]: at java.util.zip.ZipFile.<init>(ZipFile.java:215)
elasticsearch[5744]: at java.util.zip.ZipFile.<init>(ZipFile.java:145)
elasticsearch[5744]: at java.util.jar.JarFile.<init>(JarFile.java:154)
elasticsearch[5744]: at java.util.jar.JarFile.<init>(JarFile.java:91)
elasticsearch[5744]: at org.elasticsearch.bootstrap.JarHell.checkJarHell(JarHell.java:174)
elasticsearch[5744]: at org.elasticsearch.plugins.PluginsService.loadBundles(PluginsService.java:419)
elasticsearch[5744]: at org.elasticsearch.plugins.PluginsService.<init>(PluginsService.java:129)
elasticsearch[5744]: at org.elasticsearch.node.Node.<init>(Node.java:158)
elasticsearch[5744]: at org.elasticsearch.node.Node.<init>(Node.java:140)
elasticsearch[5744]: at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
elasticsearch[5744]: at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:178)
elasticsearch[5744]: at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:270)
elasticsearch[5744]: at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
elasticsearch[5744]: Refer to the log for complete error details.
systemd[1]: elasticsearch.service: main process exited, code=exited, status=1/FAILURE
systemd[1]: Unit elasticsearch.service entered failed state.
# apt-cache policy elasticsearch
elasticsearch:
  Installed: 2.3.5
  Candidate: 2.3.5
  Version table:
 *** 2.3.5 0
       1001 http://apt.wikimedia.org/wikimedia/ jessie-wikimedia/thirdparty amd64 Packages
        100 /var/lib/dpkg/status
     1.6.2+dfsg-1~bpo8+1 0
        100 http://mirrors.wikimedia.org/debian/ jessie-backports/main amd64 Packages
     1.0.3+dfsg-5+deb8u1 0
        500 http://security.debian.org/ jessie/updates/main amd64 Packages

https://gerrit.wikimedia.org/r/operations/software/elasticsearch/plugins is at c5de449cf10e5935cae7e58276c86abb4af1b351 "Upgrade elasticsearch plugins to 2.3.5 - T145404".

Event Timeline


I hijacked /usr/share/elasticsearch/bin/elasticsearch to run Java under strace:

exec /usr/bin/strace -f -y -o /tmp/hashar.log -e file "$JAVA"  ...
...
12939 stat("/usr/share/elasticsearch/lib/commons-cli-1.3.1.jar", {st_mode=S_IFREG|0644, st_size=52988, ...}) = 0
12939 stat("/usr/share/elasticsearch/lib/jna-4.1.0.jar", {st_mode=S_IFREG|0644, st_size=914597, ...}) = 0
12939 stat("/usr/share/elasticsearch/lib/lucene-analyzers-common-5.5.0.jar", {st_mode=S_IFREG|0644, st_size=1576967, ...}) = 0
12939 stat("/usr/share/elasticsearch/lib/lucene-suggest-5.5.0.jar", {st_mode=S_IFREG|0644, st_size=246644, ...}) = 0
12939 stat("/srv/deployment/elasticsearch/plugins/analysis-icu/lucene-analyzers-icu-5.5.0.jar", {st_mode=S_IFREG|0644, st_size=74, ...}) = 0
12939 open("/srv/deployment/elasticsearch/plugins/analysis-icu/lucene-analyzers-icu-5.5.0.jar", O_RDONLY) = 64</srv/deployment/elasticsearch/plugins/analysis-icu/lucene-analyzers-icu-5.5.0.jar>
12960 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x14} ---
12960 lstat("/var/run/elasticsearch/elasticsearch.pid", {st_mode=S_IFREG|0644, st_size=5, ...}) = 0
12960 unlink("/var/run/elasticsearch/elasticsearch.pid") = 0

Maybe lucene-analyzers-icu-5.5.0.jar is to blame ;D Note that stat reports it as only 74 bytes (st_size=74).

hashar claimed this task.

And obviously the .jar files are NOT jar files, they are git-fat placeholders:

tail /srv/deployment/elasticsearch/plugins/analysis-icu/*.jar
==> /srv/deployment/elasticsearch/plugins/analysis-icu/analysis-icu-2.3.5.jar <==
#$# git-fat e0d736546f27a158dc9e092a6ea99badb1d0abb0                24489

==> /srv/deployment/elasticsearch/plugins/analysis-icu/icu4j-54.1.jar <==
#$# git-fat 3f66ecd5871467598bc81662817b80612a0a907f             11126867

==> /srv/deployment/elasticsearch/plugins/analysis-icu/lucene-analyzers-icu-5.5.0.jar <==
#$# git-fat 69a6e72d322b6643f1b419e6c9cc46623a2404e9                78907
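
A quick way to spot such placeholders before starting Elasticsearch is to look at the first bytes of each .jar. The sketch below assumes the stub layout seen in the tail output above (the literal prefix `#$# git-fat `, then a 40-character digest and a padded size); that layout would also be consistent with the 74-byte size reported by strace. The function names `is_git_fat_stub` and `list_git_fat_stubs` are made up for illustration and are not part of git-fat.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-fat): succeeds when a file starts
# with the placeholder prefix seen in the tail output above.
is_git_fat_stub() {
    # Read only the first 12 bytes so real (binary) jars are not dumped;
    # NUL bytes are dropped to keep the comparison quiet on binary input.
    [ "$(head -c 12 "$1" 2>/dev/null | tr -d '\000')" = '#$# git-fat ' ]
}

# Hypothetical helper: print every .jar under a directory that is still a stub.
list_git_fat_stubs() {
    find "$1" -type f -name '*.jar' | while IFS= read -r jar; do
        if is_git_fat_stub "$jar"; then
            printf '%s\n' "$jar"
        fi
    done
}
```

Running `list_git_fat_stubs /srv/deployment/elasticsearch/plugins` on this instance should list the three analysis-icu files shown above, along with any other jars that have not been fetched yet.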

Went with git fat init && git fat pull:

receiving file list ... 
0 files to consider

sent 13 bytes  received 13 bytes  52.00 bytes/sec
total size is 0  speedup is 0.00
Restoring e0d736546f27a158dc9e092a6ea99badb1d0abb0 -> analysis-icu/analysis-icu-2.3.5.jar
Restoring 3f66ecd5871467598bc81662817b80612a0a907f -> analysis-icu/icu4j-54.1.jar
Restoring 69a6e72d322b6643f1b419e6c9cc46623a2404e9 -> analysis-icu/lucene-analyzers-icu-5.5.0.jar
Restoring e06ac6385e3b9b67268b5ed34b1c14227bfc94f7 -> experimental-highlighter-elasticsearch-plugin/experimental-highlighter-core-2.3.5.jar
Restoring 8bcdf614fb239ca2b80fbea5cc3c98a687dd55bc -> experimental-highlighter-elasticsearch-plugin/experimental-highlighter-elasticsearch-plugin-2.3.5.jar
Restoring 313f9704f12e62318950424efbb39a51416c33ed -> experimental-highlighter-elasticsearch-plugin/experimental-highlighter-lucene-2.3.5.jar
Restoring 5802b52906c98680e13515c2edf325831fc9ce9f -> extra/extra-2.3.5.jar
Restoring b7f0fc8f61ecadeb3695f0b9464755eee44374d4 -> swift-repository-plugin/commons-codec-1.6.jar
Restoring cd8d6ffc833cc63c30d712a180f4663d8f55799b -> swift-repository-plugin/commons-io-2.3.jar
Restoring 0ce1edb914c94ebc388f086c6827e8bdeec71ac2 -> swift-repository-plugin/commons-lang-2.6.jar
Restoring b69bd03af60bf487b3ae1209a644ecac587bf6fc -> swift-repository-plugin/httpclient-4.2.1.jar
Restoring 2d503272bf0a8b5f92d64db78b4ba9abbaccc6fd -> swift-repository-plugin/httpcore-4.2.1.jar
Restoring 2dd41e7570f5c73e63a6a1311671a60c817e1989 -> swift-repository-plugin/jackson-core-asl-1.9.7.jar
Restoring 3bc2efad5ceb9e24e44f731d4282b5df3ea6d23f -> swift-repository-plugin/jackson-mapper-asl-1.9.7.jar
Restoring 358c500a1262d77e87167cbd0fdfb3ae8eca4fca -> swift-repository-plugin/jcl-over-slf4j-1.7.2.jar
Restoring 5fdc087b7025b7627fc2e5c14bb78f8d7aa775fa -> swift-repository-plugin/joss-0.9.12.jar
Restoring 5af35056b4d257e4b64b9e8069c0746e8b08629f -> swift-repository-plugin/log4j-1.2.17.jar
Restoring 0081d61b7f33ebeab314e07de0cc596f8e858d97 -> swift-repository-plugin/slf4j-api-1.7.2.jar
Restoring 7c9f26282f859191956b73ea78b0c992f1d7769a -> swift-repository-plugin/swift-repository-plugin-2.3.5.jar

Works
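
To double-check the result after a pull without restarting Elasticsearch, one option is to confirm that every .jar now begins with the ZIP local-file-header signature (bytes 50 4b 03 04). This is only a sketch: it checks the leading signature, not the integrity of the whole archive, and the names `has_zip_magic` and `check_jars` are hypothetical.

```shell
#!/bin/sh
# Hypothetical post-pull check: succeeds when a file begins with the
# ZIP local-file-header signature "PK\003\004" (hex 50 4b 03 04).
has_zip_magic() {
    [ "$(head -c 4 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')" = '504b0304' ]
}

# Print every .jar under a directory that does NOT look like a ZIP archive.
# Returns non-zero when at least one such file is found.
check_jars() {
    bad=$(find "$1" -type f -name '*.jar' | while IFS= read -r jar; do
        has_zip_magic "$jar" || printf '%s\n' "$jar"
    done)
    [ -z "$bad" ] && return 0
    printf '%s\n' "$bad"
    return 1
}
```

For example, `check_jars /srv/deployment/elasticsearch/plugins && echo "all jars look like archives"` would print the message only once no placeholder remains.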

I think we use git-deploy to deploy plugins on deployment-prep (via deployment-tin.deployment-prep.eqiad.wmflabs) and in production (via deployment.eqiad.wmnet).
I don't know whether we could simply add a new minion to deployment-tin.deployment-prep.eqiad.wmflabs so that plugins would be deployed everywhere.
But maybe that's a totally different infrastructure, and we don't want to mix up integration with deployment-prep.

My understanding is that git-deploy is how the plugins are deployed on the beta cluster.

integration is indeed a different project and I would rather not mix them up. I guess my error was that I cloned the repository directly with git clone instead of having git-deploy/Trebuchet populate it, which I assume triggers a run of git-fat.