There is a bug in Spark 2.3.0, SPARK-23729, which breaks the mjolnir deployment. A workaround is deployed, but it would be nice to bump to 2.3.1 so the workaround is no longer needed.
I have verified by running 2.3.1 on the cluster that the bug we are experiencing is indeed fixed. The simplest repro:
- Create a zip file with anything in it named mjolnir_venv.zip
- Run pyspark2 --master yarn --archives 'mjolnir_venv.zip#venv'
- In the Python shell, run:
    import subprocess
    print(sc.parallelize([1]).map(lambda x: subprocess.check_output(['ls', '-l'])).collect()[0])
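For the first step, a minimal sketch of creating a throwaway archive with Python's zipfile module. The helper name make_dummy_archive and the placeholder.txt member are hypothetical; the repro only asks for a zip with anything in it.

```python
import zipfile


def make_dummy_archive(path='mjolnir_venv.zip'):
    """Write a zip archive containing a single placeholder file."""
    # placeholder.txt is an arbitrary, assumed member name; any content works.
    with zipfile.ZipFile(path, 'w') as zf:
        zf.writestr('placeholder.txt', 'anything\n')
    return path
```

Calling make_dummy_archive() in the directory where pyspark2 is launched should leave a mjolnir_venv.zip suitable for the --archives argument.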
On 2.3.0 the ls -l call prints the following. Note that there is a mjolnir_venv.zip symlink here, rather than the requested rename to venv:
total 44
-rw-r--r-- 1 yarn yarn  102 Jul 30 22:03 container_tokens
-rwx------ 1 yarn yarn  732 Jul 30 22:03 default_container_executor_session.sh
-rwx------ 1 yarn yarn  786 Jul 30 22:03 default_container_executor.sh
-rwx------ 1 yarn yarn 6597 Jul 30 22:03 launch_container.sh
lrwxrwxrwx 1 yarn yarn   89 Jul 30 22:03 mjolnir_venv.zip -> /var/lib/hadoop/data/j/yarn/local/usercache/ebernhardson/filecache/26129/mjolnir_venv.zip
lrwxrwxrwx 1 yarn yarn   92 Jul 30 22:03 py4j-0.10.6-src.zip -> /var/lib/hadoop/data/f/yarn/local/usercache/ebernhardson/filecache/26130/py4j-0.10.6-src.zip
lrwxrwxrwx 1 yarn yarn   84 Jul 30 22:03 pyspark.zip -> /var/lib/hadoop/data/g/yarn/local/usercache/ebernhardson/filecache/26131/pyspark.zip
lrwxrwxrwx 1 yarn yarn   91 Jul 30 22:03 __spark_conf__ -> /var/lib/hadoop/data/h/yarn/local/usercache/ebernhardson/filecache/26132/__spark_conf__.zip
lrwxrwxrwx 1 yarn yarn   67 Jul 30 22:03 __spark_libs__ -> /var/lib/hadoop/data/f/yarn/local/filecache/129/spark2-assembly.zip
drwx--x--- 2 yarn yarn 4096 Jul 30 22:03 tmp
When run on 2.3.1 (and on versions prior to 2.3.0) we get the following. Notice that here the symlink was correctly renamed to venv:
-rw-r--r-- 1 yarn yarn  102 Jul 30 22:11 container_tokens
-rwx------ 1 yarn yarn  732 Jul 30 22:11 default_container_executor_session.sh
-rwx------ 1 yarn yarn  786 Jul 30 22:11 default_container_executor.sh
-rwx------ 1 yarn yarn 6706 Jul 30 22:11 launch_container.sh
lrwxrwxrwx 1 yarn yarn   92 Jul 30 22:11 py4j-0.10.7-src.zip -> /var/lib/hadoop/data/l/yarn/local/usercache/ebernhardson/filecache/22771/py4j-0.10.7-src.zip
lrwxrwxrwx 1 yarn yarn   84 Jul 30 22:11 pyspark.zip -> /var/lib/hadoop/data/i/yarn/local/usercache/ebernhardson/filecache/22768/pyspark.zip
lrwxrwxrwx 1 yarn yarn   91 Jul 30 22:11 __spark_conf__ -> /var/lib/hadoop/data/j/yarn/local/usercache/ebernhardson/filecache/22769/__spark_conf__.zip
lrwxrwxrwx 1 yarn yarn  110 Jul 30 22:11 __spark_libs__ -> /var/lib/hadoop/data/k/yarn/local/usercache/ebernhardson/filecache/22770/__spark_libs__2999084826558035159.zip
drwx--x--- 2 yarn yarn 4096 Jul 30 22:11 tmp
lrwxrwxrwx 1 yarn yarn   89 Jul 30 22:11 venv -> /var/lib/hadoop/data/h/yarn/local/usercache/ebernhardson/filecache/22767/mjolnir_venv.zip
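
The difference between the two listings comes down to which name the archive symlink carries in the container directory. A minimal sketch of checking that programmatically, assuming the input is the ls -l text collected from the executor as a string; alias_applied is a hypothetical helper, not part of Spark or mjolnir.

```python
def alias_applied(ls_output, archive='mjolnir_venv.zip', alias='venv'):
    """Return True if the listing has a symlink named `alias` and none named `archive`."""
    names = set()
    for line in ls_output.splitlines():
        fields = line.split()
        # Symlink lines in `ls -l` output look like: lrwxrwxrwx ... NAME -> TARGET
        if line.startswith('l') and '->' in fields:
            names.add(fields[fields.index('->') - 1])
    return alias in names and archive not in names
```

Given the listings above, this would be expected to return False for the 2.3.0 output and True for the 2.3.1 output. If check_output returns bytes (as it does on Python 3), the output would need decoding before being passed in.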