
Spark 2 as cluster default (working with oozie)
Closed, ResolvedPublic13 Estimated Story Points

Description

Here is the plan I suggest to globally move production to Spark 2:

@Ottomata : Comments welcome!

Event Timeline

Can we do it? How hard is it?

Nuria moved this task from Incoming to Wikistats on the Analytics board.

Need a place to park some notes:

spark: build your own deb

- export JAVA_HOME
- add openjdk into build depends
- export http proxies
- alter .m2/settings.xml to add proxy:

<settings>
  <proxies>
    <proxy>
      <id>http-wikimedia</id>
      <active>true</active>
      <protocol>http</protocol>
      <host>webproxy.eqiad.wmnet</host>
      <port>8080</port>
    </proxy>
    <proxy>
      <id>https-wikimedia</id>
      <active>true</active>
      <protocol>https</protocol>
      <host>webproxy.eqiad.wmnet</host>
      <port>8080</port>
    </proxy>
  </proxies>
</settings>

add --settings=/path/to/settings.xml to the mvn command (via BUILD_OPTS in debian/do-component-build)? A sketch of what that could look like is below.
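
A hedged sketch, assuming the settings file lives at debian/m2-settings.xml (as in the mvn invocation below) and that do-component-build picks flags up from BUILD_OPTS:

  # exact variable handling in debian/do-component-build is an assumption
  export http_proxy=http://webproxy.eqiad.wmnet:8080
  export https_proxy=http://webproxy.eqiad.wmnet:8080
  BUILD_OPTS="$BUILD_OPTS --settings=$(pwd)/debian/m2-settings.xml"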

$MVN  help:evaluate -Dexpression=project.version --skip-java-test -Dcdh.build=true -Divy.home=/tmp/buildd/.ivy2 -Dsbt.ivy.home=/tmp/buildd/.ivy2 -Duser.home=/tmp/buildd -Drepo.maven.org= -Dreactor.repo=file:///tmp/buildd/.m2/repository -DskipTests -DrecompileMode=all --settings=./debian/m2-settings.xml

make-distribution.sh
- do --skip-java-test, --with-tachyon and --tgz need to be shifted?
- need to add -Pyarn to build with the yarn deps?
- -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver? (possible invocation sketched below)
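
For reference, a hedged sketch of how upstream Spark 2.x's distribution script takes those profiles (whether the CDH/debian wrapper forwards flags the same way is an assumption):

  # run from the Spark 2.2.1 source tree
  ./dev/make-distribution.sh --name hadoop2.6 --tgz \
      -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver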
JAllemandou renamed this task from Spike: Spark 2.x as cluster default (working with oozie) to Spark 2.x as cluster default (working with oozie). Feb 22 2018, 10:24 AM
JAllemandou renamed this task from Spark 2.x as cluster default (working with oozie) to Spark 2.2.1 as cluster default (working with oozie). Feb 22 2018, 10:25 AM
JAllemandou claimed this task.
JAllemandou edited projects, added Analytics-Kanban; removed Analytics.
JAllemandou moved this task from Next Up to In Progress on the Analytics-Kanban board.
JAllemandou set the point value for this task to 8.
JAllemandou changed the point value for this task from 8 to 13.

Change 415465 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Automate installation of spark2 oozie sharelib

https://gerrit.wikimedia.org/r/415465

Change 415584 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Temporarily run banner impression spark streaming job from 2.2.1 .jar

https://gerrit.wikimedia.org/r/415584

Change 415584 merged by Ottomata:
[operations/puppet@production] Temporarily run banner impression spark streaming job from 2.2.1 .jar

https://gerrit.wikimedia.org/r/415584

Change 415465 merged by Ottomata:
[operations/puppet@production] Automate installation of spark2 oozie sharelib

https://gerrit.wikimedia.org/r/415465

Change 415602 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Properly install spark2_oozie_sharelib_install.sh

https://gerrit.wikimedia.org/r/415602

Change 415602 merged by Ottomata:
[operations/puppet@production] Properly install spark2_oozie_sharelib_install.sh

https://gerrit.wikimedia.org/r/415602

@joal, spark2.2.1 sharelib exists. Let me know if I can remove our test spark2_test0 one.
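
If the test one does get removed, a hedged sketch of what that would involve (directory names as in the listings below; the sharelibupdate is so Oozie picks up the change):

  sudo -u oozie hdfs dfs -rm -r /user/oozie/share/lib/lib_20170228165236/spark2_test0
  sudo -u oozie oozie admin -oozie $OOZIE_URL -sharelibupdate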

Change 415634 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Also copy in hive-site.xml to spark2 oozie sharelib

https://gerrit.wikimedia.org/r/415634

Change 415634 merged by Ottomata:
[operations/puppet@production] Also copy in hive-site.xml to spark2 oozie sharelib

https://gerrit.wikimedia.org/r/415634

@mforns FYI, we'd like to get your Sanitize job merged before we proceed with this... and we're hoping we can do this next week! :D

@Ottomata : I tested the spark2.2.1 sharelib, and it failed. I think the issue is that it is missing the oozie-sharelib-spark jars that the spark2_test0 one has (see the sketch after the listings):

hdfs dfs -ls /user/oozie/share/lib/lib_20170228165236/spark2.2.1 | grep oozie-

hdfs dfs -ls /user/oozie/share/lib/lib_20170228165236/spark2_test0 | grep oozie-
-rw-r--r--   3 oozie hadoop      26013 2018-02-22 18:57 /user/oozie/share/lib/lib_20170228165236/spark2_test0/oozie-sharelib-spark-4.1.0-cdh5.10.0.jar
-rw-r--r--   3 oozie hadoop      26013 2018-02-22 18:57 /user/oozie/share/lib/lib_20170228165236/spark2_test0/oozie-sharelib-spark.jar
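
A hedged sketch of the manual workaround this suggests (the real fix landed in the puppet install script below; jar and directory names as listed above):

  sudo -u oozie hdfs dfs -cp \
      /user/oozie/share/lib/lib_20170228165236/spark2_test0/oozie-sharelib-spark*.jar \
      /user/oozie/share/lib/lib_20170228165236/spark2.2.1/
  sudo -u oozie oozie admin -oozie $OOZIE_URL -sharelibupdate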

Change 415812 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Update spark jobs to use hive context

https://gerrit.wikimedia.org/r/415812

Change 416713 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Fix spark2 oozie sharelib install command

https://gerrit.wikimedia.org/r/416713

Oof, joal, ya, copied spark-assembly instead of oozie-sharelib. Fixed now.

Change 416713 merged by Ottomata:
[operations/puppet@production] Fix spark2 oozie sharelib install command

https://gerrit.wikimedia.org/r/416713

Ottomata renamed this task from Spark 2.2.1 as cluster default (working with oozie) to Spark 2 as cluster default (working with oozie). Apr 5 2018, 4:26 PM
Ottomata updated the task description. (Show Details)

Change 424380 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/debs/spark2@debian] 2.3.0 Hadoop 2.6 release

https://gerrit.wikimedia.org/r/424380

Change 424380 merged by Ottomata:
[operations/debs/spark2@debian] 2.3.0 Hadoop 2.6 release

https://gerrit.wikimedia.org/r/424380

Spark 2.3 is installed fleet-wide.

I also updated the spark2-assembly.zip file and added a new oozie sharelib, spark2.3.0:

sudo -u spark hdfs dfs -put /usr/lib/spark2/spark2-assembly.zip hdfs:///user/spark/share/lib/spark2-assembly.zip
sudo -u oozie oozie admin -oozie $OOZIE_URL -shareliblist | grep spark
spark
spark2_test0
spark2.2.1
spark2.3.0
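
Oozie jobs then opt into the new sharelib per workflow. A hedged example: the property name is standard Oozie, but where exactly we set it (job.properties vs. workflow configuration) is an assumption:

  # illustrative job.properties entry:
  #   oozie.action.sharelib.for.spark=spark2.3.0
  # and to inspect that sharelib's contents:
  sudo -u oozie oozie admin -oozie $OOZIE_URL -shareliblist spark2.3.0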

Change 424444 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/debs/spark2@debian] Install spark2-thriftserver executable

https://gerrit.wikimedia.org/r/424444

Change 424444 merged by Ottomata:
[operations/debs/spark2@debian] Install spark2-thriftserver executable

https://gerrit.wikimedia.org/r/424444

Change 424593 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Install the Spark 2 yarn shuffle service jar over Spark 1's

https://gerrit.wikimedia.org/r/424593

Change 425084 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Add HiveServer to spark-refine for schema changes

https://gerrit.wikimedia.org/r/425084

Change 425289 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Use spark2 for Refine job

https://gerrit.wikimedia.org/r/425289

Change 425306 had a related patch set uploaded (by Ottomata; owner: Joal):
[analytics/refinery/source@master] Add HiveServer to spark-refine for schema changes

https://gerrit.wikimedia.org/r/425306

Change 425306 merged by Ottomata:
[analytics/refinery/source@master] Add HiveServer to spark-refine for schema changes

https://gerrit.wikimedia.org/r/425306

Change 425084 abandoned by Joal:
Add HiveServer to spark-refine for schema changes

Reason:
Cherry picked in another change

https://gerrit.wikimedia.org/r/425084

Change 415812 merged by Ottomata:
[analytics/refinery/source@master] Update spark jobs to use hive context

https://gerrit.wikimedia.org/r/415812

Change 424593 merged by Ottomata:
[operations/puppet@production] Install the Spark 2 yarn shuffle service jar over Spark 1's

https://gerrit.wikimedia.org/r/424593

Mentioned in SAL (#wikimedia-analytics) [2018-04-10T18:18:54Z] <ottomata> restarting all hadoop nodemanagers, 3 at a time to pick up spark2-yarn-shuffle.jar T159962
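
For context, a hedged reminder of the standard YARN settings the shuffle service relies on (assumed already in place from the Spark 1 setup; the rolling restart above is what actually reloads the swapped-in jar). The jar path below is an assumption based on the usual CDH layout:

  # standard yarn-site.xml properties for the Spark shuffle service:
  #   yarn.nodemanager.aux-services includes spark_shuffle
  #   yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
  # quick sanity check on a worker that the Spark 2 jar is the one deployed:
  ls -l /usr/lib/hadoop-yarn/lib/spark*-yarn-shuffle.jar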

Change 425347 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] Refine - Don't call sys.exit if running in YARN

https://gerrit.wikimedia.org/r/425347

Change 425289 merged by Ottomata:
[operations/puppet@production] Use spark2 for Refine job and banner-streaming job

https://gerrit.wikimedia.org/r/425289

Change 425347 merged by Ottomata:
[analytics/refinery/source@master] Refine - Don't call sys.exit if running in YARN

https://gerrit.wikimedia.org/r/425347

Change 425578 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] DataFrameToHive - Use DataFrame .write.parquet instead of .insertInto

https://gerrit.wikimedia.org/r/425578

Change 425597 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[analytics/refinery/source@master] RefineTarget - Use Hadoop FS to infer input format rather than Spark

https://gerrit.wikimedia.org/r/425597

Change 425578 merged by Ottomata:
[analytics/refinery/source@master] DataFrameToHive - Use DataFrame .write.parquet instead of .insertInto

https://gerrit.wikimedia.org/r/425578

Change 425597 merged by Ottomata:
[analytics/refinery/source@master] RefineTarget - Use Hadoop FS to infer input format rather than Spark

https://gerrit.wikimedia.org/r/425597

Change 426943 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Point refine job at 0.0.62 jar version

https://gerrit.wikimedia.org/r/426943

Change 426943 merged by Ottomata:
[operations/puppet@production] Point refine job at 0.0.62 jar version

https://gerrit.wikimedia.org/r/426943