I start pyspark like this:
pyspark2 --master yarn --executor-memory 4G --executor-cores 2 --driver-memory 8G --conf spark.driver.maxResultSize=4G
The following import then fails:
from pyspark.ml.regression import RandomForestRegressor
The error message is:
ImportError: No module named numpy
Is there anything I can do to install numpy on the worker machines?
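To narrow down which machines are actually missing numpy, a small check along these lines might help. It is only a sketch: `numpy_status` is a hypothetical helper name, not part of pyspark, and the commented-out executor check assumes it is pasted into the pyspark shell, where `sc` is already defined. Since the import statement itself runs in the driver process, it is probably worth checking the driver's Python as well as the workers'.

```python
import socket
import sys


def numpy_status(_=None):
    """Report the host, the Python interpreter in use, and the numpy version (or None)."""
    try:
        import numpy
        version = numpy.__version__
    except ImportError:
        version = None  # numpy is not importable by this interpreter
    return {
        "host": socket.gethostname(),
        "python": sys.executable,
        "numpy": version,
    }


if __name__ == "__main__":
    # Driver-side check: runs in whatever Python started this script or shell.
    print(numpy_status())

    # Executor-side check (assumption: run inside the pyspark shell, where `sc` exists).
    # Each task reports the status of the Python worker it happens to run on:
    # print(sc.parallelize(range(8), 8).map(numpy_status).collect())
```

Any entry with "numpy" set to None points at a host and interpreter path where numpy would still need to be installed.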