Issue
I've created a Spark cluster on EC2, installed Jupyter on the master node, started it, and then created a SparkContext with:
import findspark
findspark.init(spark_home='/home/ubuntu/spark')
import pyspark
from functools import partial
sc = pyspark.SparkContext(appName="Pi")
When I run any job, Spark only uses the cores of the master machine. All the slave nodes are running and connected to the master, but none of their cores are being used. Can anybody please help?
Solution
You need to set the master URL to spark://... when creating your SparkContext, so the application attaches to the cluster instead of running only on the master node.
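A minimal sketch of what that could look like, assuming a standalone cluster whose master listens on the default port 7077 (the placeholder hostname below is an assumption; the actual spark:// URL is shown at the top of the Spark master web UI):

import findspark
findspark.init(spark_home='/home/ubuntu/spark')

import pyspark

# Pass the cluster's master URL so executors are launched on the worker
# nodes rather than only on the driver machine.
sc = pyspark.SparkContext(
    master="spark://<master-private-ip>:7077",  # placeholder; copy the URL from the master web UI
    appName="Pi",
)

Once the context is created this way, the application should appear under "Running Applications" in the master web UI with executors on each worker.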
Answered By - bonnal-enzo