Issue
I need help because I don't know whether a Jupyter Notebook kernel can be used with a Spark cluster.
In my local Spark setup I use this and have no problems.
I am using this kernel for PySpark: https://github.com/Anchormen/pyspark-jupyter-kernels
I am using a standalone Spark cluster with three nodes, without YARN.
Best regards.
Solution
You can connect to your standalone Spark cluster from a plain Python kernel by pointing the SparkContext at the master's IP address.
import pyspark

# Point the SparkContext at the standalone master (default port 7077)
sc = pyspark.SparkContext(master='spark://<public-ip>:7077', appName='<your_app_name>')
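As a quick sanity check, here is a minimal sketch that runs a small job through the context created above and then shuts it down; it assumes the placeholder master URL and app name have been replaced with your own values.

# Distribute a small dataset, aggregate it on the cluster, and print the result.
rdd = sc.parallelize(range(100))
print(rdd.sum())  # expected: 4950
sc.stop()

If you are on Spark 2.x or later, the same connection can be made through a SparkSession instead of a bare SparkContext; this is a sketch under the same assumptions (substitute your master's address and app name).

from pyspark.sql import SparkSession

# Build a session against the standalone master (placeholder URL, same as above)
spark = (SparkSession.builder
         .master('spark://<public-ip>:7077')
         .appName('<your_app_name>')
         .getOrCreate())

# Quick check that the executors on the cluster are reachable
print(spark.range(100).count())  # expected: 100
spark.stop()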
References
- How to connect Jupyter Notebook to remote spark clusters
- Set up an Apache Spark cluster and integrate with Jupyter Notebook
- Deploy Application from Jupyter Lab to a Spark Standalone Cluster
Answered By - Carlos Henrique