Issue
I am running EMR cluster(AWS) but I do not understand how notebook imports packages. I am running PySpark kernel.
import boto3
No module named 'boto3'
Traceback (most recent call last):
ModuleNotFoundError: No module named 'boto3'
print (sys.version) shows
3.7.6 (default, Feb 26 2020, 20:54:15)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-6)]
print(sys.executable) shows
/tmp/1594625399736-0/bin/python
I have both Conda and pip3 install of boto3.
How to solve this?
Solution
Are you using pyspark? If yes, then you need to install the packages in the spark context. Refer to this AWS document: https://aws.amazon.com/blogs/big-data/install-python-libraries-on-a-running-cluster-with-emr-notebooks/
similarly install any dependency packages if you see module not found error on import. Make sure the versions are compatible.
Answered By - Bhrigu
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.