Issue
I want to use a Docker image with Apache Spark on Ubuntu 18.04.
The more popular image on Docker Hub has Spark 1.6. The second image has a more recent version, Spark 2.2.
Neither image has numpy installed, and the basic examples in the Spark MLlib main guide require it.
I tried to install numpy by adding this line to the original Dockerfile for the Spark 2.2 image, without success:
RUN apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose
How do you set the container to use the OS's numpy installation? What is the procedure? Is this the correct direction at all?
Edit: OS is Ubuntu 18.04
Solution
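The RUN line in the question most likely fails for two reasons: it never runs apt-get update, so the package index in the image may be empty, and it omits -y, so the install stops at the confirmation prompt during the non-interactive build. Extending the existing image and installing only numpy is enough.
Dockerfile: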
FROM p7hb/docker-spark
RUN apt-get update && apt-get install -y python-numpy
Build command:
docker build -t my_image .
Run container:
docker run -it --rm my_image /bin/bash
Check numpy:
root@55ce4c59122c:~# python
Python 2.7.13 (default, Jan 19 2017, 14:48:08)
[GCC 6.3.0 20170118] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> print(numpy.__version__)
1.12.1
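If you would rather not check by hand in the interpreter, the same test can be scripted. This is only a minimal sketch: the helper name module_version and the file name check_modules.py are made up for illustration, and it assumes nothing beyond the Python standard library, so it should run on the Python 2.7 shipped in the image as well as on Python 3.

```python
import importlib
import sys


def module_version(name):
    """Return the module's __version__ string, or None if it cannot be imported.

    Hypothetical helper for illustration; it is not part of Spark or numpy.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Some modules do not define __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # Check the modules named on the command line, defaulting to numpy.
    names = sys.argv[1:] or ["numpy"]
    for name in names:
        version = module_version(name)
        print("{0}: {1}".format(name, version if version else "NOT INSTALLED"))
```

Saved as check_modules.py and copied into the image, it could be run with python check_modules.py numpy, which reports either the installed version or NOT INSTALLED.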
Answered By - atline