Issue
Can I use Hadoop and MapReduce in Jupyter/IPython? Is there something similar to what PySpark is for Spark?
Solution
Of course you can. Several frameworks make this possible: Hadoop Streaming, mrjob and dumbo, to name a few. Technically, using them from Jupyter comes down to either subprocess.Popen()
calls or ordinary Python imports, depending on the framework.
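As a rough illustration of the subprocess.Popen() route, here is a minimal sketch of launching a Hadoop Streaming job from a notebook cell. The jar location, the HDFS paths and the mapper.py/reducer.py script names are all hypothetical placeholders, not taken from the question; adjust them to your installation. It also assumes the hadoop command is on the PATH and that both scripts are executable.

```python
import subprocess
import sys


def build_streaming_command(streaming_jar, input_path, output_path, mapper, reducer):
    """Build the argument list for a Hadoop Streaming job."""
    return [
        "hadoop", "jar", streaming_jar,
        # Generic options such as -files go before the streaming options.
        "-files", f"{mapper},{reducer}",
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ]


def run_command(cmd):
    """Run a command with subprocess.Popen; return (exit code, stdout, stderr)."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


# All paths and file names below are hypothetical placeholders.
cmd = build_streaming_command(
    streaming_jar="/path/to/hadoop-streaming.jar",
    input_path="/user/me/input",
    output_path="/user/me/output",
    mapper="mapper.py",
    reducer="reducer.py",
)

# Sanity check that works without Hadoop: run the current Python interpreter.
code, out, err = run_command([sys.executable, "-c", "print('ok')"])

# With Hadoop available, the job itself would be started the same way:
# code, out, err = run_command(cmd)
```

Passing the command as a list avoids shell quoting problems, and capturing stdout/stderr lets you inspect the job log directly in the notebook.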
A nice overview and critique of some of these frameworks can be found in this Cloudera blog post.
Answered By - Dimitris Fasarakis Hilliard