Issue
I would like the same behaviour described in this question: How to display full output in Jupyter, not only last result? but for the PySpark kernel of JupyterHub on AWS EMR (Spark 2.4.4). It works with the python3 (python3.6) kernel.
It works if I use print statements, but then, if the last step fails, only the output of the failed step is shown, as in the image below.
Also, not sure if it is related, but the code below does not run in sequence (print, wait, print, wait, ...); instead, it prints everything at once at the end.
import time
for i in range(0, 10):
    print(i)
    time.sleep(2)
Quoting the question from the referenced post, in case that question/post gets deleted or changed:
I want Jupyter to print all the interactive output, not only the last result, without resorting to print. How can I do it?
Example :
a=3
a
a+1
I would like to display
3
4
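For context, the approach from the referenced post that makes this work in the python3 kernel is sketched below; whether the same setting takes effect in the PySpark kernel is exactly what is being asked here.

from IPython.core.interactiveshell import InteractiveShell
# Show the value of every expression in a cell, not just the last one
InteractiveShell.ast_node_interactivity = "all"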
Solution
The output of print statements goes to stdout or stderr on the machine that is running the Spark executor.
Consider a large cluster with n workers, each storing a partition of an RDD or DataFrame. It is hard to expect ordered output from a job (for instance a map): where would that data be printed? Since the nodes run code in parallel, which of them should print first? This can also be seen as a design choice of Spark itself.
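As a minimal sketch (assuming the sc SparkContext that the EMR PySpark kernel creates for you), a print inside an action runs on the executors, so its output lands in the executors' logs rather than in the notebook cell:

# Sketch only, assuming the `sc` SparkContext provided by the PySpark kernel
rdd = sc.parallelize(range(4), numSlices=2)

# Runs on the executors: output goes to their stdout logs, order not guaranteed
rdd.foreach(lambda x: print(x))

# To see the values in the notebook, bring them back to the driver first
print(rdd.collect())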
So we don't have interactive print statements inside jobs. This whole situation is also a reminder of why accumulators and broadcast variables exist.
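For instance, an accumulator is the supported way to get a small amount of information from the executors back to the driver, where it can actually be printed (a sketch, again assuming the kernel's sc):

# Sketch: count even numbers on the executors, read the result on the driver
counter = sc.accumulator(0)

def count_evens(x):
    # Executed on the executors; updates are merged back on the driver
    if x % 2 == 0:
        counter.add(1)

sc.parallelize(range(10)).foreach(count_evens)
print(counter.value)  # runs on the driver, so the value (5) is visible here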
So I would advise you to work with the logs generated by the steps instead. To view logs in Amazon S3, cluster logging must be enabled (which is the default for new clusters); see View Log Files Archived to Amazon S3.
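As an illustration (the bucket name, prefix, and cluster id below are placeholders, and the exact key layout depends on your configured log URI), the archived step logs can be listed with boto3:

import boto3

# Placeholder bucket/prefix and cluster id -- use your cluster's S3 log URI
s3 = boto3.client("s3")
resp = s3.list_objects_v2(
    Bucket="my-emr-logs",
    Prefix="logs/j-1ABC2DEF3GHIJ/steps/",
)
for obj in resp.get("Contents", []):
    print(obj["Key"])  # e.g. .../steps/s-XXXX/stdout.gz, stderr.gz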
For your second question about sleep() and print: Python's stdout is line-buffered when attached to a console, so it waits for a newline before writing. If the output is not a console, it is block-buffered, and even a newline won't trigger a flush.
You can force a flush like this:
import time
for i in range(0, 10):
    print(i, flush=True)
    time.sleep(2)
Answered By - A.B