Issue
I'd like to capture the output of a long-running but verbose script (pytest at the moment) and show its progress on a web page. I'm facing two problems here.
First: I need to run the script asynchronously and capture its output while it is still running. I found that pytest has the --capture=tee-sys flag, which I suppose is meant for this purpose, but I couldn't find out how it works.
A more general approach would be to use subprocess.Popen, which seems to handle the asynchronous part already and lets me pipe the output. However, if I capture with Popen.communicate(), I either block the thread until the process has finished and can't update live, or I have to keep catching the TimeoutExpired exception and re-initiate communication, which feels like a workaround but would probably work. Is there an easier way to achieve this?
Second: I need to update the frontend without receiving a request. At first I even thought a terminal-style output would be cool and found tools like ttyd, but that seems like overkill and, if I'm not mistaken, would allow a user to type into the terminal, which I don't want. I also found this answer, which suggests using an iframe, and the stream_template_context() in the Flask documentation. However, if I'm not mistaken, in both cases I would need to build a valid HTML page on the fly from the data collected from the subprocess, and that feels error-prone. I also found flask-socketio, which I'd like to avoid because the Flask server is supposed to run on a custom Linux and I'd need to add this module first. That should be doable in the end, but if there is an easier way I'd prefer that. Polling from JavaScript in the frontend is an option I considered, but that looks like a workaround again. What would be best practice here?
For both parts I'd be happy about more resources to read on best practices etc.
After a bit more research I've got a minimal working example like this:
from flask import Flask, Response, render_template
from subprocess import Popen, PIPE

app = Flask(__name__)

@app.route('/content')
def content():
    # start subprocess
    def inner():
        # proc = Popen(['/usr/local/bin/pytest', 'test1.py'], stdout=PIPE)
        with Popen(['/usr/local/bin/pytest', 'test1.py'], stdout=PIPE) as proc:
            while proc.poll() is None:
                line = proc.stdout.readline()
                # yield line.decode() + '<br/>\n'
                yield str(line) + '<br/>\n'
    return Response(inner(), mimetype='text/html')

@app.route('/')
def index():
    return render_template('test.html')

if __name__ == '__main__':
    app.run(debug=True)
Where test.py just defines a series of test functions that each sleep a second and assert True, just to get the desired sample output, and a test.html:
<!doctype html>
<head>
    <title>Title</title>
</head>
<body>
    <div>
        <iframe frameborder="1"
                width="52%"
                height="500px"
                style='background: transparent;'
                src="{{ url_for('content')}}">
        </iframe>
    </div>
</body>
However, the subprocess documentation states that using proc.stdout.readline() like this is discouraged, as it can deadlock the child process.
Moreover, I sometimes get a few, sometimes a lot of, empty lines after the child process terminates. I use str(line) to make them visible, as the b'' from the bytes string shows up.
Any ideas on where to go from here?
Solution
I have found two solutions to this, one of which I really like, so I'm going to share them.
But first: I think I misunderstood the warning about not using Popen.stdout.read() in the subprocess documentation. Since I only use readline, I should be fine, as the call should return in any case. (Please correct me if I'm wrong here, as the whole answer is based on the assumption that readline will not deadlock where read would.)
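On the deadlock worry: as far as I understand the subprocess documentation, the risk comes from a pipe that nobody drains filling up, for example stderr=PIPE while only stdout is being read. Here is a minimal sketch of one way around that, assuming it is acceptable to interleave stderr with stdout. The name stream_lines is a hypothetical helper of mine, not something from Flask or subprocess.

```python
import subprocess
import sys
from collections.abc import Iterator


def stream_lines(cmd: list[str]) -> Iterator[str]:
    """Yield the child's output line by line until it closes its stdout.

    Hypothetical helper: stderr is merged into stdout so that only one
    pipe has to be drained.
    """
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the stdout pipe
        text=True,                 # decode bytes to str
        bufsize=1,                 # line-buffered on the parent side
    ) as proc:
        assert proc.stdout is not None
        # Iterating the pipe blocks until a full line is available
        # and stops once the child has closed its end (EOF).
        for line in proc.stdout:
            yield line
```

Calling it with something like [sys.executable, "-u", "-c", "print('hi')"] should yield each line as soon as the child flushes it. The child may still buffer its own output when it is not attached to a terminal, which is a separate issue from the pipe deadlock.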
First solution: modifying this answer.
The <iframe> isn't actually needed; Flask can stream data into a <div>. The HTML looks like this:
<p>Test Output</p>
<div id="output">
    {% if data %}
        {% for item in data %}
            {{ item }}<br />
        {% endfor %}
    {% endif %}
</div>
The {% if data %} is optional. I used it to be able to load the page with an empty <div> and, on a button press, reload the page and provide data to fill the <div>.
The Python part looks like this:
from flask import Response, stream_template
from subprocess import Popen, PIPE

# `app` (the Flask instance) and `log` (a logger) are defined elsewhere

@app.route('/')
def stream() -> Response:
    """stream data to template"""
    def update(task: str | list[str]):
        """update the stream data"""
        with Popen(task, stdout=PIPE, stderr=PIPE) as proc:
            # forward output while the process is still running
            while proc.poll() is None:
                line = proc.stdout.readline()
                log.info(line.decode())
                yield line.decode()
            # drain whatever is left in the pipe after the process exited
            while (line := proc.stdout.readline()) != b'':
                log.info(line.decode())
                yield line.decode()
            err = proc.stderr.read()  # risk of deadlock
            # could be replaced by while readline as above
            if err != b'':
                log.warning("additional stderr produced while running tests: %s", err.decode())
    return Response(
        stream_template('index.html.jinja',
                        data=update(['/usr/local/bin/pytest', 'dummytest.py'])))
I had to add the second while loop after proc.poll() returned because I kept missing the last lines of data. But I would rather not use the second while loop alone, because that would risk breaking if a line was empty (which shouldn't happen, because the newline character should always be there).
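On the empty-line concern, a small check that doesn't need a subprocess at all: readline() on a binary stream returns b'\n' for a blank line and b'' only at end of file. The sketch below uses io.BytesIO as a stand-in for proc.stdout, which is an assumption for illustration; a real pipe additionally blocks until data arrives.

```python
import io

# Stand-in for proc.stdout: three lines, the middle one blank.
stream = io.BytesIO(b"first\n\nlast\n")

# iter(callable, sentinel) keeps calling readline() until it returns b"".
lines = list(iter(stream.readline, b""))

print(lines)  # [b'first\n', b'\n', b'last\n']
```

So a loop that stops on b'' should not be cut short by a blank line in the output, and it also picks up the trailing lines after the process has exited.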
The second solution was basically this answer, utilizing EventSource. I was hesitant to try this as I'm not that fluent in JavaScript, but in the end it worked like a charm. I added an event handler on a button that prevents the form's POST to the server and manually opens the EventSource instead. The HTML looks like this:
<div id="output"></div>
<p><form action="/action" method="post" role="form" id="testsForm">
    <button class="btn btn-primary" type="submit">Run Test</button>
</form></p>
<script type="text/javascript">
    (() => {
        'use strict'
        const form = document.getElementById("testsForm")
        form.addEventListener('submit', event => {
            event.preventDefault()
            var target = document.getElementById("output")
            var update = new EventSource("/action/stream")
            update.onmessage = function(e) {
                if (e.data == "close") {
                    update.close();
                } else {
                    target.innerHTML += (e.data + '<br/>');
                }
            };
        }, false)
    })()
</script>
The action and method in the form might be completely optional, because the form is never submitted to the backend. (This uses Bootstrap classes, if someone wants to recreate it.)
The Python for this looks like this:
from flask import Response
from subprocess import Popen, PIPE

# `app` (the Flask instance) and `log` (a logger) are defined elsewhere

@app.route("/action/stream")
def stream() -> Response:
    """open event stream to client"""
    def update(task: str | list[str]):
        """update the event stream data"""
        with Popen(task, stdout=PIPE, stderr=PIPE) as proc:
            log.info("opened subprocess for async task communication")
            while proc.poll() is None:
                line = proc.stdout.readline()
                log.info(line.decode().rstrip())
                # the '\n\n' is needed for termination in the frontend
                # the stream handling (especially closing) will break without it
                yield 'data: ' + line.decode() + "\n\n"
            # drain whatever is left in the pipe after the process exited
            while (line := proc.stdout.readline()) != b'':
                log.info(line.decode())
                yield 'data: ' + line.decode() + "\n\n"
            err = proc.stderr.read()  # risk of deadlock
            if err != b'':
                log.warning("additional stderr produced while running script: %s", err.decode())
            # tell the frontend to close the EventSource
            yield "data: close\n\n"
    return Response(update(['/usr/local/bin/pytest', 'dummytest.py']), mimetype="text/event-stream")
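The 'data: ... \n\n' framing can be pulled into a small helper. This is only a sketch under my own assumptions: sse_message and its escape parameter are hypothetical names, not part of Flask. It strips the line's own trailing newline, puts each line of the payload into its own data: field as the event stream format expects, and can optionally HTML-escape the text, which might be worth considering because the frontend appends e.data via innerHTML.

```python
import html


def sse_message(text: str, escape: bool = False) -> str:
    """Format text as one Server-Sent Events message (hypothetical helper)."""
    payload = text.rstrip("\r\n")  # drop the line's own trailing newline
    if escape:
        # keep characters like < and & from being interpreted as markup
        payload = html.escape(payload)
    # Each payload line gets its own "data:" field; an empty payload
    # still produces one (empty) data field.
    parts = payload.splitlines() or [""]
    body = "".join(f"data: {part}\n" for part in parts)
    return body + "\n"  # the blank line terminates the message


print(repr(sse_message("collected 3 items\n")))  # 'data: collected 3 items\n\n'
```

In the generator above this would be used as yield sse_message(line.decode()), with yield sse_message("close") at the end.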
The big plus of this solution is that the page does not need to reload to stream the data. I have an empty box, and when the button is pressed, the output of the script appears on the screen. (I even found some styling to make it look like an old-school monitor.)
In terms of longer communication: I tested both versions with a 10-minute timeout in one test in the pytest script, and both worked without problems.
There might even be another solution to this, but I never tried it as I don't quite understand it. I might in the future, if the second solution is not enough for some reason.
Answered By - JuA