Issue
How can I wrap an open binary stream (a Python 2 file, a Python 3 io.BufferedReader, an io.BytesIO) in an io.TextIOWrapper?
I'm trying to write code that will work unchanged:
- Running on Python 2.
- Running on Python 3.
- With binary streams generated from the standard library (i.e. I can't control what type they are).
- With binary streams made to be test doubles (i.e. no file handle, can't re-open).
- Producing an io.TextIOWrapper that wraps the specified stream.
The io.TextIOWrapper is needed because its API is expected by other parts of the standard library. Other file-like types exist, but don't provide the right API.
Example
Wrapping the binary stream presented as the subprocess.Popen.stdout attribute:
import subprocess
import io
gnupg_subprocess = subprocess.Popen(
    ["gpg", "--version"], stdout=subprocess.PIPE)
gnupg_stdout = io.TextIOWrapper(gnupg_subprocess.stdout, encoding="utf-8")
In unit tests, the stream is replaced with an io.BytesIO instance to control its content without touching any subprocesses or filesystems.
gnupg_subprocess.stdout = io.BytesIO("Lorem ipsum".encode("utf-8"))
That works fine on the streams created by Python 3's standard library. The same code, though, fails on streams generated by Python 2:
[Python 2]
>>> type(gnupg_subprocess.stdout)
<type 'file'>
>>> gnupg_stdout = io.TextIOWrapper(gnupg_subprocess.stdout, encoding="utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'file' object has no attribute 'readable'
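The failure is not tied to the file type as such: io.TextIOWrapper asks the object it wraps for io-style methods such as readable(). Here is a minimal sketch of that behaviour which runs on Python 3. LegacyStream is a hypothetical stand-in for a Python 2 file (it offers only read and close), and try_wrap is an illustrative helper; neither is a standard-library name.

```python
import io


class LegacyStream(object):
    """Hypothetical stand-in for a Python 2 file: read() and close() only."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def read(self, size=-1):
        return self._inner.read(size)

    def close(self):
        self._inner.close()


def try_wrap(stream, encoding="utf-8"):
    """Return (wrapper, None) on success or (None, error) on AttributeError."""
    try:
        return io.TextIOWrapper(stream, encoding=encoding), None
    except AttributeError as error:
        return None, error


payload = "Lorem ipsum".encode("utf-8")

# An io.BytesIO test double has the full io API, so wrapping succeeds.
wrapped, error = try_wrap(io.BytesIO(payload))
text = wrapped.read()

# The stand-in lacks readable(), writable() and seekable(), so it is rejected.
legacy_wrapped, legacy_error = try_wrap(LegacyStream(payload))
```

This is why a test double built from io.BytesIO cannot reveal the problem: it already satisfies everything the wrapper asks for.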
Not a solution: Special treatment for file
An obvious response is to have a branch in the code which tests whether the stream actually is a Python 2 file object, and handle that differently from io.* objects.
That's not an option for well-tested code, because it makes a branch that unit tests – which, in order to run as fast as possible, must not create any real filesystem objects – can't exercise.
The unit tests will be providing test doubles, not real file objects. So creating a branch which won't be exercised by those test doubles is defeating the test suite.
Not a solution: io.open
Some respondents suggest re-opening the underlying file handle (e.g. with io.open):
gnupg_stdout = io.open(
    gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")
That works on both Python 3 and Python 2:
[Python 3]
>>> type(gnupg_subprocess.stdout)
<class '_io.BufferedReader'>
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")
>>> type(gnupg_stdout)
<class '_io.TextIOWrapper'>
[Python 2]
>>> type(gnupg_subprocess.stdout)
<type 'file'>
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")
>>> type(gnupg_stdout)
<type '_io.TextIOWrapper'>
But of course it relies on re-opening a real file from its file handle. So it fails in unit tests when the test double is an io.BytesIO instance:
>>> gnupg_subprocess.stdout = io.BytesIO("Lorem ipsum".encode("utf-8"))
>>> type(gnupg_subprocess.stdout)
<type '_io.BytesIO'>
>>> gnupg_stdout = io.open(gnupg_subprocess.stdout.fileno(), mode='r', encoding="utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: fileno
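The difference between the two kinds of stream can be shown without launching a subprocess. This is a small sketch, assuming a hypothetical helper named has_file_descriptor (not part of the standard library), which probes whether a stream is backed by an operating-system file descriptor. An os.pipe() read end stands in for the pipe that subprocess.Popen would provide.

```python
import io
import os


def has_file_descriptor(stream):
    """Hypothetical helper: True if stream.fileno() returns a descriptor."""
    try:
        stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False
    return True


# An in-memory test double has no descriptor for io.open() to re-open.
double_has_fd = has_file_descriptor(io.BytesIO(b"Lorem ipsum"))

# A stream opened on a real pipe does have one.
read_fd, write_fd = os.pipe()
os.close(write_fd)
with os.fdopen(read_fd, "rb") as pipe_reader:
    pipe_has_fd = has_file_descriptor(pipe_reader)
```

Any approach that starts from fileno() therefore works only for the second kind of stream.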
Not a solution: codecs.getreader
The standard library also has the codecs module, which provides wrapper features:
import codecs
gnupg_stdout = codecs.getreader("utf-8")(gnupg_subprocess.stdout)
That's good because it doesn't attempt to re-open the stream. But it fails to provide the io.TextIOWrapper API. Specifically, it doesn't inherit from io.IOBase and doesn't have the encoding attribute:
>>> type(gnupg_subprocess.stdout)
<type 'file'>
>>> gnupg_stdout = codecs.getreader("utf-8")(gnupg_subprocess.stdout)
>>> type(gnupg_stdout)
<type 'instance'>
>>> isinstance(gnupg_stdout, io.IOBase)
False
>>> gnupg_stdout.encoding
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/codecs.py", line 643, in __getattr__
    return getattr(self.stream, name)
AttributeError: '_io.BytesIO' object has no attribute 'encoding'
So codecs doesn't provide objects which substitute for io.TextIOWrapper.
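The same gap can be checked programmatically. The sketch below puts both wrappers around an io.BytesIO test double and compares them; the variable names are illustrative only.

```python
import codecs
import io

payload = "Lorem ipsum".encode("utf-8")

codecs_reader = codecs.getreader("utf-8")(io.BytesIO(payload))
text_wrapper = io.TextIOWrapper(io.BytesIO(payload), encoding="utf-8")

# Both decode the bytes to the same text...
codecs_text = codecs_reader.read()
wrapper_text = text_wrapper.read()

# ...but only io.TextIOWrapper belongs to the io class hierarchy and
# carries an encoding attribute of its own.
codecs_is_iobase = isinstance(codecs_reader, io.IOBase)
wrapper_is_iobase = isinstance(text_wrapper, io.IOBase)
codecs_has_encoding = hasattr(codecs_reader, "encoding")
wrapper_has_encoding = hasattr(text_wrapper, "encoding")
```

So the codecs reader is adequate for decoding, but code that checks for io.IOBase or reads the encoding attribute will not accept it.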
What to do?
So how can I write code that works for both Python 2 and Python 3, with both the test doubles and the real objects, which wraps an io.TextIOWrapper around the already-open byte stream?
Solution
Based on multiple suggestions in various forums, and on experimenting with the standard library to meet the criteria, my current conclusion is that this can't be done with the library and types as we currently have them.
Answered By - bignose