Issue
I am puzzled that defining __str__ for a class seems to have no effect on using the str function on a class instance. For example, I read in the Django documentation that:
The str built-in calls __str__() to determine the human-readable representation of an object.
But that doesn't appear to be true. Here's an example from a module where text
is always assumed to be unicode:
import six

class Test(object):
    def __init__(self, text):
        self._text = text
    def __str__(self):
        if six.PY3:
            return str(self._text)
        else:
            return unicode(self._text)
    def __unicode__(self):
        if six.PY3:
            return str(self._text)
        else:
            return unicode(self._text)
In Python 2, it gives the following behavior:
>>> a=Test(u'café')
>>> print a.__str__()
café
>>> print a # same error with str(a)
---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-63-202e444820fd> in <module>()
----> 1 str(a)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
Is there a way to overload the str function?
Solution
For Python 2, you are returning the wrong type from the __str__ method. You are returning unicode, while you must return str:
    def __str__(self):
        if six.PY3:
            return str(self._text)
        else:
            return self._text.encode('utf8')
Because self._text is not already of type str, you'll need to encode it. Because you returned unicode instead, Python 2 is forced to encode the result to str itself, and the default ASCII codec it uses for that can't handle the non-ASCII é character.
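To make that implicit step concrete, here is a minimal sketch that performs the two encodings by hand. The helper name try_encode is made up for this illustration; the rest is plain str/unicode behavior. Encoding with 'ascii' is what Python 2 attempts behind the scenes when __str__ returns unicode, and encoding with 'utf8' is what the corrected method does explicitly.

```python
# The Unicode text from the question ("cafe" with an acute accent on the e),
# written with an escape so the snippet runs unchanged on Python 2 and 3.
text = u'caf\xe9'

def try_encode(value, codec):
    """Hypothetical helper: return the encoded bytes, or the name of the
    exception raised if the codec cannot represent the text."""
    try:
        return value.encode(codec)
    except UnicodeEncodeError as exc:
        return type(exc).__name__

# Roughly what Python 2 tries implicitly when __str__ returns unicode.
ascii_result = try_encode(text, 'ascii')

# What the corrected __str__ does explicitly.
utf8_result = try_encode(text, 'utf8')

print(ascii_result)
print(repr(utf8_result))
```

The first result is the same UnicodeEncodeError seen in the traceback; the second is the UTF-8 byte string that str(a) returns once __str__ encodes explicitly.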
Printing the object results in the right output only because my terminal is configured to handle UTF-8:
>>> a = Test(u'café')
>>> str(a)
'caf\xc3\xa9'
>>> print a
café
>>> unicode(a)
u'caf\xe9'
Note that there is no __unicode__ method in Python 3; your if six.PY3 in that method is entirely redundant. The following would work too:
class Test(object):
    def __init__(self, text):
        self._text = text
    def __str__(self):
        if six.PY3:
            return self._text
        else:
            return self._text.encode('utf8')
    def __unicode__(self):
        return self._text
However, if you are using the six library, you'd be far better off using the @six.python_2_unicode_compatible decorator and defining only a Python 3 version of the __str__ method:
@six.python_2_unicode_compatible
class Test(object):
    def __init__(self, text):
        self._text = text
    def __str__(self):
        return self._text
where it is assumed text is always Unicode. If you are working with Django, then you can get the same decorator from the django.utils.encoding module.
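For readers curious what such a decorator does, here is a rough sketch of the idea written without six. It is not the library's actual source, and the names apply_py2_shim and unicode_compatible_sketch are invented for this example. On Python 2 the text-returning __str__ is moved to __unicode__, and a new __str__ is installed that returns the UTF-8 encoding; on Python 3 the class is left untouched.

```python
import sys

def apply_py2_shim(klass):
    """Hypothetical helper: make a class whose __str__ returns text
    behave correctly under Python 2's str() and unicode()."""
    if '__str__' not in klass.__dict__:
        raise ValueError("%s must define __str__" % klass.__name__)
    # The text-returning method becomes __unicode__ ...
    klass.__unicode__ = klass.__dict__['__str__']
    # ... and __str__ now returns UTF-8 encoded bytes instead.
    klass.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return klass

def unicode_compatible_sketch(klass):
    """Hypothetical decorator: apply the shim on Python 2 only."""
    if sys.version_info[0] == 2:
        return apply_py2_shim(klass)
    return klass

@unicode_compatible_sketch
class Test(object):
    def __init__(self, text):
        self._text = text
    def __str__(self):
        return self._text

a = Test(u'caf\xe9')
print(repr(str(a)))
```

With this arrangement the class body only ever deals with text, and the encoding decision lives in one place.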
Answered By - Martijn Pieters