Issue
In Python 2, this code:
SUB = string.maketrans("0123456789","₀₁₂₃₄₅₆₇₈₉")
produces the error:
ValueError: maketrans arguments must have same length
I am unsure why this occurs, because the two strings appear to be the same length. My only guess is that the subscript characters somehow have a different length than standard characters, but I don't know how to get around this.
Solution
No, the arguments are not the same length:
>>> len("0123456789")
10
>>> len("₀₁₂₃₄₅₆₇₈₉")
30
You are passing in encoded byte strings; I used UTF-8 here, where each subscript digit is encoded as 3 bytes.
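The difference between the two lengths can be checked directly. This is a small sketch, assuming the source file is saved as UTF-8; the variable names are illustrative only:

```python
# -*- coding: utf-8 -*-
# Compare the length in code points with the length in UTF-8 bytes.
subscripts = u"₀₁₂₃₄₅₆₇₈₉"            # unicode object: 10 code points
encoded = subscripts.encode("utf-8")   # byte string: what Python 2's "..." literal holds

# Number of UTF-8 bytes used by each subscript digit.
bytes_per_digit = [len(c.encode("utf-8")) for c in subscripts]

print(len(subscripts))   # 10
print(len(encoded))      # 30
print(bytes_per_digit)   # [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
```

string.maketrans() compares byte counts, so it sees 10 against 30.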
You cannot use str.translate() to map ASCII bytes to UTF-8 byte sequences. Decode your string to unicode and use the slightly different unicode.translate() method; it takes a dictionary instead:
nummap = {ord(c): ord(t) for c, t in zip(u"0123456789", u"₀₁₂₃₄₅₆₇₈₉")}
This creates a dictionary mapping Unicode codepoints (integers), which you can then use on a Unicode string:
>>> nummap = {ord(c): ord(t) for c, t in zip(u"0123456789", u"₀₁₂₃₄₅₆₇₈₉")}
>>> u'99 bottles of beer on the wall'.translate(nummap)
u'\u2089\u2089 bottles of beer on the wall'
>>> print u'99 bottles of beer on the wall'.translate(nummap)
₉₉ bottles of beer on the wall
You can then encode the output to UTF-8 again if you so wish.
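If your data starts and ends as UTF-8 byte strings, the decode, translate, encode round trip might look like the sketch below. The helper name subscript_digits and its encoding parameter are hypothetical, not part of any library:

```python
# -*- coding: utf-8 -*-
# Translation table: code point of each ASCII digit -> code point of its subscript.
NUMMAP = {ord(c): ord(t) for c, t in zip(u"0123456789", u"₀₁₂₃₄₅₆₇₈₉")}

def subscript_digits(data, encoding="utf-8"):
    """Hypothetical helper: byte string in, byte string out."""
    text = data.decode(encoding)            # bytes -> unicode
    translated = text.translate(NUMMAP)     # map digits to subscripts
    return translated.encode(encoding)      # unicode -> bytes

result = subscript_digits(b"H2O")
```

Here result holds the UTF-8 bytes for H₂O; the translation itself always happens on the decoded unicode value.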
From the method documentation:
For Unicode objects, the translate() method does not accept the optional deletechars argument. Instead, it returns a copy of the s where all characters have been mapped through the given translation table which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.
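To illustrate the three kinds of values the documentation describes, here is a small sketch with a made-up table that maps one character to an ordinal, one to a Unicode string, and one to None:

```python
# -*- coding: utf-8 -*-
# Illustrative table only; the particular mappings are arbitrary.
table = {
    ord(u"1"): ord(u"₁"),   # ordinal -> ordinal: replaced by one character
    ord(u"2"): u"two",      # ordinal -> Unicode string: replaced by several characters
    ord(u"a"): None,        # ordinal -> None: character is deleted
}

translated = u"a1b2".translate(table)
# "a" is deleted, "1" becomes "₁", "b" is unmapped and left as is, "2" becomes "two".
```

The value of translated is u'\u2081btwo', which prints as ₁btwo.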
Answered By - Martijn Pieters