Issue
When I try to count the words in a UTF-8 string, I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)
This is what I do:
tr.words_count = (str(tr.transcribe).count(' '))
I need to count the words in UTF-8 text, and it seems that my method won't work. Do you have any ideas? Thanks.
Solution
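The error suggests tr.transcribe is already a unicode object, so str() tries to encode it as ASCII and fails. Encode it explicitly instead:
tr.transcribe.encode('utf-8').count(' ')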
Or better yet,
unicode(tr.transcribe).count(' ')
Or even better (so that multiple spaces in a row don't throw off the count),
len(unicode(tr.transcribe).split())
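To make the difference between counting spaces and counting words concrete, here is a minimal sketch. The helper name count_words and the sample string are made up for illustration, and it assumes that any byte string passed in holds UTF-8. It is written to run on both Python 2 and Python 3, so it avoids calling unicode() directly.

```python
# -*- coding: utf-8 -*-

def count_words(text):
    """Count whitespace-separated words in text (hypothetical helper)."""
    # Assumption: byte strings hold UTF-8, so decode them to Unicode first.
    if isinstance(text, bytes):
        text = text.decode('utf-8')
    # split() with no arguments splits on any run of whitespace and ignores
    # leading and trailing whitespace, so repeated spaces are harmless.
    return len(text.split())


# Sample text with leading, repeated, and trailing spaces.
sample = u'  привет   мир '
space_count = sample.count(u' ')   # counts every space character
word_count = count_words(sample)   # counts the actual words
```

Note that counting spaces is only a rough proxy for counting words: even for cleanly single-spaced text it gives one less than the number of words, and any extra spaces inflate it.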
Answered By - Amber