Issue
I have an array of floats that I have normalised to one (i.e. the largest number in the array is 1), and I wanted to use it as colour indices for a graph. Matplotlib specifies grayscale colours as strings holding a number between 0 and 1, so I wanted to convert the array of floats to an array of strings. I attempted this with "astype('str')", but this appears to produce some values that are not the same as (or even close to) the originals.
I noticed this because matplotlib complains about finding the number 8 in the array, which is odd as it was normalised to one!
In short, I have an array phis, of float64, such that:
numpy.where(phis.astype('str').astype('float64') != phis)
is non-empty. This is puzzling, as (hopefully naively) it appears to be a bug in numpy. Is there anything that I could have done wrong to cause this?
Edit: after investigation, this appears to be due to the way the string function handles high-precision floats. The same thing happens with a vectorized toString function (as in robbles' answer). However, if the lambda function is:
lambda x: "%.2f" % x
Then the graphing works - curiouser and curiouser. (Obviously the arrays are no longer equal however!)
Solution
You seem a bit confused as to how numpy arrays work behind the scenes. Each item in an array must be the same size.
The string representation of a float doesn't work this way. For example, repr(1.3) yields '1.3', but repr(1.33) yields '1.3300000000000001' (on older Python versions; newer ones give '1.33', but the length still depends on the value).
An accurate string representation of a floating point number produces a variable-length string.
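As a rough illustration of the point above, here is a minimal pure-Python sketch. The sample values are made up for the example, and the exact text produced by repr depends on the Python version, so only the lengths are compared:

```python
# Hypothetical sample values, chosen only to show different text lengths.
values = [1.0, 1.3, 1.0 / 3.0, 2.5e-07]

# repr() gives a string that converts back to the same float,
# but the number of characters it needs varies from value to value.
texts = [repr(v) for v in values]
lengths = [len(t) for t in texts]

for v, t, n in zip(values, texts, lengths):
    print("%r -> %s (%d characters)" % (v, t, n))

# Check that each string converts back to the float it came from.
round_trips = [float(t) == v for v, t in zip(values, texts)]
print(round_trips)
```

Since the lengths differ, no single short fixed width can hold every value without cutting some of them off.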
Because numpy arrays consist of elements that are all the same size, numpy requires you to specify the length of the strings within the array when you're using string arrays.
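If you use x.astype('str'), it will always convert things to an array of strings of length 1 (this is the behaviour of older numpy releases; newer ones choose a width large enough for the values). For example, using x = np.array(1.344566), x.astype('str') yields '1'!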
You need to be more explicit and use the '|Sx' dtype syntax, where x is the length of the string for each element of the array. For example, use x.astype('|S10') to convert the array to strings of length 10.
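To make the effect of the string width concrete, here is a small sketch. The array phis below is invented sample data, not the asker's array, and the widths 1 and 32 are arbitrary choices for the illustration:

```python
import numpy as np

# Hypothetical sample data: floats normalised so that the largest is 1.
phis = np.array([8.1e-05, 0.25, 1.0 / 3.0, 1.0])

# Width 1 keeps only the first character of each value's text.
# The text of 8.1e-05 starts with '8', so that is all that survives.
narrow = phis.astype('S1')

# Width 32 leaves room for the full text of each float64 here.
wide = phis.astype('S32')

print(narrow)
print(wide)

# Converting the wide strings back should give the original values
# (compared with a tolerance, since formatting precision can differ
# between numpy versions).
recovered = wide.astype('float64')
print(np.allclose(recovered, phis))
```

A small value written in scientific notation being cut down to its leading digit is one plausible way for an "8" to show up in an array that was normalised to one.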
Even better, just avoid using numpy arrays of strings altogether. It's usually a bad idea, and there's no reason I can see from your description of your problem to use them in the first place...
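One way to follow this advice, sketched under the assumption that the goal is a set of grayscale strings for matplotlib, is to build an ordinary Python list of strings with the "%.2f" format from the question. The values in phis are again made up for the example:

```python
# Hypothetical normalised values; a plain list or a numpy array both work here.
phis = [0.0, 8.1e-05, 0.5, 1.0]

# Fixed-point formatting never switches to scientific notation,
# so every string is a plain decimal number between 0 and 1.
colours = ["%.2f" % p for p in phis]

print(colours)  # ['0.00', '0.00', '0.50', '1.00']
```

Python strings in a list can each have their own length, so nothing is truncated, and the result can be handed to a plotting call wherever a sequence of colour strings is accepted.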
Answered By - Joe Kington