Issue
For example:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(0.0,1.2,0.2)
y = np.arange(0.0,1.2,0.2)
labels = np.arange(0.0,1.2,0.2)
plt.plot(x, y)
plt.xticks(x, labels)
plt.show()
I had to use np.around(np.arange(0.0, 1.2, 0.2), 1)
to avoid it, but if I just run np.arange(0.0, 1.2, 0.2)
it gives: array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
, so why is it different?
Also, the y axis does not use 0.60...01
as a label, which is also weird.
This issue is due to IEEE 754 floating-point precision; there should be a clean way to round the decimal labels.
Solution
The floating-point representation of 0.6 actually matches a whole interval of real numbers: every real number from 0.59999999999999993 to 0.60000000000000003 shares the same float64 representation.
Just try it:
import struct
struct.pack('d', 0.59999999999999992)
struct.pack('d', 0.59999999999999993)
struct.pack('d', 0.59999999999999994)
struct.pack('d', 0.59999999999999995)
struct.pack('d', 0.59999999999999996)
struct.pack('d', 0.59999999999999997)
struct.pack('d', 0.59999999999999998)
struct.pack('d', 0.59999999999999999)
struct.pack('d', 0.60000000000000000)
struct.pack('d', 0.60000000000000001)
struct.pack('d', 0.60000000000000002)
struct.pack('d', 0.60000000000000003)
struct.pack('d', 0.60000000000000004)
As you can see, all but the first and last number have the same representation.
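To make that interval check reproducible without eyeballing thirteen byte strings, here is a compact version of the same experiment (the boundary literals are the ones from the packs above):

```python
import struct

# The 8-byte bit pattern of 0.6's float64...
ref = struct.pack('d', 0.6)

# ...is shared by every literal inside the interval:
assert struct.pack('d', 0.59999999999999993) == ref
assert struct.pack('d', 0.59999999999999999) == ref
assert struct.pack('d', 0.60000000000000003) == ref

# ...but not by the numbers just outside it:
assert struct.pack('d', 0.59999999999999992) != ref
assert struct.pack('d', 0.60000000000000004) != ref
print("all boundary checks passed")
```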
But that is not the only problem. The float64 object that represents any real number between 0.59999999999999993 and 0.60000000000000003 is displayed by Python as the "roundest" number of that interval, namely 0.6
. This is why, when you type 0.6
in your Python interpreter, it doesn't reply 0.59999999999999993 nor 0.59999999999999999. (That would also have been an easier way to test the interval than struct
— but I wanted to introduce struct
. It is also why, when you type 0.59999999999999994, Python replies 0.6, but when you type 0.59999999999999992, it says 0.5999999999999999.)
The problem is that 0.2 doesn't have an exact representation either.
All real numbers from 0.19999999999999998 to 0.20000000000000002 share the same representation. And that representation is exactly 0.20000000000000001110223024625156540423631668090820...
I know this because:
import struct
b = struct.pack('d', 0.2)
x = struct.unpack('q', b)[0]  # 'q' is a signed 64-bit int ('l' is platform-dependent)
exponent = (x >> 52) & (2**11 - 1)  # 1020, i.e. -3 after removing the 1023 bias
mantissa = x & (2**52 - 1)  # 2702159776422298
mantissa += 2**52  # Add the implicit leading 1 of float64
# Check: mantissa/2**52 * 2**(exponent-1023) should be ~0.2
mantissa / 2**52 * 2**(exponent - 1023)  # 0.2
# To see the digits that a float64 display can't show,
# I take advantage of Python's arbitrary-precision integers and compute
# that value times 10**50
# using exact integer operations
10**50 * mantissa // 2**(52 + 1023 - exponent)
# 20000000000000001110223024625156540423631668090820
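A shorter cross-check of that exact value, without the bit surgery: the standard decimal module, when given a float, converts its float64 bit pattern exactly.

```python
from decimal import Decimal

# Decimal(0.2) expands the float64 closest to 0.2 into its exact decimal value
print(Decimal(0.2))
# 0.200000000000000011102230246251565404236316680908203125
```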
Now, if you multiply that number by 3, you get 0.60000000000000003330669073875469621270895004272460...
which is greater than 0.60000000000000003.
In other words, 0.2*3 and 0.6 don't have the same float64 representation.
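You can check that directly in plain Python; float.hex shows the two bit patterns differ by exactly one unit in the last place:

```python
# 0.2*3 rounds to the float64 one step above 0.6's
print(0.2 * 3)            # 0.6000000000000001
print(0.2 * 3 == 0.6)     # False
print((0.2 * 3).hex())    # 0x1.3333333333334p-1
print((0.6).hex())        # 0x1.3333333333333p-1
```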
Now, when a numpy array is printed, its values are rounded a bit for display.
np.array([1.234567890123])
⇒
array([1.23456789])
This is just a display choice of numpy (which can be tweaked, by the way, with set_printoptions
): it is how the array's __repr__
method works.
You can check that
np.array([1.234567890123])[0]
⇒
1.234567890123
This is why you didn't see the numerical error when printing the range.
All the digits are there; they are just not printed by the numpy array's __repr__
.
The same goes for 0.6:
np.arange(0,1.2,0.2)
#array([0. , 0.2, 0.4, 0.6, 0.8, 1. ])
np.arange(0,1.2,0.2)[3]
#0.6000000000000001
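Putting the two observations together, a quick sketch (note that set_printoptions changes array display globally):

```python
import numpy as np

x = np.arange(0, 1.2, 0.2)
print(x)     # [0.  0.2 0.4 0.6 0.8 1. ]  -- the array repr rounds for display
print(x[3])  # 0.6000000000000001         -- the scalar shows the full error

# Raising the display precision makes the array repr show it too
np.set_printoptions(precision=17)
print(x)
```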
As for how to avoid it:
- You can do what you did: round the numbers a bit. That turns each value into 0.6 (again, not an exact 0.6, but at least a number whose float64 representation is the same as 0.6's, and which is therefore printed as "0.6").
- Do nothing, and remove your xticks specification. The behaviour you expect is already the default one.
- If, for some reason, the default behaviour is different on your machine (different resolution or something) and not all ticks are printed, use xticks to impose the tick positions, but do not set labels, and let the default formatter choose how they are printed (so choose which ticks are printed, not how):
plt.xticks(x)
- If, on the contrary, it is not which ticks are printed but how they are printed that bothers you with the default behaviour, set the formatter that you like:
import matplotlib.ticker as tk
plt.gca().xaxis.set_major_formatter(tk.FormatStrFormatter('%.2f'))
I deliberately chose 2 decimals to make the difference with the default visible.
- Of course, you can do both: choose with 1-argument xticks where to print labels, and with a formatter how to print them.
- Lastly, as already suggested while I was typing this answer, if you need 2-argument xticks to fix both the ticks and their labels (which I think should be avoided, because that redoes the formatter's job; I only do it when I need exotic labels, such as xticks(x, ['zero', '1/5', '40%', '3/5', '80%', 'full'])), then pass explicit strings as labels (what would be the point of redoing the formatter's job if you still don't choose yourself how the non-string values you passed are printed?):
plt.xticks(x, [f'{t:.2f}' for t in x])
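A minimal end-to-end sketch of the tick-position + formatter approach (the Agg backend is just an assumption so the script runs headless; any backend works):

```python
import matplotlib
matplotlib.use('Agg')  # assumed: non-interactive backend for a headless demo
import matplotlib.pyplot as plt
import matplotlib.ticker as tk
import numpy as np

x = np.arange(0.0, 1.2, 0.2)
fig, ax = plt.subplots()
ax.plot(x, x)
ax.set_xticks(x)                                             # choose WHICH ticks...
ax.xaxis.set_major_formatter(tk.FormatStrFormatter('%.2f'))  # ...and HOW they print

# The formatter alone turns the noisy 0.6000000000000001 tick into a clean label:
print(tk.FormatStrFormatter('%.2f')(0.6000000000000001))  # 0.60
```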
Answered By - chrslg