Issue
I just came across one of these Kernels and couldn't understand what numpy.log1p()
does in the third pipeline of this code (House Prediction dataset in Kaggle).
The NumPy documentation says:
Returns:
- An array with natural logarithmic value of x + 1
- where x belongs to all elements of input array.
What is the purpose of taking the log of one plus the value when comparing the skewness of the original and transformed arrays of the same features? What does it actually do?
Solution
For real-valued input, log1p is accurate also for x so small that 1 + x == 1 in floating-point accuracy.
So, for example, let's add a tiny non-zero number to 1.0. Rounding makes the sum exactly 1.0:
>>> 1e-100 == 0.0
False
>>> 1e-100 + 1.0 == 1.0
True
If we try to take the log of that incorrect sum, we get an incorrect result (compare to WolframAlpha):
>>> import numpy as np
>>> np.log(1e-100 + 1)
0.0
But if we use log1p(), we get the correct result:
>>> np.log1p(1e-100)
1e-100
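To see where the two approaches start to diverge, here is a minimal sketch that evaluates both on progressively smaller inputs. The helper name compare_logs is made up for illustration. For small x, log(1 + x) is approximately x - x**2/2, so the true value is essentially x itself:

```python
import numpy as np

def compare_logs(xs):
    """Return (naive, stable): np.log(1 + x) versus np.log1p(x).

    Hypothetical helper, for illustration only.
    """
    xs = np.asarray(xs, dtype=np.float64)
    naive = np.log(1.0 + xs)   # 1.0 + x is rounded before the log is taken
    stable = np.log1p(xs)      # works on x directly, no intermediate sum
    return naive, stable

xs = [1e-5, 1e-10, 1e-17, 1e-100]
naive, stable = compare_logs(xs)
for x, n, s in zip(xs, naive, stable):
    print(f"x={x:g}  log(1+x)={n!r}  log1p(x)={s!r}")
```

Once x drops below roughly 1e-16 (about half the float64 machine epsilon), 1.0 + x rounds to exactly 1.0 and the naive version returns 0.0, while log1p still returns a value close to x.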
The same principle holds for expm1() and logaddexp(): they're more accurate for small x.
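As a sketch of the expm1() case, assuming float64 inputs: np.exp(x) rounds to exactly 1.0 for a tiny x, so subtracting 1 afterwards loses everything, whereas np.expm1(x) keeps the result. Since expm1 is the inverse of log1p, it can also be used to map a log1p-transformed value back:

```python
import numpy as np

x = 1e-100

# Naive: exp(x) is rounded to exactly 1.0, so the difference collapses to 0.0.
naive = np.exp(x) - 1.0

# expm1 computes exp(x) - 1 without forming the intermediate value near 1.0.
stable = np.expm1(x)

# expm1 undoes log1p, even for inputs far below machine epsilon.
roundtrip = np.expm1(np.log1p(x))

print(naive, stable, roundtrip)
```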
Answered By - Nils Werner