Issue
My question is closely related to the question "math domain error while using PCA".
I get the following error:
File "$path$\Python\Python36\lib\site-packages\sklearn\decomposition\pca.py", line 88, in _assess_dimension_
    (1. / spectrum_[j] - 1. / spectrum_[i])) + log(n_samples)
ValueError: math domain error
This refers to the following line of code:
pa += log((spectrum[i] - spectrum[j]) * (1. / spectrum_[j] - 1. / spectrum_[i])) + log(n_samples)
Looking closer, I found that the problem is caused by this part of the expression:
(spectrum[i] - spectrum[j])
which evaluates to 0 when the two values are equal. The whole product is then 0, so log(0) is evaluated, which raises this exception.
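The failure can be reproduced in isolation with a minimal sketch. The helper `pair_term` and the two example spectra below are hypothetical and are not part of scikit-learn; the helper only copies the quoted expression so that a single (i, j) term can be evaluated with and without a tie.

```python
import math


def pair_term(spectrum, spectrum_, i, j, n_samples):
    """Evaluate one (i, j) term of the quoted expression (hypothetical helper)."""
    product = (spectrum[i] - spectrum[j]) * (1. / spectrum_[j] - 1. / spectrum_[i])
    # math.log raises ValueError when its argument is 0
    return math.log(product) + math.log(n_samples)


# Made-up eigenvalue spectra: one with distinct values, one with a tie.
distinct = [4.0, 2.0, 1.0]
tied = [4.0, 2.0, 2.0]

# Distinct values: the product is positive, so the logarithm is defined.
ok_value = pair_term(distinct, distinct, 1, 2, n_samples=100)

# Tied values: spectrum[1] - spectrum[2] == 0, so the product is 0.
try:
    pair_term(tied, tied, 1, 2, n_samples=100)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok_value)
print(error_message)
```

The exact wording of the exception message depends on the Python version, so the sketch only relies on a ValueError being raised.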
Now my question: is the fact that this error can occur a sign that my data is bad, or should the implementation handle this case? If the implementation should handle it, how would you recommend doing so properly? The linked question already has an answer, but it does not sound confident and has received no feedback.
I created an issue on the scikit-learn GitHub repository that contains steps to reproduce the error.
Solution
This is due to an open issue in sklearn, and it has been confirmed as such in the scikit-learn issue tracker.
Answered By - Yannic Klem