Issue
import scipy.stats

# n1 and n2 are the two pandas Series printed below
print(n1)
print(n2)
print(type(n1), type(n2))
print(scipy.stats.spearmanr(n1, n2))
print(n1.corr(n2, method="spearman"))
0 2317.0
1 2293.0
2 1190.0
3 972.0
4 1391.0
Name: r6000, dtype: float64
0.0 2317.0
1.0 2293.0
3.0 1190.0
4.0 972.0
5.0 1391.0
Name: 6000, dtype: float64
<class 'pandas.core.series.Series'> <class 'pandas.core.series.Series'>
SpearmanrResult(correlation=0.9999999999999999, pvalue=1.4042654220543672e-24)
0.7999999999999999
The problem is that scipy reports a different Spearman correlation (about 1.0) than pandas (about 0.8) for the same two Series.
Edit to add:
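The issue is that the indexes are off: n1 is labeled 0 through 4, while n2 is labeled 0, 1, 3, 4, 5. Pandas automatically aligns the two Series on their index labels before correlating, but scipy doesn't. I've answered it below.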
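To see which pairs pandas actually correlates, the sketch below rebuilds the two Series from the printed output and aligns them explicitly. It assumes that an inner join on the index mirrors what Series.corr does before computing the coefficient; the variable names left, right, pandas_rho, aligned_rho and positional_rho are illustrative only.

```python
import pandas as pd
from scipy import stats

# Values and index labels copied from the printed output in the question.
values = [2317.0, 2293.0, 1190.0, 972.0, 1391.0]
n1 = pd.Series(values, index=[0, 1, 2, 3, 4], name="r6000")
n2 = pd.Series(values, index=[0.0, 1.0, 3.0, 4.0, 5.0], name="6000")

# Keep only the labels present in both Series (0, 1, 3, 4) and pair by label.
left, right = n1.align(n2, join="inner")
print(pd.DataFrame({"n1": left, "n2": right}))

# pandas correlates the label-aligned pairs: approximately 0.8.
pandas_rho = n1.corr(n2, method="spearman")

# scipy on the explicitly aligned pairs gives the same value as pandas.
aligned_rho = stats.spearmanr(left, right)[0]

# scipy on the raw Series pairs values by position: approximately 1.0.
positional_rho = stats.spearmanr(n1, n2)[0]

print(pandas_rho, aligned_rho, positional_rho)
```

Only four of the five rows survive the alignment, and the values at labels 3 and 4 no longer line up with their positional partners, which is why the pandas result drops to 0.8.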
Solution
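I made a copy of each series and called reset_index() on it before correlating them. That fixed it. (Note that drop=True is needed for the result to remain a Series; without it, reset_index() returns a DataFrame with the old index as a column.)
The cause is automatic data alignment in pandas: Series.corr matches values by index label and uses only the labels present in both Series. Here that leaves four pairs (labels 0, 1, 3, 4), and two of them pair values from different positions.
The scipy library doesn't do automatic data alignment; it likely just converts each Series to a numpy array and pairs the values by position.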
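A minimal sketch of the fix, using the data printed in the question. The names a, b, rho_reset and rho_numpy are illustrative, and the to_numpy() variant is an alternative that was not part of the original answer.

```python
import pandas as pd
from scipy import stats

# Values and index labels copied from the printed output in the question.
values = [2317.0, 2293.0, 1190.0, 972.0, 1391.0]
n1 = pd.Series(values, index=[0, 1, 2, 3, 4], name="r6000")
n2 = pd.Series(values, index=[0.0, 1.0, 3.0, 4.0, 5.0], name="6000")

# drop=True discards the old labels and gives both Series a fresh 0..4 index,
# so label alignment now coincides with positional pairing.
a = n1.reset_index(drop=True)
b = n2.reset_index(drop=True)
rho_reset = a.corr(b, method="spearman")

# Alternative: skip the index entirely by handing scipy plain NumPy arrays.
rho_numpy = stats.spearmanr(n1.to_numpy(), n2.to_numpy())[0]

print(rho_reset, rho_numpy)  # both approximately 1.0
```

Resetting the index is only appropriate when the rows really do correspond by position; if the labels carry meaning, the aligned pandas result is the one to trust.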
Answered By - Blaze