Issue
I have two lists of y values:
y_list1 = [45,np.nan,np.nan,np.nan, 40,50,6,2,7,np.nan, np.nan,np.nan, np.nan, np.nan]
y_list2 = [4,23,np.nan, np.nan, np.nan, np.nan, np.nan,5, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]
and both sets of values were obtained at the same set of time points:
x = np.array([0,3,4,5,6,7,8,9,10,11,12,13,14,15])
The aim: Return y_list1 and y_list2 with the np.nans replaced with values, by fitting a polynomial regression to the data that is there, and then calculating the missing points.
I am able to fit the polynomial:
import sys
import numpy as np
x = np.array([0,3,4,5,6,7,8,9,10,11,12,13,14,15])
id_list = ['1','2']
list_y = np.array([[45,np.nan,np.nan,np.nan, 40,50,6,2,7,np.nan, np.nan,np.nan, np.nan, np.nan],[4,23,np.nan, np.nan, np.nan, np.nan, np.nan,5, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]])
for each_id,y in zip(id_list,list_y):
    #treat the missing data
    idx = np.isfinite(x) & np.isfinite(y)
    #fit
    ab = np.polyfit(x[idx], y[idx], len(list_y[0]))
So then I wanted to use this fit to replace the missing values in y, so I found this, and implemented:
replace_nan = np.polyval(x,y)
print(replace_nan)
The output is:
[2.13161598e+20 nan nan nan
5.20634185e+19 7.52453405e+20 8.35884417e+09 3.27510000e+04
5.11358666e+10 nan nan nan
nan nan]
test_polyreg.py:16: RankWarning: Polyfit may be poorly conditioned
ab = np.polyfit(x[idx], y[idx], len(list_y[0])) #understand how many degrees
[7.45653990e+07 6.97736286e+16 nan nan
nan nan nan 9.91821285e+08
nan nan nan nan
nan nan]
I'm not concerned about the poor conditioning warning, because this is just test data to try to understand how it should work. But the output still has nans in it (and it didn't use the fit I'd previously generated). Could someone show me how to replace the nans in the y values with points estimated from a polynomial regression?
Solution
First, you should modify the definition of ab as:
ab = np.polyfit(x[idx], np.array(y)[idx], idx.sum())
ab holds your polynomial coefficients, so you have to pass them to np.polyval (coefficients first, then the x values) as:
replace_nan = np.polyval(ab,x)
print(replace_nan)
out:
[ 4. 23. 26.54413638 28.01419869 27.00250156
23.10135965 15.90308758 5. -10.01558845 -29.55136312
-54.01500938 -83.81421259 -119.3566581 -161.05003127]
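Note that replace_nan above holds the fitted value at every x, including the points that were actually measured. A minimal sketch of keeping the measured values and filling only the gaps could look like the following. The helper name fill_nan_with_polyfit and the max_degree cap of 2 are assumptions made for this illustration, not part of the original answer. The degree is also limited to one less than the number of valid points, since a polynomial of degree n needs n + 1 points to be uniquely determined.

```python
import numpy as np


def fill_nan_with_polyfit(x, y, max_degree=2):
    """Return a copy of y with NaNs replaced by values from a polynomial fit.

    max_degree is a hypothetical cap chosen for this sketch; the degree is
    also limited to (number of valid points - 1).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Points where both the time and the measurement are available
    valid = np.isfinite(x) & np.isfinite(y)
    n_valid = int(valid.sum())
    if n_valid == 0:
        raise ValueError("no valid points to fit")

    # Fit on the valid points only
    degree = min(max_degree, n_valid - 1)
    coeffs = np.polyfit(x[valid], y[valid], degree)

    # np.polyval takes the coefficients first, then the points to evaluate
    fitted = np.polyval(coeffs, x)

    # Keep measured values; use the fitted value only where y was missing
    return np.where(valid, y, fitted)


x = np.array([0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
list_y = np.array([
    [45, np.nan, np.nan, np.nan, 40, 50, 6, 2, 7,
     np.nan, np.nan, np.nan, np.nan, np.nan],
    [4, 23, np.nan, np.nan, np.nan, np.nan, np.nan, 5,
     np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
])

filled = np.array([fill_nan_with_polyfit(x, y) for y in list_y])
print(filled)
```

With the degree capped this way, polyfit has at least as many points as coefficients for both rows. Which degree is appropriate for real data is a modelling choice, and values extrapolated beyond the last measured time point should be treated with caution.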
Answered By - Salvatore Daniele Bianco