Issue
I am getting a flat regression even with a 10th-degree regressor. But if I change the date values to numeric, the regression works! Does anybody know why?
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LinearRegression
## RESHAPE DATA ##
X = transformed_data.ds.values.reshape(-1, 1)
y = transformed_data.y
# X = data.fecha.dt.day.values.reshape(-1, 1)
## PLOT ##
fig, ax = plt.subplots(figsize=(15,8))
ax.plot(X, y, 'o', label="data")
for i in range(1, 10):
    # Fit a polynomial regression of degree i
    polyreg = make_pipeline(PolynomialFeatures(i), LinearRegression())
    polyreg.fit(X, y)
    # Predict once and reuse the result for the metrics and the plot
    y_pred = polyreg.predict(X)
    mse = round(np.mean((y - y_pred)**2))
    mae = round(np.mean(abs(y - y_pred)))
    ax.plot(X, y_pred, label=f'Degree: {i} MSE: {mse:,} MAE: {mae:,}')
ax.legend()
Datetime Data
           ds       y
0  2019-01-10  3658.0
1  2019-01-11  2952.0
2  2019-01-12  2855.0
3  2019-01-13  3904.0
Numeric Data
   ds       y
0  10  3658.0
1  11  2952.0
2  12  2855.0
3  13  3904.0
Solution
Linear regression works by associating each numerical input value with a calculated coefficient. The values are then multiplied by the coefficients, and the result is the output used for predictions.
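To make the "values multiplied by coefficients" idea concrete, here is a minimal sketch using the four numeric rows from the question. It uses `np.polyfit` as a small stand-in for `LinearRegression` (my choice, only to keep the example short) and checks that a prediction is nothing more than the input times its coefficient, plus an intercept.

```python
import numpy as np

# The four numeric rows shown in the question (day of month, y).
x = np.array([10.0, 11.0, 12.0, 13.0])
y = np.array([3658.0, 2952.0, 2855.0, 3904.0])

# Degree-1 least-squares fit; polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)

# A prediction is the input multiplied by its coefficient, plus the intercept.
manual_pred = slope * x + intercept
polyval_pred = np.polyval([slope, intercept], x)

print(slope, intercept)
print(np.allclose(manual_pred, polyval_pred))
```

The arithmetic only makes sense when `x` holds ordinary numbers, which is why the type of the input column matters.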
But in your case, the predictor is a date, and, as explained above, the regression model needs numbers it can multiply by coefficients, so it doesn't know what to do with a date. As you noticed, you need to convert the dates to numerical data.
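One way to do that conversion, sketched under some assumptions: the array `ds` below stands in for `transformed_data.ds.values` and holds only the four rows shown in the question, and "days since the first date" is just one reasonable numeric encoding, not the only one. The sketch also prints the raw integer behind a `datetime64[ns]` value (nanoseconds since 1970-01-01, roughly 1.5e18), which is a plausible reason the polynomial terms become numerically unusable and the fit looks flat.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Stand-in for transformed_data.ds.values and transformed_data.y (rows from the question).
ds = np.array(
    ["2019-01-10", "2019-01-11", "2019-01-12", "2019-01-13"],
    dtype="datetime64[ns]",
)
y = np.array([3658.0, 2952.0, 2855.0, 3904.0])

# Raw datetime64[ns] values are nanoseconds since 1970-01-01: about 1.5e18 each.
raw_ns = ds.astype("int64")

# Convert to days elapsed since the first date: 0.0, 1.0, 2.0, 3.0.
days = (ds - ds.min()) / np.timedelta64(1, "D")
X = days.reshape(-1, 1)

# Same kind of pipeline as in the question; degree 2 is an arbitrary choice here.
polyreg = make_pipeline(PolynomialFeatures(2), LinearRegression())
polyreg.fit(X, y)
pred = polyreg.predict(X)

print(raw_ns[0])
print(days)
print(pred)
```

Small, evenly spaced values like these keep the higher powers generated by `PolynomialFeatures` in a manageable range. For a pandas column, the same expression should work on the Series directly, for example `(transformed_data.ds - transformed_data.ds.min()) / np.timedelta64(1, "D")`.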
Answered By - Serge de Gosson de Varennes