Issue
I am performing multiple polynomial regression using sklearn. What I cannot understand is how to get the full polynomial formula. Is the order of the values printed by coef_ correct? I am trying to put together the correct regression equation, but nothing works.
Here is my code, which gives me the predicted values, coefficients and intercept.
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# df is assumed to be an existing pandas DataFrame with columns 'Y', 'X1' and 'X2'
Y = df['Y']
X = df[['X1', 'X2']]

# Expand X1 and X2 into degree-2 polynomial features
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Fit an ordinary linear regression on the expanded features
lin2 = LinearRegression()
model = lin2.fit(X_poly, Y)
y_pred = model.predict(X_poly)
print(y_pred)
print('Regression coefficients: ', model.coef_)
print('Intercept: ', model.intercept_)
Regression coefficients: [0.0 -3.9407245056806457 63.36152983871869 -0.0073134316780316105 0.28728821270355437 -1.8955885488237727 -317.773937549386]
Intercept: 40.587981548779965
Let's say that X1 = 167.8 and X2 = 22.348595. After the regression, the predicted value is 361.67, but none of the versions of the equation I have tried gives 361.67.
I found that coef_ prints [1, a, b, c, a^2, b^2, c^2, ab, bc, ca], so in this case [1, a, b, a^2, b^2, ab], but I am not sure that this sequence is correct. I am not getting 361.67, but 370.56, with this:
y = 0.0 + -3.94 * X1 + 63.36 * X2 + -0.007 * X1^2 + 0.2872 * X1 * X2 + -1.895 * X2^2 + -317.77
Solution
I do not believe there is anything wrong with the formula or the order of the terms. Rounding the coefficients simply changes the result by more than you might expect: X1^2 is about 28157, so rounding its coefficient from -0.0073134 to -0.007 alone shifts the result by roughly 8.8.
If you use every decimal place of the regression coefficients, in the order you already have them and with -317.773937549386 as the constant term, you get the predicted value of 361.67.
Please let me know if there is anything wrong or if I misinterpreted the issue.
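If you want to double-check the term order yourself, here is a minimal pure-Python sketch. It is not scikit-learn's own code; the helper name polynomial_term_names is made up for illustration, and it only mirrors the ordering that the PolynomialFeatures documentation describes (constant term first, then each degree in turn).

```python
from itertools import combinations_with_replacement


def polynomial_term_names(feature_names, degree):
    """Hypothetical helper: list polynomial terms in the order documented
    for PolynomialFeatures (constant first, then degree 1, degree 2, ...)."""
    names = []
    for d in range(degree + 1):
        # Each combination is a tuple of feature indices, e.g. (0, 1) -> X1 * X2
        for combo in combinations_with_replacement(range(len(feature_names)), d):
            if not combo:
                names.append("1")  # the constant (bias) column
                continue
            parts = []
            for idx in sorted(set(combo)):
                power = combo.count(idx)
                name = feature_names[idx]
                parts.append(name if power == 1 else f"{name}^{power}")
            names.append(" ".join(parts))
    return names


print(polynomial_term_names(["X1", "X2"], 2))
# ['1', 'X1', 'X2', 'X1^2', 'X1 X2', 'X2^2']
print(polynomial_term_names(["a", "b", "c"], 2))
# ['1', 'a', 'b', 'c', 'a^2', 'a b', 'a c', 'b^2', 'b c', 'c^2']
```

So for two inputs the cross term X1 X2 comes before X2^2, which is the order your equation already uses. With scikit-learn 1.0 or later, poly.get_feature_names_out(['X1', 'X2']) on a PolynomialFeatures object fitted to the original two columns should report the same labels. The leading 0.0 in coef_ belongs to the constant column "1"; the actual constant term is carried by intercept_.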
For example, using the full-precision coefficients:
X1 = 167.8
X2 = 22.348595
y = 0.0 + -3.9407245056806457 * X1 + 63.36152983871869 * X2 + -0.0073134316780316105 * X1**2 + 0.28728821270355437 * X1 * X2 + -1.8955885488237727 * X2**2 -317.773937549386
print(y)
Output:
361.67832067451957
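To see where the gap between 370.56 and 361.67 comes from, the sketch below evaluates the same equation twice, once with the full coefficients and once with the rounded ones from the question, and prints how much each rounded term contributes to the difference. The dictionary names (terms, full, rounded, errors) are just illustrative.

```python
X1 = 167.8
X2 = 22.348595

# Value of each polynomial term for this input
terms = {
    "X1": X1,
    "X2": X2,
    "X1^2": X1 ** 2,
    "X1*X2": X1 * X2,
    "X2^2": X2 ** 2,
}

# Coefficients exactly as printed in the question
full = {
    "X1": -3.9407245056806457,
    "X2": 63.36152983871869,
    "X1^2": -0.0073134316780316105,
    "X1*X2": 0.28728821270355437,
    "X2^2": -1.8955885488237727,
}
constant_full = -317.773937549386

# Rounded coefficients used in the hand-written equation
rounded = {
    "X1": -3.94,
    "X2": 63.36,
    "X1^2": -0.007,
    "X1*X2": 0.2872,
    "X2^2": -1.895,
}
constant_rounded = -317.77

y_full = constant_full + sum(full[k] * terms[k] for k in terms)
y_rounded = constant_rounded + sum(rounded[k] * terms[k] for k in terms)
print("full precision:", y_full)
print("rounded:       ", y_rounded)

# Error introduced by rounding each coefficient, scaled by its term value
errors = {k: (rounded[k] - full[k]) * terms[k] for k in terms}
for name, err in errors.items():
    print(f"{name:6s} rounding error: {err:+.4f}")
```

The X1^2 term accounts for about 8.8 of the roughly 8.9 difference, because a change of only 0.0003 in its coefficient is multiplied by X1^2, which is about 28157. Keeping more digits on the coefficients of the large terms is what matters most.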
Answered By - Richard K Yu