Issue
I'm trying to make a 3D graph to plot the fitted regression surface. I've seen the following examples.
Plot linear model in 3d with Matplotlib
Combining scatter plot with surface plot
Best fit surfaces for 3 dimensional data
However, the first one is very outdated and no longer working, and the second one is related but I'm having some troubles to generate the values for Z
.
All the examples I can found are either outdated or low-level simulated data examples. There might be more issues than Z
.
Please take a look at the following code.
import numpy as np
import seaborn as sns
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
df = sns.load_dataset('mpg')
df.dropna(inplace=True)
model = smf.ols(formula='mpg ~ horsepower + acceleration', data=df)
results = model.fit()
x, y = model.exog_names[1:]
x_range = np.arange(df[x].min(), df[x].max())
y_range = np.arange(df[y].min(), df[y].max())
X, Y = np.meshgrid(x_range, y_range)
# Z = results.fittedvalues.values.reshape()
fig = plt.figure(figsize=plt.figaspect(1)*3)
ax = plt.axes(projection='3d')
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha = 0.2)
Update:
I changed Z
to the following is right
Z = results.params[0] + X*results.params[1] + Y*results.params[2]
and append
ax.scatter(df[x], df[y], df[model.endog_names], s=50)
ax.view_init(20, 120)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
I got the following plot, but I'm not sure if it's right.
If possible, I'd also like to add projections for the plotted surface.
Solution
In the answer you linked the critical step is the application of the model to the entire meshgrid via supplying the 'exogenous' data. In this case you can do that easily by creating a new dataframe containing the unraveled meshgrid and passing it as exog
to statsmodels.regression.linear_model.OLS.predict
. A demonstration of this using your example:
import numpy as np
import seaborn as sns
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
df = sns.load_dataset('mpg')
df.dropna(inplace=True)
model = smf.ols(formula='mpg ~ horsepower + acceleration', data=df)
results = model.fit()
x, y = model.exog_names[1:]
x_range = np.arange(df[x].min(), df[x].max())
y_range = np.arange(df[y].min(), df[y].max())
X, Y = np.meshgrid(x_range, y_range)
exog = pd.DataFrame({x: X.ravel(), y: Y.ravel()})
Z = results.predict(exog = exog).values.reshape(X.shape)
fig = plt.figure(figsize=plt.figaspect(1)*2)
ax = plt.axes(projection='3d')
ax.scatter(df[x].values, df[y].values, results.fittedvalues.values,
marker='.', label="Fits")
cond = df[model.endog_names].values > results.fittedvalues.values
ax.scatter(df[x][cond].values, df[y][cond].values, df[model.endog_names]
[cond].values, label="Raw")
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha = 0.4)
ax.scatter(df[x][cond == False].values, df[y][cond == False].values,
df[model.endog_names][cond == False].values)
ax.legend()
plt.show()
Which will give you
I have included the individual fit points in the scatter plot in addition to the datapoints to indicate this approach correctly generates the corresponding surface. I also filtered the data into two groups: those which should be plotted in front of the surface and those which should be plotted behind the surface. This is to accommodate for matplotlib's layering of artists in 3D rendering. The viewing geometry has been changed from default in an attempt to maximize the clarity of the 3D properties.
Edit
Adding a projection of the regression surface onto one of the axes planes is fairly trivial - you simply plot the data with one dimension set to the axis limit, i.e.
ax.plot_surface(X, Y, np.full_like(X, ax.get_zlim()[0]), alpha = 0.2)
Which then gives you
Answered By - William Miller
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.