Issue
I'm trying to write a function that shifts a regression trendline vertically so that it goes through the lowest datapoint. This seemed simple at first, until I realised that the slope of the line dictates which point ends up being furthest below it.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import linregress
x = np.arange(0,10,2)
y = np.random.random(5)
res = linregress(x,y)
plt.figure(figsize=(9,4),dpi=450)
plt.plot(x, y, 'o', label='original data')
plt.plot(x, res.intercept + res.slope*x, 'r', label='fitted line')
plt.legend()
plt.show()
[Image: the regression line that this code generated]
[Image: the line after I manually shifted its intercept value]
Any help with this would be appreciated.
Solution
You already know how to get the y-value of any point on the line from its x-value: apply the line equation (y = slope * x + intercept), as you did when plotting the line. You can therefore calculate the residuals, which are the differences between each point's actual y-value and the y-value the regression line estimates for it. To shift the line so that it goes through the lowest point, as measured by vertical distance from the line, add the smallest (i.e. most negative) residual to the line's y-values, which is the same as adding it to the intercept:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import linregress
np.random.seed(42)
x = np.arange(0, 10, 2)
y = np.random.random(5)
res = linregress(x, y)
plt.figure(figsize=(9,4),dpi=450)
plt.plot(x, y, 'o', label='original data')
plt.plot(x, res.slope * x + res.intercept, 'r', label='fitted line')
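# residuals: actual y minus the y-value predicted by the fitted line
residuals = y - (res.slope * x + res.intercept)
# the most negative residual belongs to the point furthest below the line
shift = np.min(residuals)
plt.plot(x, res.slope * x + res.intercept + shift,
         label='shifted line', color='blue')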
plt.legend()
plt.show()
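Since the question asks for a function, the same steps could be wrapped up along these lines. This is only a sketch: the name shift_line_to_lowest_point is made up for illustration, and it uses np.polyfit instead of linregress so that it depends on NumPy alone (both fit the same least-squares line). The sample y-values are arbitrary.

```python
import numpy as np

def shift_line_to_lowest_point(x, y):
    """Hypothetical helper: fit a least-squares line, then shift it
    vertically so it passes through the point furthest below it.

    Returns (slope, shifted_intercept).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.polyfit with degree 1 returns [slope, intercept]
    slope, intercept = np.polyfit(x, y, 1)
    # residuals: actual y minus the y-value predicted by the fitted line
    residuals = y - (slope * x + intercept)
    # adding the most negative residual moves the line down onto that point
    return slope, intercept + residuals.min()

# arbitrary example data, chosen only for illustration
x = np.arange(0, 10, 2)
y = np.array([0.5, 0.1, 0.9, 0.3, 0.7])
slope, shifted_intercept = shift_line_to_lowest_point(x, y)
# vertical gaps between the points and the shifted line
gaps = y - (slope * x + shifted_intercept)
```

After the shift, every entry of gaps should be zero or positive, with exactly the lowest point sitting on the line. Using residuals.max() instead would move the line up to the highest point.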
Answered By - Arne