Linear regression is a statistical method that models the relationship between two variables by fitting a linear equation to observed data: it looks at the data points and fits a trend line through them. One variable is treated as the explanatory (independent) variable, and the other as the dependent variable. For example, a modeler might want to relate the weights of individuals to their heights using a linear regression model. Simply stated, the goal of linear regression is to fit a line to a set of points, and when the target variable we are trying to predict is continuous, we call the learning problem a regression problem.

Suppose we want to model a set of points with a line. To do this we'll use the standard line equation y = mx + b, where m is the line's slope and b is the line's y-intercept. Finding the best line for our data then means finding the best pair of slope (m) and y-intercept (b) values.
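To measure how well a candidate line fits the data, the code below adds up, over all data points (x_i, y_i), the gap between the observed y_i and the line's prediction m·x_i + c (the code uses c for the intercept b). Written out, the quantity it accumulates is:

E(m, c) = \sum_{i=1}^{n} \left( y_i - (m x_i + c) \right)

For example, with m = 0 and c = 0 the prediction is 0 for every point, so the total error over the four sample points used below is 30 + 50 + 70 + 80 = 230, which matches the first value printed in the output. Note that this is a signed sum of residuals, so over- and under-predictions can cancel each other; a least-squares fit would square each residual instead, but the code below follows the simpler signed version.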
# Importing required libraries
import matplotlib.pyplot as plt
import numpy as np

# Creating data points
data = [[20, 30], [40, 50], [60, 70], [90, 80]]
data_x = [row[0] for row in data]
data_y = [row[1] for row in data]

# Fix the intercept (c) at 0; the slope (m) is swept over a range of values below
c = 0
errors = []

# For each candidate slope, draw the straight line y = m*x + c (c plays the role
# of the intercept b) and accumulate the total signed error against the data points
for m in np.arange(0, 2, 0.5):
    x = list(range(100))
    y = [m * xi + c for xi in x]
    error = 0
    for i in range(len(data)):
        error += data_y[i] - (m * data_x[i] + c)
    print(f'total error = {error}')
    errors.append(error)
    plt.plot(data_x, data_y, 'r*')
    plt.plot(x, y)
    print(f'm = {m}')
plt.show()
print(errors)
Output for the first candidate slope:
total error = 230.0
m = 0.0
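Once the sweep above has produced a total error for each candidate slope, a natural next step is to pick the slope whose error is closest to zero. Here is a minimal sketch of that selection, assuming data_x, data_y, errors, and c from the code above are still in scope; the slopes list simply re-creates the same candidates as the loop, and np.polyfit is included only as an optional least-squares cross-check rather than as part of the original sweep.

# Re-create the candidate slopes used in the sweep above
slopes = np.arange(0, 2, 0.5)

# Pick the slope whose total signed error is closest to zero
best_index = int(np.argmin(np.abs(errors)))
best_m = slopes[best_index]
print(f'best slope from the sweep: m = {best_m}, total error = {errors[best_index]}')

# Optional cross-check: a degree-1 least-squares fit gives slope and intercept directly
m_ls, b_ls = np.polyfit(data_x, data_y, 1)
print(f'least-squares fit: m = {m_ls:.3f}, b = {b_ls:.3f}')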