Linear regression Part 1

We will discuss linear regression. For simplicity, we will divide the class into three parts. In the first part, we will discuss the cost function; in the second part, we will discuss the gradient descent algorithm; and in the third part, we will code a linear regression problem in TensorFlow.

Linear regression is one of the simplest machine learning (ML) models. Here, we will try to understand and implement our first machine learning algorithm in TensorFlow (TF). Two major components of most ML algorithms are (a) the cost function and (b) the optimization method. Therefore, before diving into the implementation of the linear regression model in TensorFlow, let us first understand the cost function.

Cost Function: Let us consider an example list of object sizes and prices. Here, X is the object size and Y is the corresponding price.

In [1]:
X = [1, 2, 3, 4, 5]

In [2]:
Y = [10, 40, 50, 78, 83]


The data follow a roughly linear trend: the price goes up with the size of the object. However, the points are scattered rather than lying exactly on a line. We want to build a model (a linear fit) that closely predicts the price of an object for any given size.

Let us take H = aX + b as our hypothesis for the prediction. Here a and b are the parameters that determine the prediction accuracy. We will first try to understand how these two parameters affect the prediction model.
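To make the role of the two parameters concrete, here is a minimal sketch that evaluates the hypothesis numerically for a few parameter pairs. The helper name `predict` is our own, introduced only for illustration; a acts as the slope of the line and b as its intercept (the predicted value at X = 0).

```python
import numpy as np

def predict(a, b, x):
    """Evaluate the linear hypothesis H = a*x + b (hypothetical helper name)."""
    return a * np.asarray(x, dtype=float) + b

sizes = [1, 2, 3, 4, 5]  # same object sizes as X above

# Changing a tilts the line; changing b shifts it up or down.
for a, b in [(0, 1.5), (10, 0), (10, 20)]:
    print(f"a={a}, b={b}: H={predict(a, b, sizes)}")
```

With a = 0 the prediction is the constant b for every size, which is why such a model cannot follow the rising prices.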

1st case: Let us consider the value of a = 0 and b = 1.5

In [3]:
import numpy as np
from IPython.display import display, Math, Latex

In [4]:
import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerLine2D

In [5]:
a = 0
b = 1.5
X = np.array(X)

In [6]:
H = a*X + b # The hypothesis for the prediction model

In [7]:
line1, = plt.plot(X,Y, "ro", markersize=5, label='Original Data')
line2, = plt.plot(X,H, marker ='o', label='Model')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=4)})
plt.show()


2nd case: Let us consider the value of a = 10 and b = 0

In [8]:
a = 10
b = 0
X = np.array(X)
H = a*X + b

In [9]:
line1, = plt.plot(X,Y, "ro", markersize=5, label='Original Data')
line2, = plt.plot(X,H, label='Model')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=4)})
plt.show()


3rd case: Let us consider the value of a = 10 and b = 20

In [10]:
a = 10
b = 20
X = np.array(X)
H = a*X + b

In [11]:
line1, = plt.plot(X,Y, "ro", markersize=5, label='Original Data')
line2, = plt.plot(X,H, label='Model')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=4)})
plt.show()


As we can see, our linear hypothesis H = aX + b depends strongly on the parameters a and b. However, in these three cases a and b were chosen by hand (this was just for demonstration), so the model is not connected to the original data. That means the model is independent of the price values.

Here we want to make a model that will be optimized for the original data, i.e. the values of a and b will be adjusted based on the original data (values of X and Y).

For this purpose, we have to use an error function or cost function.

In this case, we will use a simple cost function called the squared error function. This function is very popular for solving regression problems.

Let us denote this cost function by C, where …

In [12]:
display(Math(r'C(a, b) =\frac 1{2m} \sum_{i=1}^m (H_i - Y_i)^2'))

$$C(a, b) = \frac{1}{2m} \sum_{i=1}^m (H_i - Y_i)^2$$

Here H_i = aX_i + b is the prediction for the i-th data point, and m is the number of data points.
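The formula translates almost directly into NumPy. The following is a minimal sketch rather than a definitive implementation; the function name `squared_error_cost` is our own and does not come from any library.

```python
import numpy as np

def squared_error_cost(a, b, x, y):
    """Return C(a, b) = 1/(2m) * sum((a*x_i + b - y_i)^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = len(x)                      # number of data points
    residuals = (a * x + b) - y     # H_i - Y_i for every point
    # Parentheses matter: s / 2 * m would compute (s / 2) * m in Python.
    return np.sum(residuals ** 2) / (2 * m)

X = [1, 2, 3, 4, 5]
Y = [10, 40, 50, 78, 83]
print(squared_error_cost(10, 20, X, Y))
```

A cost of zero would mean the line passes through every data point; the larger the cost, the worse the fit.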

For simplicity, we will first evaluate the cost function for the linear model H = aX + 0, i.e. with b fixed at 0 throughout. Also, we will evaluate the cost function manually, one value of a at a time.

Case 1: For b = 0 and a = 6

In [13]:
a = 6
b = 0
X = np.array(X)
Y = np.array(Y)
H6 = a*X + b

In [14]:
C6 = np.sum(np.square(H6 - Y)) / (2 * len(X))  # 1/(2m) * sum of squared errors
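
As a check on the formula, the arithmetic for a = 6 and b = 0 can be written out by hand. With m = 5, the predictions are H = [6, 12, 18, 24, 30] and the residuals H_i - Y_i are [-4, -28, -32, -54, -53], so:

$$C(6, 0) = \frac{1}{2 \cdot 5}\left[(-4)^2 + (-28)^2 + (-32)^2 + (-54)^2 + (-53)^2\right] = \frac{16 + 784 + 1024 + 2916 + 2809}{10} = \frac{7549}{10} = 754.9$$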

In [15]:
#plt.subplot(211)
line1, = plt.plot(X,Y, "ro", markersize=5, label='Original Data')
line2, = plt.plot(X,H6, label='Model with a = 6')
plt.title('X vs Y')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=4)})
plt.show()


Case 2: For b = 0 and a = 12

In [16]:
a = 12
b = 0
H12 = a*X + b

In [17]:
C12 = np.sum(np.square(H12 - Y)) / (2 * len(X))  # 1/(2m) * sum of squared errors

In [18]:
line1, = plt.plot(X,Y, "ro", markersize=5, label='Original Data')
line2, = plt.plot(X,H12, label='Model with a = 12')
plt.title('X vs Y')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=4)})
plt.show()


Case 3: For b = 0 and a = 18

In [19]:
a = 18
b = 0
H18 = a*X + b

In [20]:
C18 = np.sum(np.square(H18 - Y)) / (2 * len(X))  # 1/(2m) * sum of squared errors

In [21]:
line1, = plt.plot(X,Y, "ro", markersize=5, label='Original Data')
line2, = plt.plot(X,H18, label='Model with a = 18')
plt.title('X vs Y')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=4)})
plt.show()


Case 4: For b = 0 and a = 24

In [22]:
a = 24
b = 0
H24 = a*X + b

In [23]:
C24 = np.sum(np.square(H24 - Y)) / (2 * len(X))  # 1/(2m) * sum of squared errors

In [24]:
line1, = plt.plot(X,Y, "ro", markersize=5, label='Original Data')
line2, = plt.plot(X,H24, label='Model with a = 24')
plt.title('X vs Y')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=4)})
plt.show()


Case 5: For b = 0 and a = 30

In [25]:
a = 30
b = 0
H30 = a*X + b

In [26]:
C30 = np.sum(np.square(H30 - Y)) / (2 * len(X))  # 1/(2m) * sum of squared errors

In [27]:
line1, = plt.plot(X,Y, "ro", markersize=5, label='Original Data')
line2, = plt.plot(X,H30, label='Model with a = 30')
plt.title('X vs Y')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(handler_map={line1: HandlerLine2D(numpoints=4)})
plt.show()

In [28]:
A = [6, 12, 18, 24, 30]
C = [C6, C12, C18, C24, C30]

In [29]:
line1, = plt.plot(A,C, label="Cost Function", linestyle='--')
plt.legend(handles=[line1], loc=1)
plt.title('a vs C')
plt.xlabel('a')
plt.ylabel('C')
plt.show()
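
The five case-by-case evaluations can also be done in one step. The sketch below uses NumPy broadcasting to compute the cost for every tested value of a at once, following the 1/(2m) definition of C with b = 0; the variable names `slopes` and `costs` are our own.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([10, 40, 50, 78, 83], dtype=float)
slopes = np.array([6, 12, 18, 24, 30], dtype=float)  # the values of a tried above

# Broadcasting: row k of H holds the predictions slopes[k] * X (b = 0).
H = slopes[:, None] * X[None, :]
costs = np.sum((H - Y) ** 2, axis=1) / (2 * len(X))

for a, c in zip(slopes, costs):
    print(f"a = {a:4.0f}  ->  C = {c:.1f}")
```

Under this definition the five costs are 754.9, 188.5, 18.1, 243.7 and 865.3, so the cost falls steeply up to a = 18 and rises again afterwards.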


Our objective is to find the minimum value of the cost function. From the above graphs, it is clear that, among the five values we tried, the cost is lowest at a = 18.
However, we evaluated this manually, and only for a handful of values. We need an automated method to find the minimum of a cost function. That is where the optimization algorithm comes in. Next, we are going to discuss the optimization algorithm.
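
Before turning to a proper optimization algorithm, the manual search can be refined by brute force. The sketch below simply evaluates the cost (with b = 0) on a much finer grid of a values and picks the smallest one. This is not the optimization method discussed in the next part; the helper name `cost_b0`, the search range and the step size are assumptions chosen only for illustration.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([10, 40, 50, 78, 83], dtype=float)

def cost_b0(a):
    """Squared-error cost for the model H = a*X with b fixed at 0."""
    return np.sum((a * X - Y) ** 2) / (2 * len(X))

# Hypothetical search range and resolution, chosen only for illustration.
candidates = np.linspace(0.0, 30.0, 3001)          # step of 0.01
costs = np.array([cost_b0(a) for a in candidates])

best = np.argmin(costs)
print(f"best a on the grid: {candidates[best]:.2f}, cost: {costs[best]:.2f}")
```

On this grid the lowest cost lies slightly below a = 18, which shows that a coarse manual search only locates the minimum approximately. A grid search like this also scales poorly once b is allowed to vary as well, which is one motivation for a dedicated optimization algorithm.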
