What Is a Perceptron?
A perceptron can be represented as a function that takes some input and produces some output. The function a perceptron performs is the same in every scenario, no matter what that perceptron is intended to do.
A perceptron takes an input, which is an (n+1)-dimensional vector containing n input parameters and 1 bias variable, multiplies it with the weight vector (an array of size n+1 whose first element is 1), and passes the result through an activation function. That becomes the output of the perceptron.
If that went straight over your head, let me explain.
No matter what the learning problem is, a supervised machine learning algorithm can be pictured as plotting the inputs on a graph and trying to find the optimum line (or curve) that passes through, or separates, the plotted points.
A perceptron matches this behavior by multiplying inputs and weights and activating the product. For the bias to be added as part of that multiplication, the first element of the weight vector must be 1 and the first element of the input vector must be the bias parameter. The following code demonstrates what you just read in this paragraph.
import numpy as np

# W is the weights vector.
# X is the input vector.
X = [2, 3, 1, 3, 5, 7, 8, 5]  # Just some input vector.
# The first element, i.e. X[0], is the bias parameter.
W = np.random.rand(len(X))
W[0] = 1
# This initializes the W vector with random numbers
# and sets the first element, i.e. W[0], to 1.
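To see why fixing the first weight at 1 does the job, here is a small sketch. The values of `bias`, `features` and `weights` are made up purely for illustration; the point is only that the dot product of X and W comes out equal to the weighted sum of the inputs plus the bias.

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
bias = 2.0
features = np.array([3.0, 1.0, 4.0])
weights = np.array([0.5, -1.0, 0.25])

# Convention used in this post: X[0] holds the bias and W[0] is fixed at 1.
X = np.concatenate(([bias], features))
W = np.concatenate(([1.0], weights))

with_trick = np.dot(X, W)                      # bias folded into the product
explicit = np.dot(features, weights) + bias    # bias added by hand

print(with_trick, explicit)
```

Both numbers match, so "multiply by weights, then add the bias" collapses into a single multiplication.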
Now before we move further, let us talk about some machine learning jargon. We’ll start with the hypothesis function.
A “Hypothesis Function” is nothing but the mathematical function of the line or curve that fits the given data-set. A function that “fits” the data-set is one that passes as close as possible to the points in the data-set, that is, with minimum error. The mathematical function can be demonstrated using the following code.
# np is the NumPy module, imported with: import numpy as np
def hypothesis(X, W):
    return np.matmul(X, np.transpose(W))
The above hypothesis function can return any real number, which is great when you are doing a regression task. When you are performing a classification task, however, you have to wrap the result of X*W in a function whose range is 0 to 1. One function that is commonly used for this is the “Sigmoid function”, sigmoid(z) = 1 / (1 + e^(-z)).
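A minimal sketch of the sigmoid function in Python, assuming NumPy is available. The later listings call a function named `sigmoid`, so the same name is used here; the sample inputs are arbitrary.

```python
import numpy as np

def sigmoid(z):
    # Squash any real number (or array of numbers) into the open interval (0, 1).
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0))            # the midpoint of the curve
print(sigmoid([-5, 0, 5]))   # large negatives approach 0, large positives approach 1
```

Anything below 0.5 can then be read as class 0 and anything at or above 0.5 as class 1.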
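The next piece of jargon is the “Cost Function”, which measures how far the output of the hypothesis is from the actual values in the data-set. For a regression task it looks like this:
# Y is the actual values
# corresponding to X in the data-set
def cost(X, Y, W):
    total = 0
    # hypothesis needs the weights too, so W is passed in
    for i, x in enumerate(hypothesis(X, W)):
        total += (x - Y[i]) ** 2
    return total / (2 * len(X))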
The above code demonstrates the cost function for a regression task. To find the error in a classification task, use the following code instead.
def cost(X, Y, W):
    total = 0
    for i, x in enumerate(hypothesis(X, W)):
        # Penalize confident wrong answers heavily (cross-entropy)
        total += -(Y[i] * np.log(sigmoid(x)) + (1 - Y[i]) * np.log(1 - sigmoid(x)))
    return total / len(X)
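To get a feel for how the two costs behave, the sketch below computes both in vectorized form on made-up numbers. The helper names `squared_error_cost` and `cross_entropy_cost` are my own for this example, and the second one assumes its inputs have already been passed through the sigmoid, so they lie strictly between 0 and 1.

```python
import numpy as np

def squared_error_cost(predictions, targets):
    # Regression cost: half the mean of the squared differences.
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.sum((predictions - targets) ** 2) / (2 * len(targets))

def cross_entropy_cost(probabilities, targets):
    # Classification cost: expects probabilities strictly between 0 and 1.
    probabilities = np.asarray(probabilities, dtype=float)
    targets = np.asarray(targets, dtype=float)
    losses = -(targets * np.log(probabilities)
               + (1 - targets) * np.log(1 - probabilities))
    return np.sum(losses) / len(targets)

targets = [0, 0, 0, 1]
good_guess = [0.1, 0.2, 0.1, 0.9]   # mostly agrees with the targets
bad_guess = [0.9, 0.8, 0.9, 0.1]    # mostly disagrees with the targets

print(cross_entropy_cost(good_guess, targets))
print(cross_entropy_cost(bad_guess, targets))
print(squared_error_cost([1.0, 2.0], [1.0, 4.0]))
```

The guess that agrees with the targets gets the lower cost, which is exactly the signal the training step needs.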
If you want to learn about cost functions in depth, you can refer to this video by Andrew Ng.
The code below demonstrates an algorithm that finds the optimum weights for our data set, that is, the weights of the line that best divides or fits the training data set, X.
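def getOptimumWeights(X, Y, learn_rate, epoch):
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # One random weight per element of an input vector; W[0] stays fixed at 1
    W = np.random.rand(X.shape[1])
    W[0] = 1
    i = 0
    while i < epoch:
        i += 1
        # Prediction error, and the gradient of the classification cost w.r.t. W
        error = sigmoid(hypothesis(X, W)) - Y
        gradient = np.matmul(np.transpose(X), error) / len(X)
        # Step every weight except W[0] against the gradient
        W[1:] = W[1:] - learn_rate * gradient[1:]
    return W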
We first create a variable W and assign it a list of random numbers, one per element of an input vector. Then we set the first element of W to 1. This is because W[0] is the weight paired with the bias factor, and keeping it at 1 lets the bias pass through unchanged. If you plot the hypothesis function as a graph, the bias factor decides the y value at x = 0. Then we iteratively update the list W according to the learning rate and the error.
Learning Rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function. Take a look at this video.
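Here is a toy illustration of what the step size does. It is only a sketch: the loss f(w) = w**2, the function name `minimize`, and the three rates tried are all assumptions picked for this example and have nothing to do with the perceptron itself.

```python
def minimize(learn_rate, steps=50, start=5.0):
    # Gradient descent on the toy loss f(w) = w**2, whose gradient is 2*w.
    w = start
    for _ in range(steps):
        w = w - learn_rate * 2 * w
    return w

# The minimum of the toy loss is at w = 0.
for rate in (0.005, 0.1, 1.1):
    print(rate, minimize(rate))
```

With the same number of steps, the tiny rate has barely moved toward 0, the moderate rate lands very close to it, and the oversized rate overshoots further on every step and blows up.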
Now that is done, let’s concentrate on the actual perceptron algorithm. The simple rule of a perceptron, or of any neural network, is: take the input, multiply by weights, add the bias and activate. As stated above, we can simulate the add-the-bias step by fixing the first element of W as 1 and using the bias as the first element of each training input in X. The activation step is needed to add non-linearity to the line that we want to fit. This improves the accuracy of our model to a great extent and lets our perceptron or neural network learn a wide variety of things. Consider the following code.
# hypothesis(X, W) already equals X*W + b, because W[0] is 1 and X[0] is the bias.
# Remember: take the input, multiply by weights, add the bias and activate
def perceptron(X, W):
    # We are using the sigmoid function to activate our hypothesis
    output = sigmoid(hypothesis(X, W))
    return output
Before using this function we need to do some pre-processing, namely training the weights vector W. Remember the function we made called getOptimumWeights? That function does this task. Theoretically, you can use any positive real number as the learning rate, but in practice you should pick something that is neither too high nor too low. I always use 0.01 or 0.005 as the learning rate.
Using a Perceptron as a Logic Gate
The best thing about neural networks or perceptrons is that you can teach a computer to do a task just by giving it some inputs and the corresponding outputs.
First, let’s teach this perceptron to learn and tell the outputs of an AND gate.
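# Each row is [bias, input 1, input 2]; the bias value -1 is an assumed choice
X = [[-1, 0, 0], [-1, 0, 1], [-1, 1, 0], [-1, 1, 1]]
Y = [0, 0, 0, 1]
# 10000 epochs is an assumed value; a small learning rate needs many passes
W = getOptimumWeights(X, Y, 0.005, 10000)
output = perceptron(X, W)
for o in output:
    if o < 0.5:
        print(0)
    else:
        print(1)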
We created the X and Y variables from the truth table of the AND gate and passed them through the optimisation function we created earlier. This gave us the weights the perceptron uses to predict the outputs.
The output of the above pipeline will be as follows.
0
0
0
1
This way, you can build other logic gates too. A single perceptron handles the linearly separable ones, such as OR and NAND, while a gate like XOR needs several perceptrons combined. When you combine multiple perceptrons, you end up with a full-fledged neural network that can learn from large data sets such as stock market data, or the emotion in text if you want to make a chat bot that chats like a human. Neural networks are also what you would use to build something like Siri or Cortana. Once you start using NNs or machine learning, the possibilities are endless.
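As a small taste of combining perceptrons, the sketch below wires three of them together to compute XOR, which a single perceptron cannot represent. Nothing here is trained: the weights and biases are hand-picked for illustration, and the helper names `gate` and `xor` are my own. For readability the bias is passed as a separate argument instead of being folded into the input vector.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gate(inputs, weights, bias):
    # One perceptron: weighted sum plus bias, sigmoid, then a 0.5 threshold.
    z = np.dot(inputs, weights) + bias
    return 1 if sigmoid(z) >= 0.5 else 0

def xor(a, b):
    # Hand-picked (not learned) weights for each of the three perceptrons.
    either = gate([a, b], [20, 20], -10)             # behaves like OR
    not_both = gate([a, b], [-20, -20], 30)          # behaves like NAND
    return gate([either, not_both], [20, 20], -30)   # behaves like AND

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor(a, b))
```

The first two perceptrons form a hidden layer and the third combines their answers, which is the basic shape of a neural network.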
That is all for this post. If you like my content, make sure to subscribe to this blog and share this post with your tech community.