Artificial Neural Networks

An artificial neural network (ANN) is a biologically inspired network of artificial neurons designed to perform specific tasks [1]. ANNs are simulations run on computers to carry out tasks such as classification, pattern recognition, and clustering, among others [1].

ANNs are composed of nodes that accept input data and perform simple operations on it. The nodes imitate biological neurons in the human brain. The result of the operation performed by each node is passed on to other neurons. The output at each node is referred to as its activation or node value [2]. Each link is associated with a weight. An ANN learns by varying the values of these weights [2].
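As a rough illustration of the node computation described above, the sketch below computes the activation of a single node as the sigmoid of a weighted sum of its inputs. The input values, the weights and the function name node_activation are made up for this example. The sigmoid is assumed as the node's operation because it is the transfer function used later in this document.

```python
import numpy as np

def node_activation(inputs, weights):
    """Return the activation of one node: the sigmoid of its weighted input sum."""
    weighted_sum = np.dot(inputs, weights)
    return 1.0 / (1.0 + np.exp(-weighted_sum))

# Hypothetical values, chosen only for illustration.
inputs = np.array([1.0, 0.0, 1.0])
weights = np.array([0.5, -0.3, 0.2])

activation = node_activation(inputs, weights)
print(activation)
```

Changing any entry of weights changes the activation, which is the sense in which a network learns by adjusting its weights.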

The following diagram shows a simple ANN:

The input layer contains the neurons that accept the input from which the network will learn. The output layer is the last layer; it consists of the units that respond to the information about how the network has learned the task. The middle layer, known as the hidden layer, converts the input into something that the output units can use [1].

A popular learning algorithm used in neural networks is backpropagation. Backpropagation is a training algorithm [2] and an extension of the gradient-based delta learning rule [1]. If an error is found, it is propagated backward from the output layer to the input layer via the hidden layer [1]. The algorithm modifies the weights of the network so that, once learning has concluded, the network delivers the desired output for a given input [2].
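One common way to write this procedure for a network with a single hidden layer and sigmoid units is sketched below. The notation is an assumption made for this illustration and does not come from the cited sources: $a_0$, $a_1$ and $a_2$ are the activations of the input, hidden and output layers (one training example per row), $W_0$ and $W_1$ are the two weight matrices, $T$ is the desired output, $\odot$ is element-wise multiplication and $\eta$ is a learning rate.

```latex
\begin{aligned}
\delta_2 &= (T - a_2) \odot a_2 \odot (1 - a_2) \\
\delta_1 &= (\delta_2 W_1^{\top}) \odot a_1 \odot (1 - a_1) \\
W_1 &\leftarrow W_1 + \eta \, a_1^{\top} \delta_2 \\
W_0 &\leftarrow W_0 + \eta \, a_0^{\top} \delta_1
\end{aligned}
```

The output error is first scaled by the slope of the sigmoid, then passed back through $W_1$ to give the hidden-layer error, which is the backward propagation described above.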

The objective of this assignment is to learn how ANNs work and how to implement them using a high-level language. An ANN that learns a Boolean function is implemented. The neural network has five input neurons, four hidden neurons and three output neurons. The sigmoid function is used as the transfer function. An error backpropagation algorithm is implemented to train the weights. Lastly, a graph is plotted using the library 'matplotlib'. The assignment is implemented in the high-level language Python.
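To make the five-four-three architecture concrete, the sketch below pushes a small batch through a network of that shape and prints the shape of each layer. The batch size, the random inputs and the names w_hidden and w_output are hypothetical; only the layer sizes and the sigmoid transfer function are taken from the description above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_inputs, n_hidden, n_outputs = 5, 4, 3
n_examples = 2  # hypothetical batch size

np.random.seed(0)
inputs = np.random.randint(0, 2, size=(n_examples, n_inputs))  # Boolean inputs as 0/1
w_hidden = np.random.uniform(-1, 1, size=(n_inputs, n_hidden))
w_output = np.random.uniform(-1, 1, size=(n_hidden, n_outputs))

hidden = sigmoid(inputs.dot(w_hidden))    # one row of 4 hidden activations per example
outputs = sigmoid(hidden.dot(w_output))   # one row of 3 output activations per example

print(inputs.shape, hidden.shape, outputs.shape)  # (2, 5) (2, 4) (2, 3)
```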

Variable    Description

I           The input dataset matrix, where each row is a training example.

O           The output dataset matrix, where each row is a training example.

lay0        The first layer of the neural network, specified by the input data.

lay1        The second layer of the neural network, also known as the hidden layer.

lay2        The third layer of the neural network, which holds the output data.

W0          The first layer of weights (synapse 0), connecting lay0 to lay1.

W1          The second layer of weights (synapses), connecting lay1 to lay2.

*           Element-wise multiplication: two vectors of equal size are multiplied value by value, producing a vector of the same size.

-           Element-wise subtraction: two vectors of equal size are subtracted value by value, producing a vector of the same size.

I.dot(O)    If I and O are vectors, this is a dot product. If both are matrices, it is a matrix-matrix multiplication. If only one of them is a matrix, it is a vector-matrix multiplication.
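The difference between the element-wise operators and dot can be seen in a small example. The arrays a, b and M below are made up for illustration and are not part of the assignment's data.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

elementwise_product = a * b      # [ 4 10 18], same size as a and b
elementwise_difference = a - b   # [-3 -3 -3], same size as a and b
dot_product = a.dot(b)           # 32, two vectors give a single number

M = np.array([[1, 0],
              [0, 1],
              [1, 1]])           # a 3-by-2 matrix
vector_matrix = a.dot(M)         # [4 5], vector times matrix
matrix_matrix = M.T.dot(M)       # [[2 1] [1 2]], matrix times matrix

print(elementwise_product, elementwise_difference, dot_product)
print(vector_matrix)
print(matrix_matrix)
```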

Importing Package

    import numpy as np

This line of code imports the linear algebra library 'numpy'.

    import matplotlib.pyplot as plt

We also import matplotlib.pyplot as plt in order to plot the graph.

The Sigmoid Function

    def nonlin(x, deriv=False):
        if deriv:
            return x * (1 - x)
        return 1 / (1 + np.exp(-x))

The code above implements the sigmoid function. If True is passed for deriv, the derivative of the sigmoid is calculated. A useful property of the sigmoid is that its output can be used to compute its derivative: if the output of the sigmoid is 'out', then the derivative is $\text{out} \times (1 - \text{out})$. For this reason, x must already be a sigmoid output when deriv is True. If False is passed, the derivative is not calculated. The derivative is needed when the error is calculated during backpropagation. The sigmoid function is run in every neuron. It maps any value to a value between 0 and 1 and is defined by the following formula:

$\sigma(x) = \dfrac{1}{1 + e^{-x}}$

It is a mathematical function whose curve is shaped like an 'S', as shown in the figure below:
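The property that the derivative can be computed from the sigmoid's own output can be checked numerically. The sketch below compares out * (1 - out) with a finite-difference estimate of the slope. The function name sigmoid, the sample points and the step size h are choices made for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
out = sigmoid(x)

# Derivative computed from the sigmoid's output alone.
derivative_from_output = out * (1.0 - out)

# Central-difference estimate of the same derivative.
h = 1e-5
derivative_numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)

# Largest disagreement between the two; it should be very small.
print(np.max(np.abs(derivative_from_output - derivative_numeric)))
```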

The Input Data

    I = np.array([[0, 0, 0, 0, 0],
                  [0, 1, 1, 0, 0],
                  [1, 0, 0, 1, 1],
                  [1, 1, 1, 1, 1]])

Here we initialise the input dataset as a numpy matrix. Each row is a training example and each column corresponds to one of the input nodes, so the network has five input nodes and there are four training examples.

The Output Data

    O = np.array([[0, 0, 0],
                  [1, 0, 0],
                  [1, 0, 1],
                  [0, 1, 0]])

This sets the output dataset, laid out as four rows and three columns. Each row represents a training example and each column represents an output node, so the network has five inputs and three outputs.

Seeding

    np.random.seed(1)

Seeding is done so that the random numbers start from the same point each time the neural network runs. This makes it easier to see how changes affect the neural network.
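The effect of seeding can be checked directly: resetting the seed to the same value reproduces the same random draws. The array shape and the names first_run and second_run below are arbitrary choices for this illustration.

```python
import numpy as np

np.random.seed(1)
first_run = np.random.random((2, 3))

np.random.seed(1)
second_run = np.random.random((2, 3))

print(np.array_equal(first_run, second_run))  # True: the draws are identical
```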

Creating Weights

    W0 = 2*np.random.random((5,3)) - 1
    W0 = 2*np.random.random((5,4)) - 1
    W1 = 2*np.random.random((4,3)) - 1

The variable 'W0' represents the first weight matrix, initially a five by three matrix. The weights of the first layer are then generated randomly again to produce a five by four matrix, which replaces the first one and matches the four hidden neurons. The second layer of weights, 'W1', is a four by three matrix. Multiplying by 2 and subtracting 1 moves the random values from the range [0, 1) to the range [-1, 1).

Training

    for j in range(epochs):

A for loop is used to iterate many times in order to fit the network to the dataset. Here we repeatedly feed in our data and update the weights through backpropagation.

Layers

    lay0 = I
    lay1 = nonlin(np.dot(lay0, W0))
    lay2 = nonlin(np.dot(lay1, W1))

lay0 is our input layer, the first layer of the neural network. Each following layer is obtained by a matrix multiplication of the previous layer by the synapse, also known as the weight matrix, and the result is then passed through the sigmoid function.

Backpropagation

    lay2_error = O - lay2
    if (j % 10000) == 0:
        print('Error:' + str(np.mean(np.abs(lay2_error))))

In backpropagation, we try to reduce the error each time the loop runs. The error measures how inaccurate the prediction is: the guess lay2 is subtracted from the true answer O, and the result is stored in lay2_error, which shows how well the network did. The mean absolute error is printed every ten thousand steps to see how well the network is doing.

The following screenshot shows how the error output looks when the program is run. Since the "epochs" variable is set to 60000 and the error is printed every 10000 iterations, a total of six errors should be shown. The errors should get closer to zero as the iterations approach 60000.

Delta Calculations

    lay2_delta = lay2_error * nonlin(lay2, deriv=True)
    lay1_error = lay2_delta.dot(W1.T)
    lay1_delta = lay1_error * nonlin(lay1, deriv=True)

Delta is the amount by which a quantity should change on each pass through the loop. Each layer's delta is its error multiplied element by element by the slope of the sigmoid at that layer's output. The error of the hidden layer is obtained by sending lay2_delta back through W1.

Updating of Weights/Synapses

    W1 += lay1.T.dot(lay2_delta)
    W0 += lay0.T.dot(lay1_delta)

These lines of code calculate the update for each synapse, accumulated over all training examples.

Printing of the Output

    print()
    print('The final output after training: ')
    print(lay2)

Together with the error printed every ten thousand steps, we also print the final output, which is stored in lay2. The result is shown in the following figure.

Plotting the Graph

    plt.plot(lay2_delta, nonlin(lay2, deriv=True), 'ro')
    plt.show()

The final step was to plot the graph. Here we plot the deltas of the final layer against the derivative of the sigmoid at the final layer's output. The resulting graph is shown in the following screenshot, with "lay2_delta" on the x-axis and "nonlin(lay2, deriv=True)" on the y-axis.

Source Code

This section contains the source code of the program built for the neural network, as explained in the previous sections.

import numpy as np
import matplotlib.pyplot as plt

# Number of iterations
epochs = 60000

# Sizes of the input layer, hidden layer and output layer respectively
inputLayerSize, hiddenLayerSize, outputLayerSize = 5, 4, 3

# --- Part 1 ---

# Sigmoid function, run in every neuron.
# If deriv=True, the derivative of the sigmoid is returned instead; in that
# case x must already be a sigmoid output. The derivative is needed when the
# error is calculated in the backpropagation.
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# Input dataset and output dataset (one training example per row)
I = np.array([[0, 0, 0, 0, 0],
              [0, 1, 1, 0, 0],
              [1, 0, 0, 1, 1],
              [1, 1, 1, 1, 1]])

O = np.array([[0, 0, 0],
              [1, 0, 0],
              [1, 0, 1],
              [0, 1, 0]])

# Seed the random numbers so that every run starts at the same point
# (good for debugging)
np.random.seed(1)

# --- Part 2 ---

# Creating weights/synapses
# A five by four matrix; the "- 1" shifts the values from [0, 1) to [-1, 1)
W0 = 2 * np.random.random((5, 4)) - 1
# A four by three matrix, shifted in the same way
W1 = 2 * np.random.random((4, 3)) - 1

# Training: repeatedly feed in the data and update the weights through
# backpropagation
for j in range(epochs):

    # Feed forward through layers 0, 1 and 2
    lay0 = I
    # Matrix multiplication of layer 0 by synapse W0 and of layer 1 by synapse W1
    lay1 = nonlin(np.dot(lay0, W0))
    lay2 = nonlin(np.dot(lay1, W1))

    # Backpropagation: tries to reduce the error each time the loop is run.
    # The error shows where the prediction is bad.
    lay2_error = O - lay2

    # Print every 10000 steps to see how well the network is doing
    if (j % 10000) == 0:
        # np.abs makes every error positive; the mean of these values is
        # then printed as a string
        print("Error: " + str(np.mean(np.abs(lay2_error))))

    # Delta calculations: each layer's error is multiplied element by element
    # by the slope of the sigmoid at that layer's output. Multiplying the
    # slopes by the error reduces the error of high-confidence predictions.
    lay2_delta = lay2_error * nonlin(lay2, deriv=True)
    lay1_error = lay2_delta.dot(W1.T)
    lay1_delta = lay1_error * nonlin(lay1, deriv=True)

    # Updating the weights/synapses
    W1 += lay1.T.dot(lay2_delta)
    W0 += lay0.T.dot(lay1_delta)

print()
print("The final output after training:")
print(lay2)

# --- Part 3 ---

# Plotting the graph
plt.plot(lay2_delta, nonlin(lay2, deriv=True), 'ro')
plt.show()
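The weight updates in the listing above can be read as one step of gradient descent with a step size of 1, under the assumption that the error being minimised is half the sum of squared differences between O and lay2. That loss is not stated in the text. The sketch below is a sanity check rather than part of the assignment: it compares the update terms with a finite-difference estimate of the gradient. The helper names half_squared_error and numeric_gradient, and the renaming of I and O to X and T, are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def half_squared_error(W0, W1, X, T):
    """Assumed loss: half the sum of squared differences between target and output."""
    hidden = sigmoid(X.dot(W0))
    output = sigmoid(hidden.dot(W1))
    return 0.5 * np.sum((T - output) ** 2)

def numeric_gradient(loss, W, h=1e-6):
    """Central-difference estimate of d(loss)/dW, one entry at a time."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        original = W[idx]
        W[idx] = original + h
        loss_plus = loss()
        W[idx] = original - h
        loss_minus = loss()
        W[idx] = original  # restore the weight before moving on
        grad[idx] = (loss_plus - loss_minus) / (2 * h)
    return grad

# Same data, seed and weight shapes as in the listing above.
X = np.array([[0, 0, 0, 0, 0], [0, 1, 1, 0, 0],
              [1, 0, 0, 1, 1], [1, 1, 1, 1, 1]], dtype=float)
T = np.array([[0, 0, 0], [1, 0, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
np.random.seed(1)
W0 = 2 * np.random.random((5, 4)) - 1
W1 = 2 * np.random.random((4, 3)) - 1

# Update terms computed as in the training loop.
lay1 = sigmoid(X.dot(W0))
lay2 = sigmoid(lay1.dot(W1))
lay2_delta = (T - lay2) * lay2 * (1 - lay2)
lay1_delta = lay2_delta.dot(W1.T) * lay1 * (1 - lay1)
update_W1 = lay1.T.dot(lay2_delta)
update_W0 = X.T.dot(lay1_delta)

grad_W0 = numeric_gradient(lambda: half_squared_error(W0, W1, X, T), W0)
grad_W1 = numeric_gradient(lambda: half_squared_error(W0, W1, X, T), W1)

# Each update should match the negative gradient of the assumed loss.
print(np.allclose(update_W0, -grad_W0, atol=1e-6))
print(np.allclose(update_W1, -grad_W1, atol=1e-6))
```

If both checks print True, each update moves the weights directly downhill on the assumed loss, which is why the printed error shrinks as training proceeds.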