A neural network is a computational system that tries to mimic the way the human mind works. Neural networks are used mostly for classification, clustering, and regression problems, with applications in almost every field. Neural networks are an active field of research and innovation, and together with artificial intelligence they are shaping a new era through their immense applications in medicine, finance, entertainment, and other fields.

The first artificial neural network was invented in 1958 by Frank Rosenblatt. At the time it did not gain much popularity, owing to the limited computing power then available, but neural networks surged again with the advancement of computer hardware.

Neural networks are basically interconnected graphs with multiple units called **nodes**. Just like a brain, a neural network has many neurons connected to each other: its units, or nodes, are interconnected and divided into *layers*. This computational system of neurons is called an **Artificial Neural Network**, or *ANN*.

The ANN learns by itself by modifying some of its values, called *weights* and *bias*. The units of an ANN are arranged in a series of layers. The first layer of an ANN is called the **input layer**, the last layer is called the **output layer**, and the intermediate layers are called **hidden layers**. There are unique values called *weights* that relate each node of the current layer to each node of the adjacent layer, and each node also has a value called its *bias*. The neural network learns by itself; we will understand this line completely once we study the working of the ANN, but for now we just need the intuition that an ANN modifies its *weights and bias* values to get an accurate and precise output.

The circles represent nodes and the black arrows represent connections between them.

(Image source: geekforgeeks.com)

w1, w2, w3, … represent weight values, and b1 and b2 represent the bias values of the nodes.

**WORKING**

The working of a neural network can be divided into two methods: forward propagation and backward propagation.

**FORWARD PROPAGATION**

In forward propagation (also called a forward pass), we traverse from the input layer to the output layer and then compare our result with the actual output. During this pass we calculate the value of each node. This is done by the formula:

output = activation(w · x + b)

Here, *activation* is the activation function applied, *w* is the weight of the node, *x* is the input, and *b* is the bias term. Activation functions will be discussed later.
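As a minimal sketch of this formula, here is how a single node's value could be computed; the weight, input, and bias numbers are made-up illustrative values, and sigmoid is just one possible choice of activation:

```python
import numpy as np

def sigmoid(z):
    # one common choice of activation function
    return 1.0 / (1.0 + np.exp(-z))

def node_output(w, x, b, activation=sigmoid):
    # node value = activation(w . x + b)
    return activation(np.dot(w, x) + b)

# a node with two inputs (illustrative values)
print(node_output(np.array([0.4, -0.2]), np.array([1.0, 0.0]), 0.1))  # ≈ 0.6225
```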

**BACKPROPAGATION**

In backpropagation we traverse back from the output layer to the input layer, updating our weight values. Each weight is updated by subtracting the derivative of the cost with respect to that weight, scaled by the learning rate. The goal of backpropagation is to minimize the *loss*, defined as the absolute difference between the predicted value and the original value.

**ACTIVATION FUNCTION**: We use an activation function in an ANN to introduce non-linearity.

**The activation function decides whether a neuron should be activated or not, based on the weighted sum of its inputs.**

There are various types of activation functions. Some famous ones are:

- ReLU

- Tanh

- Sigmoid

- Leaky ReLU
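A quick sketch of these four functions; the 0.01 slope used for Leaky ReLU below is a common default, not something fixed by the text:

```python
import numpy as np

def relu(z):
    # passes positive values through, zeroes out negatives
    return np.maximum(0, z)

def tanh(z):
    # squashes values into (-1, 1)
    return np.tanh(z)

def sigmoid(z):
    # squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def leaky_relu(z, slope=0.01):
    # like ReLU, but lets a small signal through for negative inputs
    return np.where(z > 0, z, slope * z)

print(relu(-2.0), tanh(0.0), sigmoid(0.0), leaky_relu(-2.0))  # 0.0 0.0 0.5 -0.02
```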

*Example*

Let’s take a simple example of a neural network which predicts the XOR value of given inputs. This ANN will have two units in the input layer, one hidden layer with three units, and one output layer with a single unit. We will use a **sigmoid** activation function.

Now let’s make our own dataset:

| X1 | X2 | Y |
|----|----|---|
| 0  | 0  | 0 |
| 0  | 1  | 1 |
| 1  | 0  | 1 |
| 1  | 1  | 0 |

Now let’s perform the forward pass. We calculate the value of the first node of the hidden layer using the formula described earlier.

Likewise we do this for all the units to find y1, y2, and y3.

Then we calculate the neural network’s output as:
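To make this concrete, here is a sketch of the full forward pass of the 2-3-1 network on one input; the weight and bias values are made-up starting values, not ones given in the text:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x  = np.array([0.0, 1.0])                # one training example: X1=0, X2=1
W1 = np.array([[0.2, -0.4,  0.7],
               [0.5,  0.1, -0.3]])       # input -> hidden weights (made up)
b1 = np.array([0.0, 0.0, 0.0])           # hidden-layer biases
W2 = np.array([[0.6], [-0.2], [0.4]])    # hidden -> output weights (made up)
b2 = np.array([0.0])                     # output bias

y_hidden = sigmoid(x @ W1 + b1)          # y1, y2, y3 of the hidden layer
y_out    = sigmoid(y_hidden @ W2 + b2)   # the network's final output
print(y_hidden, y_out)
```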

Now comes the main part, the backward propagation.

We compare our final output y with the actual output and find the difference between them, called the *loss*.

Now we backpropagate through the neural network and update our weight values as follows:

where m denotes the number of training examples; in our case it is 4.

We then find the derivative of the cost function with respect to each parameter:

We then update our weight and bias values as follows:
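The update rule, sketched for a single weight and a single bias; all numbers here are illustrative, not values from the example:

```python
alpha = 0.01          # learning rate (illustrative)
w, dw = 0.5, 0.2      # a weight and its gradient dJ/dw
b, db = 0.1, -0.3     # a bias and its gradient dJ/db

# gradient descent step: move each parameter against its gradient
w = w - alpha * dw
b = b - alpha * db
print(w, b)   # ≈ 0.498 0.103
```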

**α** here is called the **learning rate**. The learning rate is a number chosen to decide the step size of *gradient descent*. We will study *gradient descent* later, but for now you need to understand that it should be neither too big nor too small. Typical values of the *learning rate* are 0.001, 0.01, 0.0001, etc.

We then perform forward and backward propagation for a number of cycles called **epochs**. We slowly see that our neural network’s loss **J** decreases steadily as the number of epochs increases.
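Putting the whole XOR example together, here is one way it could be sketched end to end; the random initialization, learning rate, and epoch count are all assumptions, since the text does not fix them:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# the XOR dataset from the table above
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 3))     # input -> hidden (3 units)
b1 = np.zeros((1, 3))
W2 = rng.normal(size=(3, 1))     # hidden -> output (1 unit)
b2 = np.zeros((1, 1))

alpha = 0.5                      # learning rate (assumed)
m = X.shape[0]                   # number of training examples (4)

for epoch in range(10000):       # epochs (assumed count)
    # forward propagation
    A1 = sigmoid(X @ W1 + b1)            # hidden-layer values y1..y3
    A2 = sigmoid(A1 @ W2 + b2)           # network output

    # backward propagation: gradients of the cost w.r.t. each parameter
    dZ2 = A2 - Y
    dW2 = A1.T @ dZ2 / m
    db2 = dZ2.sum(axis=0, keepdims=True) / m
    dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)   # chain rule through the sigmoid
    dW1 = X.T @ dZ1 / m
    db1 = dZ1.sum(axis=0, keepdims=True) / m

    # gradient descent updates
    W1 -= alpha * dW1; b1 -= alpha * db1
    W2 -= alpha * dW2; b2 -= alpha * db2

pred = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print((pred > 0.5).astype(int).ravel())
```

With these settings the network usually learns to output 0, 1, 1, 0, but a different seed or learning rate can slow convergence and need more epochs.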
