In this blog we will implement Artificial Neural Networks for solving classification problems. We will build two ANNs: the first in Keras and the second in PyTorch.

### Keras

Problem Statement: We are given a dataset with 13 different features describing an individual, such as age and sex, and 1 target feature taking the values 0 and 1. We have 303 such examples. The target variable represents whether the person suffers from heart disease: it is 1 if they do and 0 otherwise.

All the features are explained in the table below.

We first read the features into a DataFrame named dataset. We then shuffle the dataset, as the rows with target 0 and target 1 arrive grouped together.
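A minimal sketch of this read-and-shuffle step. The filename `heart.csv` and the columns of the tiny stand-in frame below are assumptions, used only to make the sketch runnable:

```python
import pandas as pd

# In the blog the data comes from a CSV, e.g.:
#   dataset = pd.read_csv('heart.csv')   # filename is an assumption
# For illustration, a tiny stand-in frame with the targets grouped together:
dataset = pd.DataFrame({'age': [63, 37, 41, 56],
                        'target': [0, 0, 1, 1]})

# Shuffle the rows so the 0 and 1 targets are no longer grouped,
# and rebuild a clean 0..n-1 index
dataset = dataset.sample(frac=1, random_state=42).reset_index(drop=True)
```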

The first 13 columns are the features, while the last, i.e. the 14th column, is the target.

The features are on different scales, so we have to scale them.
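Scaling is typically done with scikit-learn's StandardScaler. This sketch uses randomly generated stand-in data of the dataset's shape (303 examples, 13 features); the actual values come from the DataFrame:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in feature matrix: 303 examples, 13 features on very different scales
rng = np.random.default_rng(0)
X = rng.normal(loc=[50, 1] + [100] * 11, scale=5, size=(303, 13))

sc = StandardScaler()
X_scaled = sc.fit_transform(X)  # each column now has mean ~0 and std ~1
```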

Now we will implement the ANN.

First we import the necessary dependencies.

Input Layer

Our input is a vector of 13 features, so we create an input tensor of shape (13,).

We name the input layer i1.

Hidden Layers

We create two hidden layers with 10 nodes each, using the ReLU activation function.

We name the first hidden layer o1 and the second o2.

Output Layer

Since our target is either 0 or 1, we create an output layer with a single node. We apply a sigmoid activation function, as we want the output to be the probability of the target being 1.

We will name the output layer pred.

Model

Now we create an instance of the Model class
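Putting the layer descriptions above together, the model might be sketched with the tf.keras functional API as follows (the layer names and sizes follow the text; the exact import paths are an assumption about the Keras version used):

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Input layer: a vector of 13 features
i1 = Input(shape=(13,))

# Two hidden layers with 10 nodes each and relu activation
o1 = Dense(10, activation='relu')(i1)
o2 = Dense(10, activation='relu')(o1)

# Output layer: a single sigmoid node giving the probability of target 1
pred = Dense(1, activation='sigmoid')(o2)

model = Model(inputs=i1, outputs=pred)
```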

Compiling the model

Now we compile the model, using the Adam optimizer, binary_crossentropy as our loss, and accuracy as our metric.

We need to reshape the target tensor before feeding it into the neural network.

Our current shape of y is (303,); we reshape it to (303, 1).
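A sketch of the reshape, with a random stand-in target vector:

```python
import numpy as np

# Stand-in target vector with the shape from the text: (303,)
y = np.random.randint(0, 2, size=303)

# Reshape to a column vector of shape (303, 1) before feeding it to the network
y = y.reshape(-1, 1)
```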

Now we will fit, i.e. train, the model.

We pass our input X and output y, splitting them into validation and training sets in the ratio 25:75. We train the model for 50 epochs with a batch size of 10, shuffling the data again.
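The compile and fit steps described above might look like this. The data here is random, only to make the sketch self-contained; the real X and y are the scaled features and reshaped targets:

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Rebuild the model described in the preceding sections
i1 = Input(shape=(13,))
o1 = Dense(10, activation='relu')(i1)
o2 = Dense(10, activation='relu')(o1)
pred = Dense(1, activation='sigmoid')(o2)
model = Model(inputs=i1, outputs=pred)

# Adam optimizer, binary cross-entropy loss, accuracy metric
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

# Random stand-in data with the dataset's shapes
X = np.random.rand(303, 13)
y = np.random.randint(0, 2, size=(303, 1))

# 25/75 validation/training split, 50 epochs, batch size 10, shuffled each epoch
history = model.fit(X, y, validation_split=0.25, epochs=50,
                    batch_size=10, shuffle=True, verbose=0)
```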

We kept aside the test set earlier.

Predicting

We make predictions on X_test and store them in y_pred.

Confusion Matrix

We then create a confusion matrix on our test set for further analysis.
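A sketch of the thresholding and confusion-matrix step. The probabilities here are hand-picked stand-ins for what model.predict(X_test) would return:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Stand-in for model.predict(X_test): sigmoid outputs are probabilities in [0, 1]
y_prob = np.array([0.9, 0.2, 0.7, 0.1])   # would be model.predict(X_test).ravel()
y_test = np.array([1, 0, 0, 0])

# Threshold the probabilities at 0.5 to get hard 0/1 predictions
y_pred = (y_prob > 0.5).astype(int)

cm = confusion_matrix(y_test, y_pred)
# cm[0, 0]: true negatives,  cm[0, 1]: false positives
# cm[1, 0]: false negatives, cm[1, 1]: true positives
```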

### PyTorch

Problem Statement: We are given a dataset with different features of a bank's customers. We want to predict whether a customer will leave the bank or stay.

Dataset description:

The dataset has the following attributes:

1. RowNumber

2. CustomerId

3. Surname

4. CreditScore

5. Age

6. Tenure

7. Balance

8. NumOfProducts

9. HasCrCard

10. IsActiveMember

11. EstimatedSalary

12. Exited


We start by importing necessary dependencies:

import numpy as np
import torch
import pandas as pd

dataset = pd.read_csv('Churn_Modelling.csv')

Let’s view and understand our dataset

dataset.head()

All the features in the dataset are shown above. We see that the attributes RowNumber, CustomerId, and Surname are of no significant use to the model, so we will discard them and keep the others.

Now we declare our feature set (X) and target set (y):

X = dataset.iloc[:, 3:11].values
y = dataset.iloc[:, -1].values

We use the index -1 because the target value is at the end of the dataset.

We have 8 features and 10,000 examples in our dataset.

print(X.shape)
print(y.shape)

Now we will split our dataset into training and testing.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

We saw that the feature values in our dataset vary widely in scale, so we need to perform feature scaling. In this step we scale the feature values into a predetermined range, which helps Gradient Descent converge to the global minimum more quickly.

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

Now we will be building the Artificial Neural Network Model.

First start by importing necessary PyTorch libraries:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

Then we declare some hyperparameters of the model:

#hyperparameters
hl = 16           #number of nodes in the hidden layer
lr = 0.01         #learning rate of model
num_epoch = 5000   #number of epochs to train the model

Now we will create the model class:

#build model
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(8, hl)   #input layer: 8 features -> hl hidden nodes
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hl, 2)   #output layer: hl hidden nodes -> 2 classes

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

In the first layer (fc1) we set the input size to 8, as that is the number of available features. We apply the ReLU activation to its output.

The next layer (fc2) is our output layer. Its output size is 2, as we have 2 classes (0 and 1).

Now we create an instance of the model class:

net = Net()

Now we will define our loss function and our optimizer:

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=lr)

Now we start the training phase:

for epoch in range(num_epoch):
    X = Variable(torch.Tensor(X_train).float())
    Y = Variable(torch.Tensor(y_train).long())

    #feedforward - backprop
    optimizer.zero_grad()    #clear the gradients accumulated so far
    out = net(X)
    loss = criterion(out, Y)
    loss.backward()
    optimizer.step()

    if epoch % 50 == 0:
        print('Epoch [%d/%d] Loss: %.4f'
              % (epoch + 1, num_epoch, loss.item()))

First we convert X_train and y_train to PyTorch tensors. Then we set the gradients to zero with zero_grad(), as PyTorch accumulates gradients on subsequent backward passes. We make a prediction and store it in out, and calculate the loss on it using the loss function specified earlier. Then, using the backward() function, we backpropagate through the Neural Network, and the step() function updates the parameters using the calculated gradients. We print the loss after every 50th epoch.

Now we will test our ANN on the test set that we made before:

First we convert our test arrays to PyTorch tensors.

X_test = Variable(torch.Tensor(X_test).float())
y_test = Variable(torch.Tensor(y_test).long())

Then we make a prediction tensor and store it in output

outputs = net(X_test)
_, predicted = torch.max(outputs.data, 1)

The torch.max function returns the maximum value along a dimension together with its index; here we take the indices along dimension 1, which are the predicted classes.

Hence predicted contains 0s and 1s. Now we will evaluate it.
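For instance, on a small tensor of scores:

```python
import torch

# Two rows of raw class scores, one per example
scores = torch.tensor([[0.2, 1.5],
                       [2.0, -1.0]])

# torch.max along dim 1 returns (max values, indices of the maxima);
# the indices serve as the predicted class labels
values, predicted_classes = torch.max(scores, 1)
```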

total = y_test.size(0)
correct = (predicted == y_test).sum().item()
print('Accuracy of the network on the test set: {}%'
      .format(100 * correct / total))


So we got a test accuracy of 85%.
