In this blog we will implement Artificial Neural Networks to solve two classification problems: the first in Keras and the second in PyTorch.
Keras
Problem Statement: We are given a dataset with 13 features describing an individual, such as age and sex, plus one target feature taking the values 0 and 1. There are 303 such examples. The target variable indicates whether the person suffers from heart disease: it is 1 if they do and 0 otherwise.
All the features are explained in the table below.
Feature | Description
age | age in years
sex | 1 = male; 0 = female
cp | chest pain type
trestbps | resting blood pressure (in mm Hg on admission to the hospital)
chol | serum cholesterol in mg/dl
fbs | fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
restecg | resting electrocardiographic results
thalach | maximum heart rate achieved
exang | exercise induced angina (1 = yes; 0 = no)
oldpeak | ST depression induced by exercise relative to rest
slope | the slope of the peak exercise ST segment
ca | number of major vessels (0-3) colored by fluoroscopy
thal | 3 = normal; 6 = fixed defect; 7 = reversible defect
target | 1 or 0
We first read and store the features in a DataFrame named dataset. We then shuffle the dataset as the 0 targets and 1 targets are grouped together.
The first 13 columns are the features, while the last one, i.e. the 14th column, is the target.
The features have different scales, so we have to scale them. A sketch of the full preprocessing (loading, shuffling, splitting, and scaling) is shown below.
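This is a minimal sketch of these preprocessing steps; the file name heart.csv, the 80/20 train/test split, and the use of StandardScaler are my assumptions, not necessarily the author's exact code.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# load the data and shuffle the rows (file name is assumed)
dataset = pd.read_csv('heart.csv')
dataset = dataset.sample(frac=1, random_state=0).reset_index(drop=True)

X = dataset.iloc[:, :13].values   # first 13 columns: features
y = dataset.iloc[:, 13].values    # 14th column: target

# keep a test set aside for later evaluation (split ratio is assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# scale the features
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)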
Now we will implement the ANN.
First we import the necessary dependencies.
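For the functional-API model built in the next sections, the imports would look like this (a sketch; the standalone keras package is assumed, tensorflow.keras works the same way):

from keras.layers import Input, Dense
from keras.models import Model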
Input Layer
Our input is a vector of 13 features, so we create an input tensor of shape (13,). We name the input layer i1.
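A sketch of that layer, using the shape and name from the text:

i1 = Input(shape=(13,))   # one node per feature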
Hidden Layers
We will create two hidden layers with 10 nodes each, using the ReLU activation function. We name the first hidden layer o1 and the second o2.
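A sketch of the two hidden layers (the sizes, activation, and names follow the text):

o1 = Dense(10, activation='relu')(i1)   # first hidden layer
o2 = Dense(10, activation='relu')(o1)   # second hidden layer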
Output Layer
Since our target is either 0 or 1, we create an output layer with a single node. We apply a sigmoid activation function so the output can be read as the probability that the target is 1. We name the output layer pred.
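A sketch of the output layer:

pred = Dense(1, activation='sigmoid')(o2)   # probability that the target is 1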
Model
Now we create an instance of the Model class, connecting the input layer to the output layer.
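A sketch, wiring the input tensor to the prediction tensor:

model = Model(inputs=i1, outputs=pred)
model.summary()   # optional: inspect the architecture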
Compiling the model
Now we compile the model, using the Adam optimizer, binary_crossentropy as the loss, and accuracy as the metric.
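A sketch of the compile call with these settings:

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])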
We need to reshape the target array before feeding it into the neural network. Its current shape is (303,) for the full dataset, while the network's single output node expects a column vector of shape (n, 1).
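The reshape is a NumPy one-liner, assuming the y_train array from the preprocessing sketch above:

y_train = y_train.reshape(-1, 1)   # (n_samples,) -> (n_samples, 1)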
Now we fit (train) the model. We pass the input X and the target y, splitting them into validation and training sets in a 25:75 ratio. We train for 50 epochs with a batch size of 10, shuffling the data again.
We kept aside the test set earlier.
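A sketch of the fit call matching these settings (the exact arguments are my interpretation of the text):

history = model.fit(X_train, y_train,
                    validation_split=0.25,   # 25:75 validation/training split
                    epochs=50,
                    batch_size=10,
                    shuffle=True)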
Predicting
We make predictions on X_test and store them in y_pred.
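A sketch of the prediction step; thresholding at 0.5 to turn the probabilities into class labels is my assumption:

y_pred = model.predict(X_test)                 # predicted probabilities, shape (n, 1)
y_pred = (y_pred > 0.5).astype(int).ravel()    # threshold at 0.5 -> 0/1 labels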
Confusion Matrix
We then create a confusion matrix on our test set for further analysis.
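A sketch using scikit-learn's confusion_matrix:

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, y_pred)
print(cm)   # rows: true class, columns: predicted class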
PyTorch
Problem Statement: We are given a dataset with different features describing a bank's customers. We want to predict whether a customer will leave the bank or stay.
Dataset description:
The dataset has the following attributes:
1. RowNumber
2. CustomerId
3. Surname
4. CreditScore
5. Age
6. Tenure
7. Balance
8. NumOfProducts
9. HasCrCard
10. IsActiveMember
11. EstimatedSalary
12. Exited
We want to predict whether a customer will leave the bank or stay.
We start by importing necessary dependencies:
import numpy as np
import torch
import pandas as pd
We load our dataset
dataset = pd.read_csv('Churn_Modelling.csv')
Let’s view and understand our dataset
dataset.head()
All the features in the dataset are shown above. We see that the attributes RowNumber, CustomerId and Surname are of no significant use to the model, so we will discard them and keep the others.
Now we declare our feature set (X) and target set (y):
X = dataset.iloc[:, 3:11].values
y = dataset.iloc[:, -1].values
We use index -1 because the target column is at the end of the dataset.
We have 8 features and 10000 training examples in our dataset.
print(X.shape)
print(y.shape)
Now we will split our dataset into training and testing.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
We saw that the feature values in our dataset span very different ranges, so we need to perform feature scaling. Scaling brings all features into a comparable range, which helps gradient descent converge to the minimum more quickly.
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
Now we will be building the Artificial Neural Network Model.
We start by importing the necessary PyTorch libraries:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
Then we declare some hyperparameters of the model:
#hyperparameters
hl = 10 #number of nodes in hidden layer
lr = 0.01 #learning rate of model
num_epoch = 5000 #number of epochs to train the model
Now we will create the model class:
#build model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(8, hl)   # input layer: 8 features -> hl hidden nodes
        self.relu = nn.ReLU()         # activation for the hidden layer
        self.fc2 = nn.Linear(hl, 2)   # output layer: hl hidden nodes -> 2 classes

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out
In the first layer, we set the input size to 8, the number of available features, and the output size to hl, the number of hidden nodes. We apply the ReLU activation to this hidden layer. The next layer is our output layer; its size is 2 as we have 2 classes (0 and 1).
Now we create an instance of the model class:
net = Net()
Now we will define our loss function and our optimizer:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=lr)
Now we start the training phase:
for epoch in range(num_epoch):
    X = Variable(torch.Tensor(X_train).float())
    Y = Variable(torch.Tensor(y_train).long())

    #feedforward - backprop
    optimizer.zero_grad()
    out = net(X)
    loss = criterion(out, Y)
    loss.backward()
    optimizer.step()

    if epoch % 50 == 0:
        print('Epoch [%d/%d] Loss: %.4f'
              % (epoch + 1, num_epoch, loss.item()))
First we convert X_train and y_train to PyTorch tensors. Then we set the gradients to zero, as PyTorch accumulates gradients on subsequent backward passes. We make predictions and store them in out, and calculate the loss using the loss function specified earlier. The backward() call backpropagates through the network, and step() updates the parameters using the calculated gradients. Finally, we print the loss every 50th epoch.
Now we will test our ANN on the test set that we made before:
First we convert our test arrays to PyTorch tensors.
X_test = Variable(torch.Tensor(X_test).float())
y_test = Variable(torch.Tensor(y_test).float())
Then we compute the prediction tensor and store it in outputs.
outputs = net(X_test)
_, predicted = torch.max(outputs.data, 1)
The torch.max function returns both the maximum value and its index along the given dimension; we keep only the index, so predicted contains 0s and 1s. Now we evaluate the predictions.
total = y_test.size(0)
correct = (predicted == y_test.long()).sum().item()
print('Accuracy of the network on the {} test examples: {:.2f}%'
      .format(total, 100 * correct / total))
So we got a test accuracy of 85%.