Computational Methods for Data Analysis – 2015/16

Lab 6: Neural Networks in R

In this lab we’ll see how to train neural nets for classification and look at various issues to take into account.

1. Training a perceptron over the cheese dataset

In this first part of the lab we will train a perceptron using the package nnet.[1]

Loading nnet

You should start by loading the nnet library into your R environment using library (and installing the package using install.packages if you haven’t installed it yet).
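
For instance (a minimal sketch; the install.packages line is only needed the first time):

install.packages("nnet")   # only needed if the package is not yet installed
library(nnet)              # load the package into the current session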

Loading the cheese dataset

In the first exercise we’ll use the cheese dataset, containing information about taste, acetic component, lactic component, etc of several types of cheese.

Exercise: Download the cheese data from the Lab6 directory on the course website.

Exercise: Load the cheese data into a variable called cheese in your workspace.
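
One possible way to do this, assuming the downloaded file is called cheese.csv and sits in your working directory (adjust the file name and separator to match the file you actually downloaded):

cheese <- read.csv("cheese.csv")   # hypothetical file name; use the one you downloaded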

Exploring the data

Exercise: Use the commands

cheese

names(cheese)

summary(cheese)

to get a feel for the data.

Create and analyze a perceptron

With the package nnet, you fit a neural net using the function nnet(). nnet() takes as arguments

- a formula specifying output and input variables (just as in linear regression)

- a parameter data specifying the data frame to be used

- a parameter size specifying the number of hidden units

- a parameter skip specifying whether the inputs are directly connected with the outputs or not

- a parameter linout specifying whether a linear activation function is desired.

(see http://cran.r-project.org/web/packages/nnet/nnet.pdf for full details of all the arguments)

In this first part of the lab we will use nnet to make predictions with no hidden layer and a linear activation function.

For example, the following command creates a perceptron with three input variables and one output variable, using a linear activation function instead of the logistic function.

> fitnn1 = nnet(taste ~ Acetic + H2S + Lactic, cheese, size=0, skip=TRUE, linout=TRUE)

> summary(fitnn1)

In the output of summary(), i1, i2 and i3 are the input nodes (so Acetic, H2S and Lactic respectively); o is the output node (taste); and b is the bias. The values under the links are the weights.
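
To see how these weights are used, you can reconstruct the prediction for the first cheese by hand. This is a sketch which assumes the weights in fitnn1$wts appear in the same order as in the summary output (bias first, then the three inputs):

w <- fitnn1$wts                                            # assumed order: b, i1 (Acetic), i2 (H2S), i3 (Lactic)
x <- cheese[1, ]
w[1] + w[2] * x$Acetic + w[3] * x$H2S + w[4] * x$Lactic    # bias + weighted sum of the inputs
predict(fitnn1, x)                                         # should agree with the value above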

Evaluating the predictions of a perceptron

Let us now use the J cost function implemented in Lab 1 to evaluate the model:

J = function(model) {
  # half of the mean squared residual, as defined in Lab 1
  return(mean(residuals(model) ^ 2) / 2)
}

Exercise: Compute J for fitnn1.

Let us now compare this to the fit we get by doing standard least squares regression:

fitlm = lm(taste ~ Acetic + H2S + Lactic, cheese)

summary(fitlm)

Exercise: Compute J for fitlm.
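
If everything went well, the two values should be very close: a perceptron with no hidden layer, skip connections and a linear output is essentially a linear regression, just fitted by a different optimizer. A quick check, using the J function defined above:

J(fitnn1)   # cost of the perceptron
J(fitlm)    # cost of the least squares fit -- should be very similar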

2. Adding a Hidden Layer With a Single Node

We now add a hidden layer, consisting of a single node, between the input and output neurons.

> fitnn3 = nnet(taste ~ Acetic + H2S + Lactic, cheese, size=1, linout=TRUE)

During backpropagation you should see values like the following:

# weights: 6

initial value 25372.359476

final value 7662.886667

converged
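
The exact values you see will differ from run to run: nnet initialises the weights at random, so repeated fits can converge to different local minima. If you want a reproducible result you can fix the random seed before fitting, for example:

set.seed(1)   # any fixed seed makes the random initial weights reproducible
fitnn3 = nnet(taste ~ Acetic + H2S + Lactic, cheese, size=1, linout=TRUE)
J(fitnn3)     # compare with the perceptron and the linear model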

You can visualize the structure of the network created by nnet using the function plot.nnet(), which is loaded from a GitHub gist with the help of the devtools library:

library(devtools)

The following incantation then downloads and defines plot.nnet():

source_url('https://gist.githubusercontent.com/fawda123/7471137/raw/466c1474d0a505ff044412703516c34f1a4684a5/nnet_plot_update.r')

We can now plot the architecture of fitnn3 with the following command:

plot.nnet(fitnn3)

(NB plot.nnet() doesn’t seem to work with perceptrons)

3. Prediction with Neural Nets

In this exercise we are going to use part of our data for training, part for testing.

First of all, download the Titanic.csv dataset and load it into a variable called data. This dataset provides information on the fate of the passengers on the maiden voyage of the ocean liner ‘Titanic’. Explore the dataset.
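
One possible way to load it, assuming the file is saved as Titanic.csv in your working directory (adjust the path if you saved it elsewhere):

data <- read.csv("Titanic.csv")   # hypothetical path; use the location of your download
summary(data)                     # explore the dataset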

Then split the dataset into a training set consisting of 2/3 of the data and a test set consisting of the other 1/3, using the method seen in Lab 4.

n <- length(data[,1])                          # number of rows in the dataset
indices <- sort(sample(1:n, round(2/3 * n)))   # random sample of 2/3 of the row indices
train <- data[indices,]                        # training set
test <- data[-indices,]                        # test set (the remaining rows)

The nnet package requires that the target variable of the classification (in this case, Survived) be a two-column matrix: one column for No, the other for Yes, with a 1 in the No column for the items to be classified as No and a 1 in the Yes column for the items to be classified as Yes:

     No Yes
Yes   0   1
Yes   0   1
No    1   0
Yes   0   1

Etc. Luckily, the class.ind utility function in nnet does this for us:

train$Surv = class.ind(train$Survived)

test$Surv = class.ind(test$Survived)

(Use head to see what the result looks like on train, for instance.) We can now use train to fit a neural net. The softmax parameter specifies a softmax output activation (equivalent to the logistic function for two classes), so the outputs can be read as class probabilities.

fitnn = nnet(Surv~Sex+Age+Class, train, size=1, softmax=TRUE)

fitnn

summary(fitnn)

And then test its performance on test:

table(data.frame(predicted=predict(fitnn, test)[,2] > 0.5,
                 actual=test$Surv[,2] > 0.5))
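
The resulting table is a confusion matrix: the diagonal entries count the passengers whose survival was predicted correctly. As a rough follow-up (a sketch using the same variables as above), you can turn it into an overall accuracy:

predicted <- predict(fitnn, test)[,2] > 0.5   # TRUE when the net predicts 'Yes'
actual <- test$Surv[,2] > 0.5                 # TRUE when the passenger actually survived
mean(predicted == actual)                     # proportion of correct predictions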

[1] This part of the lab follows closely the presentation in

http://www.louisaslett.com/Courses/Data_Mining/ST4003-Lab5-Introduction_to_Neural_Networks.pdf