In this chapter, we will learn how to train the neural network in CNTK.
In the previous section, we defined all the components of the deep learning model. Now it is time to train it. As we discussed earlier, we can train a NN model in CNTK using a combination of a learner and a trainer.
In this section, we will define the learner. CNTK provides several learners to choose from. For the model defined in the previous sections, we will use the Stochastic Gradient Descent (SGD) learner.
In order to train the neural network, let us configure the learner and the trainer with the help of the following steps −
Step 1 − First, we need to import the sgd function from the cntk.learners package.
from cntk.learners import sgd
Step 2 − Next, we need to import the Trainer class from the cntk.train.trainer module.
from cntk.train.trainer import Trainer
Step 3 − Now, we need to create a learner. It can be created by invoking the sgd function and providing the model’s parameters and a value for the learning rate.
learner = sgd(z.parameters, 0.01)
Step 4 − At last, we need to initialize the trainer. It must be provided with the network, the combination of the loss and the metric, along with the learner.
trainer = Trainer(z, (loss, error_rate), [learner])
The learning rate, which controls the speed of optimisation, should be a small number, typically between 0.001 and 0.1.
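As an illustrative sketch (the value 0.001 is only an assumed example, not a recommendation from this tutorial), a smaller learning rate can be configured in exactly the same way −

learner = sgd(z.parameters, lr=0.001)
trainer = Trainer(z, (loss, error_rate), [learner])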
The complete code for configuring the learner and the trainer is given below −

from cntk.learners import sgd
from cntk.train.trainer import Trainer

learner = sgd(z.parameters, 0.01)
trainer = Trainer(z, (loss, error_rate), [learner])
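CNTK also provides other learners, such as momentum_sgd and adam, in the cntk.learners package. As an illustrative sketch (the momentum value 0.9 is an assumption, and depending on your CNTK version the lr and momentum arguments may need to be wrapped in schedule objects), a momentum-based learner could be swapped in as follows −

from cntk.learners import momentum_sgd

# Same setup as before, only the learner changes (assumed hyperparameter values).
learner = momentum_sgd(z.parameters, lr=0.01, momentum=0.9)
trainer = Trainer(z, (loss, error_rate), [learner])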
Once we have chosen and configured the learner and the trainer, it is time to load the dataset. We have saved the iris dataset as a .csv file, and we will be using the data wrangling package named pandas to load it.
Step 1 − First, we need to import the pandas package.
import pandas as pd
Step 2 − Now, we need to invoke the read_csv function to load the .csv file from the disk.
df_source = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'], index_col=False)
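As an optional, illustrative check (not part of the original steps), you can print the first few rows to confirm that the file was parsed correctly −

print(df_source.head())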
Once we load the dataset, we need to split it into a set of features and a label.
Step 1 − First, we need to select all rows and the first four columns from the dataset. This can be done by using the iloc indexer.
x = df_source.iloc[:, :4].values
Step 2 − Next, we need to select the species column from the iris dataset and store it in y. We will be using the values property to access the underlying numpy array.
y = df_source['species'].values
As we discussed earlier, since our model performs classification, it requires numeric input values. Hence, we need to encode the species column into a numeric vector representation. Let us see the steps to do it −
Step 1 − First, we need to create the label_mapping dictionary that maps each species name to a numeric index. Later, a list comprehension will iterate over all elements in the array and look up each value in this dictionary.
label_mapping = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}
Step 2 − Next, convert each of these numeric values into a one-hot encoded vector. We will be using the one_hot function defined as follows −
import numpy as np

def one_hot(index, length):
    result = np.zeros(length)
    result[index] = 1
    return result
Step 3 − At last, we need to turn this converted list into a numpy array.
y = np.array([one_hot(label_mapping[v], 3) for v in y])
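To see what this encoding produces, consider a single species value (an illustrative example) −

print(one_hot(label_mapping['Iris-versicolor'], 3))
# prints [0. 1. 0.], i.e. the position for class index 1 is set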
Overfitting is the situation in which your model remembers the training samples but cannot deduce general rules from them. With the help of the following steps, we can detect overfitting in our model −
Step 1 − First, import the train_test_split function from the model_selection module of the sklearn package.
from sklearn.model_selection import train_test_split
Step 2 − Next, we need to invoke the train_test_split function with features x and labels y as follows −
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, stratify=y)
We specified a test_size of 0.2 to set aside 20% of the total data. The stratify=y argument keeps the class proportions the same in both subsets.
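As an optional sanity check (illustrative only), you can print the shapes of the resulting arrays; with the 150-sample iris dataset, a 0.2 split leaves 120 samples for training and 30 for testing −

print(x_train.shape, x_test.shape)
# (120, 4) (30, 4)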
Step 1 − In order to train our model, first, we will invoke the train_minibatch method and give it a dictionary that maps the input data to the input variables that we used to define the NN and its associated loss function.
trainer.train_minibatch({ features: x_train, label: y_train})
Step 2 − Next, call train_minibatch by using the following for loop −
for _epoch in range(10):
    trainer.train_minibatch({ features: x_train, label: y_train})
    print('Loss: {}, Acc: {}'.format(
        trainer.previous_minibatch_loss_average,
        trainer.previous_minibatch_evaluation_average))
The complete code for loading the data and training the model is given below −

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

df_source = pd.read_csv('iris.csv', names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'], index_col=False)

x = df_source.iloc[:, :4].values
y = df_source['species'].values

label_mapping = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}

def one_hot(index, length):
    result = np.zeros(length)
    result[index] = 1
    return result

y = np.array([one_hot(label_mapping[v], 3) for v in y])

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, stratify=y)

trainer.train_minibatch({ features: x_train, label: y_train})

for _epoch in range(10):
    trainer.train_minibatch({ features: x_train, label: y_train})
    print('Loss: {}, Acc: {}'.format(
        trainer.previous_minibatch_loss_average,
        trainer.previous_minibatch_evaluation_average))
Whenever we pass data through the trainer in order to optimise our NN model, it measures the performance of the model through the metric that we configured for the trainer. This measurement during training is based on the training data; for a full analysis of the model performance, we need to use test data as well.
So, to measure the performance of the model using the test data, we can invoke the test_minibatch method on the trainer as follows −
trainer.test_minibatch({ features: x_test, label: y_test})
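The test_minibatch method returns the average value of the configured metric (here, the error rate) over the samples passed to it, so the result can also be captured and printed (an illustrative sketch) −

test_error = trainer.test_minibatch({ features: x_test, label: y_test})
print('Test error rate: {}'.format(test_error))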
Once you have trained a deep learning model, the most important thing is to make predictions with it. In order to make predictions from the above trained NN, we can follow the given steps −
Step 1 − First, we need to pick a random item from the test set using the following function −
np.random.choice
Step 2 − Next, we need to select the sample data from the test set by using sample_index.
Step 3 − Now, in order to convert the numeric output of the NN to an actual label, create an inverted mapping.
Step 4 − Now, use the selected sample data to make a prediction by invoking the NN z as a function.
Step 5 − Now, once you got the predicted output, take the index of the neuron that has the highest value as the predicted value. It can be done by using the np.argmax function from the numpy package.
Step 6 − At last, convert the index value into the real label by using inverted_mapping.
sample_index = np.random.choice(x_test.shape[0])
sample = x_test[sample_index]

inverted_mapping = {
    0: 'Iris-setosa',
    1: 'Iris-versicolor',
    2: 'Iris-virginica'
}

prediction = z(sample)
predicted_label = inverted_mapping[np.argmax(prediction)]

print(predicted_label)
After training the above deep learning model and running the prediction code, you will get an output similar to the following −
Iris-versicolor
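To check the prediction against the ground truth (an illustrative addition, not part of the original steps), the actual label of the same sample can be looked up from y_test in the same way −

actual_label = inverted_mapping[np.argmax(y_test[sample_index])]
print('Predicted: {}, Actual: {}'.format(predicted_label, actual_label))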