So far, we have seen how to create a network and a dataset. To make a network learn from a dataset, we combine the two using a trainer.

Below is a working example that adds a dataset to the network we created, then trains and tests the network using a trainer.
```python
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import TanhLayer
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

# Create a network with two inputs, three hidden neurons, and one output
nn = buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer)

# Create a dataset that matches network input and output sizes
norgate = SupervisedDataSet(2, 1)

# Create a dataset to be used for testing
nortrain = SupervisedDataSet(2, 1)

# Add input and target values to the training dataset
# (values from the NOR truth table)
norgate.addSample((0, 0), (1,))
norgate.addSample((0, 1), (0,))
norgate.addSample((1, 0), (0,))
norgate.addSample((1, 1), (0,))

# Add input and target values to the testing dataset
# (values from the NOR truth table)
nortrain.addSample((0, 0), (1,))
nortrain.addSample((0, 1), (0,))
nortrain.addSample((1, 0), (0,))
nortrain.addSample((1, 1), (0,))

# Train the network with the dataset norgate
trainer = BackpropTrainer(nn, norgate)

# Run the training loop 1000 times
for epoch in range(1000):
    trainer.train()

# Test the trained network on the testing dataset
trainer.testOnData(dataset=nortrain, verbose=True)
```
To train and test the network, we need BackpropTrainer. BackpropTrainer is a trainer that trains the parameters of a module according to a supervised dataset (potentially sequential) by backpropagating the errors (through time).
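To see what backpropagation does under the hood, here is a minimal, pure-Python sketch of gradient descent for the same 2-3-1 architecture (tanh hidden layer, linear output, bias folded in as an extra weight). This is our simplified illustration, not PyBrain's actual implementation — no momentum, no weight decay, just plain per-sample updates:

```python
import math
import random

random.seed(0)

n_in, n_hid = 2, 3
# Hidden-layer weights, one row per hidden neuron (+1 column for the bias)
w1 = [[random.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hid)]
# Output-layer weights (+1 for the bias)
w2 = [random.uniform(-1, 1) for _ in range(n_hid + 1)]

def forward(x):
    # tanh hidden activations, then a linear output
    h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
    out = sum(w * v for w, v in zip(w2, h + [1.0]))
    return h, out

def train_step(samples, lr=0.05):
    """One epoch of backprop; returns the mean error (0.5 * squared error)."""
    total = 0.0
    for x, target in samples:
        h, out = forward(x)
        err = out - target
        total += 0.5 * err ** 2
        # Hidden deltas must use the output weights *before* they are updated
        deltas = [err * w2[i] * (1 - h[i] ** 2) for i in range(n_hid)]
        # Output-layer update (linear activation, so the gradient is just err * input)
        for j, v in enumerate(h + [1.0]):
            w2[j] -= lr * err * v
        # Hidden-layer update (tanh derivative is 1 - h^2)
        for i in range(n_hid):
            for j, v in enumerate(x + [1.0]):
                w1[i][j] -= lr * deltas[i] * v
    return total / len(samples)

# The NOR truth table as (input, target) pairs
nor_samples = [([0.0, 0.0], 1.0), ([0.0, 1.0], 0.0),
               ([1.0, 0.0], 0.0), ([1.0, 1.0], 0.0)]
errors = [train_step(nor_samples) for _ in range(1000)]
print(errors[0], errors[-1])  # the error shrinks as training proceeds
```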
We have created two datasets of class SupervisedDataSet. We are making use of the NOR data model, which is as follows −
| A | B | A NOR B |
|---|---|---------|
| 0 | 0 | 1 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 0 |
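The truth table above can be reproduced with a one-line Python function (the name `nor` is our own, not part of PyBrain):

```python
def nor(a, b):
    # NOR is true only when both inputs are false
    return int(not (a or b))

# Reproduce the truth table used to build the dataset
table = [(a, b, nor(a, b)) for a in (0, 1) for b in (0, 1)]
print(table)  # [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
```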
The above data model is used to train the network.
```python
norgate = SupervisedDataSet(2, 1)

# Add input and target values to the dataset
# (values from the NOR truth table)
norgate.addSample((0, 0), (1,))
norgate.addSample((0, 1), (0,))
norgate.addSample((1, 0), (0,))
norgate.addSample((1, 1), (0,))
```
Following is the dataset used to test −
```python
# Create a dataset to be used for testing
nortrain = SupervisedDataSet(2, 1)

# Add input and target values to the dataset
# (values from the NOR truth table)
nortrain.addSample((0, 0), (1,))
nortrain.addSample((0, 1), (0,))
nortrain.addSample((1, 0), (0,))
nortrain.addSample((1, 1), (0,))
```
The trainer is used as follows −
```python
# Train the network with the dataset norgate
trainer = BackpropTrainer(nn, norgate)

# Run the training loop 1000 times
for epoch in range(1000):
    trainer.train()
```
To test on the dataset, we can use the below code −
```python
trainer.testOnData(dataset=nortrain, verbose=True)
```
```
C:\pybrain\pybrain\src>python testnetwork.py
Testing on data:
('out:    ', '[0.887 ]')
('correct:', '[1     ]')
error:  0.00637334
('out:    ', '[0.149 ]')
('correct:', '[0     ]')
error:  0.01110338
('out:    ', '[0.102 ]')
('correct:', '[0     ]')
error:  0.00522736
('out:    ', '[-0.163]')
('correct:', '[0     ]')
error:  0.01328650
('All errors:', [0.006373344564625953, 0.01110338071737218, 0.005227359234093431, 0.01328649974219942])
('Average error:', 0.008997646064572746)
('Max error:', 0.01328649974219942, 'Median error:', 0.01110338071737218)
```
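The per-sample `error` values in this output appear to be half the squared difference between the network output and the target. We can verify that from the first row, keeping in mind that the printed outputs are rounded to three decimals:

```python
# First sample from the log: output 0.887 (rounded), target 1
out, target = 0.887, 1.0
error = 0.5 * (out - target) ** 2
print(error)  # close to the reported 0.00637334
```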
If you check the output, the predictions closely match the targets we provided, so the average error is only about 0.009.
Let us now change the test data and observe the average error. The targets have been changed as shown in the dataset below −
```python
# Create a dataset to be used for testing
nortrain = SupervisedDataSet(2, 1)

# Add input and target values to the dataset
# (targets deliberately changed from the NOR truth table)
nortrain.addSample((0, 0), (0,))
nortrain.addSample((0, 1), (1,))
nortrain.addSample((1, 0), (1,))
nortrain.addSample((1, 1), (0,))
```
Let us now test it.
```
C:\pybrain\pybrain\src>python testnetwork.py
Testing on data:
('out:    ', '[0.988 ]')
('correct:', '[0     ]')
error:  0.48842978
('out:    ', '[0.027 ]')
('correct:', '[1     ]')
error:  0.47382097
('out:    ', '[0.021 ]')
('correct:', '[1     ]')
error:  0.47876379
('out:    ', '[-0.04 ]')
('correct:', '[0     ]')
error:  0.00079160
('All errors:', [0.4884297811030845, 0.47382096780393873, 0.47876378995939756, 0.0007915982149002194])
('Average error:', 0.3604515342703303)
('Max error:', 0.4884297811030845, 'Median error:', 0.47876378995939756)
```
We now get an average error of about 0.36, which shows that the test data no longer matches what the network was trained on.