Discussion:
Early stopping
Daniel Scott
2007-03-29 11:32:05 UTC
Hi,

I'm trying to implement early stopping using the FANN library in C++.
From reading the mailing list, I believe that this can be accomplished
by monitoring the MSE of a validation dataset in the print_callback
function.

I have pointed the user_data pointer at a struct which contains a
FANN::training_data object holding the validation dataset. Inside the
print_callback function, 'data' is that user_data parameter, and I
evaluate the MSE of the validation dataset with the following code:

cout << "Validation dataset MSE:" << net.test_data(data->validation);

Is this correct? Does this code alter the weights of the ANN in any way?

When I run my code, the MSE of the validation dataset remains fairly
constant throughout training and does not behave as I would expect
(an initial decrease as the ANN learns the general input-output
relationships, followed by an increase as it begins to memorise the
training dataset).

Am I going about this the right way?

Does anyone have a suggestion for some test data I can use to verify
that the process is working correctly? I have tried training an ANN
to learn a sum function (the output is simply the sum of the inputs);
is this a good test?

Thanks for your time,

Dan Scott

Adrian Spilca
2007-03-29 12:53:16 UTC
I suspect the problem you are trying to solve is too easy for your ANN,
and the network learns to generate the correct output within a few
epochs. The MSE on the validation set is not going up because the MSE
on the training set is probably not going down either: the network has
nothing left to learn, since it has effectively memorised the training
set already.

Why don't you pick one of the problems from the FANN benchmarks (there
are some nice training sets there) and split its data into, say, 70%
training, 10% validation and 20% test?
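
Something along these lines would do the split. This is only a rough
sketch using the plain C calls underneath the C++ wrapper (check that
your FANN version has fann_subset_train_data), and the file name and
exact percentages are just examples:

#include <iostream>
#include "floatfann.h"

int main()
{
    // "mushroom.train" is just an example; any of the benchmark files will do.
    struct fann_train_data *all = fann_read_train_from_file("mushroom.train");
    if (all == NULL)
        return 1;

    fann_shuffle_train_data(all);                    // randomise pattern order first
    unsigned int n = fann_length_train_data(all);
    unsigned int n_train = (unsigned int)(0.7 * n);  // 70% training
    unsigned int n_valid = (unsigned int)(0.1 * n);  // 10% validation
    unsigned int n_test  = n - n_train - n_valid;    // remaining ~20% test

    struct fann_train_data *train_set = fann_subset_train_data(all, 0, n_train);
    struct fann_train_data *valid_set = fann_subset_train_data(all, n_train, n_valid);
    struct fann_train_data *test_set  = fann_subset_train_data(all, n_train + n_valid, n_test);

    std::cout << "train: " << fann_length_train_data(train_set)
              << "  validation: " << fann_length_train_data(valid_set)
              << "  test: " << fann_length_train_data(test_set) << std::endl;

    // ... train on train_set, monitor valid_set in the callback,
    //     and report the final error on test_set ...

    fann_destroy_train(train_set);
    fann_destroy_train(valid_set);
    fann_destroy_train(test_set);
    fann_destroy_train(all);
    return 0;
}

Shuffling before taking the subsets matters if the file happens to be
ordered by class or output value; otherwise the validation and test
sets may not be representative.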

It is important that the validation set is not identical to (or a
subset of) the training set. Also, I'm not sure early stopping is
useful if the problem can be solved to 100% accuracy by the ANN.

Hope this helps,
Adrian
