Discussion:
Cross Validation
Chris Spencer
2007-03-09 14:35:59 UTC
Permalink
Is there any way to perform cross-validation during training with
either the backprop or cascade correlation methods? I don't see any
explicit methods for this in the docs. I see the short paragraph
"Avoid Over-Fitting" on the advanced usage page, but it doesn't
actually describe how to do this.

Regards,
Chris

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
Josh Menke
2007-03-09 15:18:40 UTC
Permalink
It's not built in to fann, but you can do it yourself. With ANNs I don't
usually use cross-validation, but instead just a single validation set. This
is mostly due to the size of the sets I use though (millions).
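
For anyone who does want k-fold cross-validation, the splitting step can be done outside the library. What follows is only a minimal sketch: the helper name `k_fold_indices` is made up for illustration and is not part of fann. It assumes the samples have already been shuffled, and it only produces index lists; building the actual training and validation sets from those indices is left to whatever data-handling code you use.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs.

    Hypothetical helper, not a fann function. Folds are contiguous, so
    shuffle the data beforehand if its order is meaningful.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    folds = []
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        training = list(range(0, start)) + list(range(stop, n_samples))
        folds.append((training, validation))
        start = stop
    return folds
```

Each sample lands in exactly one validation fold, so training one net per fold and averaging the validation metric gives the cross-validated estimate. With millions of samples, a single held-out set is usually much cheaper, which is the trade-off described above.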

--Josh
_______________________________________________
Fann-general mailing list
https://lists.sourceforge.net/lists/listinfo/fann-general
--
Joshua Menke
Machine Learning Scientist
Trust and Safety Applied Research
ebay, Inc
josh-***@public.gmane.org
Chris Spencer
2007-03-09 15:46:40 UTC
Permalink
But how do you do it? Do you just call train_on_data until test_data
gives a small enough MSE for your validation set?

Regards,
Chris
Josh Menke
2007-03-09 15:59:46 UTC
Permalink
The basic idea is that you train for one pass (epoch) over your training
data, and then test on your validation set. Then you monitor whichever
performance metric you care about: MSE, accuracy, cross-entropy, precision,
recall, or preferably cost if you can.

If fann doesn't support your performance metric, then you need to write
either a test callback or your own test_on_data function.

Then, you stop training when your performance stops improving. How you
define that is also up to you. I do something like this: if there has been
no improvement in 100 epochs, I stop, and if several of the best nets I've
seen are tied, I pick one of them at random.

I've wanted to try something like a sign test to measure statistical
significance between epochs, but then I got into Bayesian methods where you
don't need a validation set.
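
The stopping rule described above can be sketched as a small loop. This is a hedged illustration rather than fann code: `train_with_early_stopping` and its three callables (`train_epoch`, `validate`, `snapshot`) are hypothetical names. With fann, `train_epoch` would presumably wrap fann_train_epoch and `validate` would wrap fann_test_data or your own metric; that mapping is an assumption. Lower scores are treated as better, as with MSE.

```python
import random
from typing import Callable, Optional, Tuple


def train_with_early_stopping(
    train_epoch: Callable[[], None],   # runs one pass over the training data
    validate: Callable[[], float],     # returns the validation score (lower is better)
    snapshot: Callable[[], object],    # returns a saved copy of the current net
    patience: int = 100,
    max_epochs: int = 10000,
    rng: Optional[random.Random] = None,
) -> Tuple[object, float, int]:
    """Train until `patience` epochs pass without a new best validation score.

    Returns (chosen_net, best_score, epochs_run). Hypothetical sketch.
    """
    rng = rng or random.Random()
    best_score = float("inf")
    best_nets = []          # every snapshot that achieved best_score
    stale_epochs = 0        # epochs since the last strict improvement
    epochs_run = 0

    for _ in range(max_epochs):
        train_epoch()
        epochs_run += 1
        score = validate()

        if score < best_score:
            # Strict improvement: restart the patience counter.
            best_score = score
            best_nets = [snapshot()]
            stale_epochs = 0
        else:
            if score == best_score:
                # A tie is remembered but does not count as improvement.
                best_nets.append(snapshot())
            stale_epochs += 1

        if stale_epochs >= patience:
            break

    # Break ties between equally good nets at random.
    return rng.choice(best_nets), best_score, epochs_run
```

Keeping a snapshot at every new best matters because the net at the moment training stops is, by construction, at least `patience` epochs past its best validation score.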

--Josh
Vincenzo Di Massa
2007-03-09 18:40:08 UTC
Permalink
You can just use FANNTrainer... it does cross-validation.
Just look at the archive.

Vincenzo