Test Set

What Does Test Set Mean?

A test set in machine learning is a secondary (or tertiary) data set that is used to test a machine learning program after it has been trained on an initial training data set. The idea is that predictive models always have some sort of unknown capacity that needs to be tested out, as opposed to analyzed from a programming perspective.


A test set is also known as a test data set or test data.

Techopedia Explains Test Set

Many experts would say that a best practice is to have a test data set that is “sequestered” or kept to the end of the process. Engineers look for overfitting of the model and other issues in the training process. Ideally, there is a third set, a validation data set, that tests the classifier parameters. Then, and only then, the test set can be brought out to see how well the program was trained and whether its predictive model is accurate on new data. Although some models may avoid creating a partitioned test set altogether, this is often seen as shortsighted, because a lack of practical testing can leave a program prone to inaccuracy.


Related Terms

Margaret Rouse

Margaret is an award-winning technical writer and teacher known for her ability to explain complex technical subjects to a non-technical business audience. Over the past twenty years, her IT definitions have been published by Que in an encyclopedia of technology terms and cited in articles by the New York Times, Time Magazine, USA Today, ZDNet, PC Magazine, and Discovery Magazine. She joined Techopedia in 2011. Margaret's idea of a fun day is helping IT and business professionals learn to speak each other’s highly specialized languages.