What Does Decision Tree Mean?
A decision tree is a flowchart-like representation of data that graphically resembles a tree that has been drawn upside down. In this analogy, the root of the tree is a decision that has to be made, the tree’s branches are actions that can be taken and the tree’s leaves are potential decision outcomes.
The purpose of a decision tree is to partition a large dataset into subsets that contain instances with similar values in order to understand the likely outcomes of specific options.
In machine learning (ML), decision trees are used in supervised learning (SL) to predict the class or value of a target variable. Classification algorithms use labeled training data to predict which discrete category, or class, a data instance belongs to. In contrast, regression algorithms, also called continuous algorithms, use training data to predict a continuous numeric value for a data instance.
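The distinction can be sketched with two toy functions (the data and thresholds here are invented for illustration): a classifier returns a discrete label, while a regressor returns a continuous number.

```python
# Hypothetical example: the same feature (hours studied) feeds a
# classifier that outputs a discrete label and a regressor that
# outputs a continuous numeric value.

def classify_pass_fail(hours_studied):
    # Classification: predict a discrete class label.
    return "pass" if hours_studied >= 4 else "fail"

def predict_score(hours_studied):
    # Regression: predict a continuous numeric value.
    return min(100.0, 50.0 + 10.0 * hours_studied)

print(classify_pass_fail(6))  # a discrete label: "pass"
print(predict_score(3))       # a continuous value: 80.0
```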
Decision trees are also referred to as CART trees, which is short for classification and regression trees.
Techopedia Explains Decision Tree
Decision trees are a popular and powerful tool used for classification and prediction purposes.
Decision trees can be either categorical or continuous (regressive). A categorical decision tree predicts a discrete outcome, such as a class label, while a continuous decision tree predicts a numeric value. The accuracy of decision trees can be increased by combining the results of a collection of decision trees, an ensemble approach used, for example, in random forests.
How Decision Trees Work
Decision trees are constructed by analyzing a set of labeled training examples; the resulting tree is then applied to previously unseen examples. When decision trees are trained with high-quality data, they can make very accurate predictions.
Visually, a decision tree begins with a decision node that forms the root of the tree. Branches (called edges) lead from it to additional decision nodes. Each decision node tests an attribute of a new data point, and the edges direct the data point to the next decision node and, eventually, to the final outcome, which is represented by a leaf.
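This structure can be sketched in a few lines of code. The tree below is a hypothetical example (the features and outcomes are invented): decision nodes hold a question, edges route to child nodes, and leaves hold the final outcomes.

```python
# A toy decision tree: each decision node asks a question, edges ("yes"/"no")
# point to child nodes, and leaves hold final outcomes.
tree = {
    "question": ("outlook", "sunny"),          # root decision node
    "yes": {
        "question": ("humidity_high", True),   # internal decision node
        "yes": {"leaf": "stay indoors"},
        "no": {"leaf": "go hiking"},
    },
    "no": {"leaf": "stay indoors"},
}

def predict(node, example):
    # Follow edges from the root until a leaf (final outcome) is reached.
    while "leaf" not in node:
        feature, value = node["question"]
        node = node["yes"] if example[feature] == value else node["no"]
    return node["leaf"]

print(predict(tree, {"outlook": "sunny", "humidity_high": False}))  # go hiking
```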
Classification Decision Trees
Each question in a classification tree is contained in a parent node, and each parent node points to a child node for each possible answer to its question. This type of decision tree essentially forms a hierarchy of questions with binary answers (yes/no; true/false).
Regression Decision Trees
Regression trees seek to determine the relationship between a single dependent variable and a series of independent variables. Each split partitions the data set on one of those independent variables, which means that a regression decision tree's final prediction reflects the combined effect of multiple variables.
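A minimal sketch, using invented housing data, shows the core idea: a split on an independent variable partitions the training set, and each resulting leaf predicts the mean target value of the examples that reached it.

```python
# Hypothetical (square_metres, price) training pairs.
data = [(40, 150.0), (55, 180.0), (90, 310.0), (120, 390.0)]
threshold = 70  # split point on the independent variable (size)

# Partition the training data at the split.
left = [price for size, price in data if size <= threshold]
right = [price for size, price in data if size > threshold]

# Each leaf predicts the mean target value of its training examples.
left_mean = sum(left) / len(left)
right_mean = sum(right) / len(right)

def predict_price(size):
    return left_mean if size <= threshold else right_mean

print(predict_price(50))   # 165.0 (mean of the smaller homes)
print(predict_price(100))  # 350.0 (mean of the larger homes)
```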
Decision Tree Pruning
Decision tree algorithms add decision nodes incrementally, using labeled training examples to guide the choice of new decision nodes.
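One common way the labeled examples guide this choice is impurity minimization: candidate splits are scored, and the one producing the purest child nodes becomes the next decision node. The sketch below uses Gini impurity with invented data and thresholds.

```python
def gini(labels):
    # Gini impurity of a set of class labels (0.0 = perfectly pure).
    if not labels:
        return 0.0
    total = len(labels)
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

def split_impurity(examples, threshold):
    # Weighted impurity of the children after splitting on feature <= threshold.
    left = [label for x, label in examples if x <= threshold]
    right = [label for x, label in examples if x > threshold]
    n = len(examples)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

# Hypothetical labeled training examples: (feature value, class label).
examples = [(1, "A"), (2, "A"), (3, "B"), (4, "B")]

# Pick the candidate split with the lowest weighted impurity.
best = min([1, 2, 3], key=lambda t: split_impurity(examples, t))
print(best)  # 2 -- the threshold that separates the classes perfectly
```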
Pruning is an important step that involves identifying and removing branches that add complexity without improving predictive power. The goal of pruning is to prevent the tree from overfitting its training data by giving too much weight to noise and outliers.
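One simple pruning strategy is reduced-error pruning, sketched below with an invented tree and data set: working bottom-up, a subtree is collapsed into a leaf whenever doing so does not hurt accuracy on held-out validation data.

```python
def predict(node, x):
    # Walk from the root to a leaf by answering each node's threshold test.
    if "leaf" in node:
        return node["leaf"]
    child = "yes" if x <= node["threshold"] else "no"
    return predict(node[child], x)

def accuracy(node, examples):
    return sum(predict(node, x) == y for x, y in examples) / len(examples)

def prune(node, validation, root):
    # Bottom-up: prune the children first, then try collapsing this node.
    if "leaf" in node:
        return
    prune(node["yes"], validation, root)
    prune(node["no"], validation, root)
    before = accuracy(root, validation)
    saved = dict(node)
    node.clear()
    node["leaf"] = saved["majority"]  # tentatively collapse to a leaf
    if accuracy(root, validation) < before:
        node.clear()
        node.update(saved)  # the subtree earned its keep; restore it

# A hypothetical overfit tree: the inner split on <= 2 memorized noise.
tree = {
    "threshold": 5, "majority": "A",
    "yes": {"threshold": 2, "majority": "A",
            "yes": {"leaf": "A"}, "no": {"leaf": "B"}},
    "no": {"leaf": "B"},
}
validation = [(1, "A"), (3, "A"), (4, "A"), (7, "B")]

print(accuracy(tree, validation))  # 0.5 before pruning
prune(tree, validation, tree)
print(accuracy(tree, validation))  # 1.0 after the noisy split is collapsed
```

The root split survives because collapsing it would lower validation accuracy, while the noisy inner split is replaced by a single leaf.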