Extra-trees
Extra Trees (Extremely Randomized Trees) is an ensemble machine learning algorithm that is based on decision trees. It is similar to the Random Forest algorithm, but with a few key differences in the way the decision trees are constructed.
Like Random Forest, Extra Trees builds a collection of decision trees and combines their predictions, by majority vote for classification or by averaging for regression. Also like Random Forest, it considers only a random subset of the features as split candidates at each node, an idea related to the random subspace method. One difference lies in the training data: Random Forest usually trains each tree on a bootstrap sample, whereas Extra Trees, in its usual formulation, trains each tree on the whole training set.
The main difference is in how the split point is chosen. Random Forest searches each candidate feature for the threshold that most reduces an impurity measure such as the Gini index or entropy. Extra Trees instead draws a threshold at random for each candidate feature and then keeps the best of these random splits according to the same impurity measure.
Random feature subsets combined with random split thresholds make the individual trees more random and less correlated with one another than those in a Random Forest. Averaging less correlated trees reduces the variance of the ensemble, which tends to reduce overfitting, usually at the cost of a small increase in bias.
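As a rough illustration of randomized split selection at a single node, the sketch below assumes the common formulation in which a threshold is drawn uniformly between the minimum and maximum of each candidate feature, and the candidate with the largest Gini impurity decrease is kept. The helper names `gini` and `pick_random_split`, the parameter `k`, and the toy data are hypothetical and chosen only for this example; it is not a full tree builder.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def pick_random_split(X, y, k, rng):
    """Choose a split at one node in the Extra Trees style (illustrative only).

    X: list of rows (lists of floats), y: list of class labels,
    k: number of candidate features, rng: a random.Random instance.
    Returns (feature_index, threshold, impurity_decrease), or None if no
    candidate produces two non-empty children.
    """
    n = len(y)
    parent_impurity = gini(y)
    best = None
    # Random subset of k features, as in Random Forest.
    for f in rng.sample(range(len(X[0])), k):
        values = [row[f] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # constant feature, cannot be split
        # The threshold is drawn at random instead of being optimized.
        threshold = rng.uniform(lo, hi)
        left = [y[i] for i in range(n) if X[i][f] < threshold]
        right = [y[i] for i in range(n) if X[i][f] >= threshold]
        if not left or not right:
            continue
        child_impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent_impurity - child_impurity
        # Among the random candidates, keep the one with the largest gain.
        if best is None or gain > best[2]:
            best = (f, threshold, gain)
    return best


# Toy data, made up for this example.
rng = random.Random(0)
X = [[0.1, 5.0], [0.2, 3.0], [0.8, 4.0], [0.9, 6.0]]
y = [0, 0, 1, 1]
split = pick_random_split(X, y, k=2, rng=rng)
print(split)
```

The only step that differs from a Random Forest style search is the `rng.uniform` call: a greedy search would evaluate every possible threshold for each candidate feature rather than a single random one.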
The Extra Trees algorithm is computationally efficient, since it does not search for optimal split thresholds, and it is easy to use. It can handle high-dimensional data with many features, and it is particularly useful when the data is noisy or when the individual models have high variance and are prone to overfitting.
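For a sense of what using the algorithm can look like in practice, here is a minimal sketch that assumes scikit-learn is installed and uses its `ExtraTreesClassifier` alongside `RandomForestClassifier`. The synthetic dataset, the number of trees, and the other parameter values are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data; the sizes are assumptions made only for this example.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Both ensembles consider a random subset of features at each split.
extra_trees = ExtraTreesClassifier(
    n_estimators=50, max_features="sqrt", random_state=0
)
random_forest = RandomForestClassifier(
    n_estimators=50, max_features="sqrt", random_state=0
)

# 5-fold cross-validated accuracy for each model.
et_scores = cross_val_score(extra_trees, X, y, cv=5)
rf_scores = cross_val_score(random_forest, X, y, cv=5)

print("Extra Trees mean accuracy:  ", et_scores.mean())
print("Random Forest mean accuracy:", rf_scores.mean())
```

In scikit-learn, `ExtraTreesClassifier` defaults to `bootstrap=False`, so each tree sees the whole training set, while `RandomForestClassifier` defaults to `bootstrap=True`. Which model scores higher depends on the data, so it is worth comparing both on the problem at hand.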
Updated on: 27/01/2023