Modeling with Trees

Asher Khan
1 min read · Jan 4, 2021

This blog is a little bit about my third project as a Data Science student. For this project, we explored different tree-based predictive models: regression trees, decision trees, random forests, and bagged trees.

The model I chose as my final model was a random forest, for a couple of reasons. First, the data set I was dealing with was heavily imbalanced, with a 14:86 class ratio.

A random forest tends to be less affected by an imbalanced data set than many other models, which was one of the main reasons I chose it.

In short, a random forest works by building many decision trees and merging their predictions to get a more accurate and stable result.
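To make the "build many trees and merge their votes" idea concrete, here is a minimal sketch that bags decision trees by hand with scikit-learn and NumPy. The toy data set and all parameters are made up for illustration, not taken from my project, and strictly speaking this is plain bagging; the random-feature twist that turns it into a random forest comes next.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy, imbalanced data standing in for the project's data set (hypothetical).
X, y = make_classification(n_samples=1000, weights=[0.86, 0.14], random_state=0)

rng = np.random.default_rng(0)
trees = []
for _ in range(100):
    # Bootstrap sample: draw rows with replacement, then fit one decision tree on them.
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier(max_depth=5).fit(X[idx], y[idx]))

# "Merge" the trees: a simple majority vote across the 100 predictions.
votes = np.stack([tree.predict(X) for tree in trees])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
```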

Also, instead of searching for the best feature among all features when splitting a node, each tree searches for the best feature within a random subset of features. This added diversity among the trees generally results in a better model.
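In scikit-learn terms, the number of trees and the size of the random feature subset correspond to the n_estimators and max_features parameters of RandomForestClassifier. The sketch below is a hedged illustration on the same toy data rather than my project's actual code, and the class_weight="balanced" option is just one assumed way to lean against a 14:86 imbalance, not necessarily what I used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy, imbalanced data again standing in for the project's data set (hypothetical).
X, y = make_classification(n_samples=1000, weights=[0.86, 0.14], random_state=0)

# A stratified split keeps the 14:86 class ratio in both the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(
    n_estimators=500,         # how many decision trees to build and merge
    max_features="sqrt",      # each split considers only a random subset of features
    class_weight="balanced",  # assumed option to reweight the minority class
    random_state=0,
)
rf.fit(X_train, y_train)
print(rf.score(X_test, y_test))
```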

For these reasons, I felt a random forest would be the most suitable model in my case.
