Section 10: Random Forests & Boosted Trees

Updated October 1, 2021

Sections 9 and 10 cover tree-based methods. There are three main methods:

Decision Trees
Random Forests
Boosted Trees

Each of these methods stems from the basic decision tree algorithm. Fundamentally, tree-based methods rely on the ability to split data based on information from features. This requires a mathematical definition of information and the ability to measure it.
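To make "measuring information" concrete, here is a minimal sketch in plain Python. It assumes Gini impurity as the measure, which is the criterion usually associated with CART; the post itself does not say which measure is used, and entropy would work the same way. The feature values, labels and function names below are hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum(p_k ** 2) over class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Try midpoints between sorted unique feature values.

    Returns (threshold, impurity) for the split with the lowest weighted
    impurity, or (None, parent impurity) if no split improves on the parent.
    """
    candidates = sorted(set(values))
    best = (None, gini_impurity(labels))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        impurity = split_impurity(left, right)
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical toy data: one numeric feature and a two-class label.
petal_length = [1.4, 1.3, 1.5, 4.7, 4.5, 4.9]
species = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

parent = gini_impurity(species)
threshold, child = best_threshold(petal_length, species)
print(f"parent impurity = {parent:.3f}")
print(f"best split: petal_length <= {threshold:.2f}, child impurity = {child:.3f}")
```

A split is useful when the weighted impurity of the children is lower than the impurity of the parent. A decision tree repeats this search over every feature at every node, which is the step that random forests and boosted trees build on.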

The Classification and Regression Trees (CART) algorithm introduces many concepts:

Cross validation of Trees
Pruning Trees
Surrogate Splits
Variable Importance Scores
Search for Linear Splits

References:

An Introduction to Statistical Learning (Download free pdf)
Jose Portilla's 2021 Python for Machine Learning & Data Science Masterclass

#machine-learning


Machine Learning

Part 15 of 20

The aim of the series is to consolidate a foundational understanding of Machine Learning.


Nur Fadhilah
