Machine Learning

Software Engineering for Data Scientists (New book!)

Very excited to announce that the early-access preview (MEAP) of my upcoming book, Software Engineering for Data Scientists, is available now!…

1 year ago

How to plot XGBoost trees in R

In this post, we're going to cover how to plot XGBoost trees in R. XGBoost is a very popular machine…

3 years ago

How is information gain calculated?

This post will explore the mathematics behind information gain. We'll start with the base intuition behind information gain, but then…

4 years ago

Evaluate your R model with MLmetrics

This post will explore using R's MLmetrics to evaluate machine learning models. MLmetrics provides several functions to calculate common metrics…

4 years ago

How to get an AUC confidence interval

Background: AUC is an important metric in machine learning for classification. It is often used as a measure of a…

5 years ago

How to build a logistic regression model from scratch in R

Background: In a previous post, we showed how using vectorization in R can vastly speed up fuzzy matching. Here, we…

6 years ago

ICA on Images with Python

What is Independent Component Analysis (ICA)? If you're already familiar with ICA,…

6 years ago