How to get an AUC confidence interval

Background

AUC (area under the ROC curve) is one of the most widely used metrics for evaluating classification models. It ranges from 0 to 1 and measures how well a model rank-orders its predictions: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. For a detailed explanation of AUC, see this link.
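To make the rank-ordering idea concrete, here is a minimal sketch in base R that computes AUC by comparing every positive score against every negative score. The labels and scores are made-up toy values, not taken from the movies data used later.

```r
# Toy data (made up for illustration): 1 = positive, 0 = negative
labels <- c(1, 1, 1, 0, 0, 0, 0)
scores <- c(0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1)

pos <- scores[labels == 1]
neg <- scores[labels == 0]

# Compare every positive score with every negative score
wins <- outer(pos, neg, ">")
ties <- outer(pos, neg, "==")

# AUC = share of (positive, negative) pairs ranked correctly; ties count half
auc_by_hand <- mean(wins + 0.5 * ties)
auc_by_hand
# 11 of the 12 pairs are ordered correctly, so this is 11/12 (about 0.917)
```

Calling pROC's auc function on the same two vectors should give the same value, since it measures the same quantity.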

Since AUC is widely used, a confidence interval around it is valuable both for demonstrating a model’s performance and for comparing two or more models. For example, if model A has a higher AUC than model B, but the 95% confidence intervals around the two AUC values overlap, then the models may not be statistically different in performance. We can get a confidence interval around AUC using R’s pROC package, which computes the interval with DeLong’s method by default and also supports bootstrapping.
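Checking whether two intervals overlap is only a rough guide. For a more direct comparison, pROC also provides roc.test, which can run DeLong's test on two ROC curves built from the same observations. The sketch below uses simulated labels and scores (score_a and score_b are hypothetical stand-ins for two models' predictions), so the numbers themselves carry no meaning.

```r
library(pROC)

set.seed(1)
n <- 500
y <- rbinom(n, 1, 0.5)

# Hypothetical scores from two models evaluated on the same test set
score_a <- y + rnorm(n, sd = 1.0)
score_b <- y + rnorm(n, sd = 1.2)

roc_a <- roc(y, score_a, levels = c(0, 1), direction = "<", quiet = TRUE)
roc_b <- roc(y, score_b, levels = c(0, 1), direction = "<", quiet = TRUE)

# Confidence interval for each model's AUC
ci.auc(roc_a)
ci.auc(roc_b)

# DeLong's test for two correlated (paired) ROC curves
cmp <- roc.test(roc_a, roc_b, method = "delong", paired = TRUE)
cmp$p.value
```

A small p-value here would suggest the two AUC values really differ; a large one is consistent with the "may not be statistically different" case described above.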

Building a simple model to test

To demonstrate how to get an AUC confidence interval, let’s build a model using a movies dataset from Kaggle (you can get the data here).

Reading in the data


# load packages
library(pROC)
library(dplyr)
library(randomForest)

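# read in dataset (stringsAsFactors = TRUE keeps content_rating a factor,
# which was the read.csv default before R 4.0)
movies <- read.csv("movie_metadata.csv", stringsAsFactors = TRUE)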

# remove records with missing budget / gross data
movies <- movies %>% filter(!is.na(budget) & !is.na(gross))

Split into train / test

Next, let’s randomly select 70% of the records to be in the training set and leave the rest for testing.


# get random sample of rows
set.seed(0)
train_rows <- sample(1:nrow(movies), .7 * nrow(movies))

# split data into train / test
train_data <- movies[train_rows,]
test_data <- movies[-train_rows,]

# select only fields we need
train_need <- train_data %>% select(gross, duration, director_facebook_likes, budget, imdb_score, content_rating, movie_title)
test_need <- test_data %>% select(gross, duration, director_facebook_likes, budget, imdb_score, content_rating, movie_title)

Create the label

Lastly, we need to create our label, i.e., what we’re trying to predict. Here, we’re going to predict whether a movie’s gross beats its budget (1 if so, 0 if not).


train_need$beat_budget <- as.factor(ifelse(train_need$gross > train_need$budget, 1, 0))
test_need$beat_budget <- as.factor(ifelse(test_need$gross > test_need$budget, 1, 0))

Train a random forest

Now, let’s train a simple random forest model with just 50 trees.

# drop training rows with missing values, which randomForest does not accept
train_need <- train_need[complete.cases(train_need),]

# train a random forest
forest <- randomForest(beat_budget ~ duration + director_facebook_likes + budget + imdb_score + content_rating,
                       train_need, ntree = 50)

Getting an AUC confidence interval

Next, let’s use our model to get predictions on the test set.


test_pred <- predict(forest, test_need, type = "prob")[,2]

And now, we’re ready to get our confidence interval! We can do that in just one line of code using the ci.auc function from pROC. By default, this function computes a 95% confidence interval using DeLong’s method, as the "(DeLong)" label in the output indicates; a bootstrap interval (2000 resamples by default) is available by setting method = "bootstrap". Here, the 95% confidence interval for the AUC on the test set runs from 0.6198 to 0.6822, as can be seen below.


ci.auc(test_need$beat_budget, test_pred) 
# 95% CI: 0.6198-0.6822 (DeLong)
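The "(DeLong)" label in the output shows which method produced the interval. To get a bootstrap interval instead, ci.auc accepts method = "bootstrap" along with boot.n for the number of resamples. The sketch below runs both on simulated data; labels and preds are stand-ins for test_need$beat_budget and test_pred, so the resulting numbers will not match the ones above.

```r
library(pROC)

set.seed(42)
# Simulated stand-ins for test_need$beat_budget and test_pred
labels <- rbinom(300, 1, 0.5)
preds  <- plogis(labels + rnorm(300))

roc_obj <- roc(labels, preds, levels = c(0, 1), direction = "<", quiet = TRUE)

# Default method: DeLong
ci_delong <- ci.auc(roc_obj)

# Bootstrap with 2000 resamples (boot.n = 2000 is also the default)
ci_boot <- ci.auc(roc_obj, method = "bootstrap", boot.n = 2000)

ci_delong
ci_boot
```

On a test set of reasonable size the two intervals are usually close; the bootstrap version simply takes longer to run.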

We can adjust the confidence interval using the conf.level parameter:


ci.auc(test_need$beat_budget, test_pred, conf.level = 0.9) 
# 90% CI: 0.6248-0.6772 (DeLong)
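To see why a lower confidence level gives a narrower interval, here is a hand-rolled percentile bootstrap in base R. It is a sketch for intuition only, not pROC's implementation: auc_rank and boot_aucs are hypothetical helper names, and the data are simulated.

```r
# AUC via the rank (Mann-Whitney) formula; tied scores get average ranks
auc_rank <- function(labels, scores) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Stratified bootstrap: resample positives and negatives separately
boot_aucs <- function(labels, scores, n_boot = 2000) {
  pos <- which(labels == 1)
  neg <- which(labels == 0)
  replicate(n_boot, {
    idx <- c(pos[sample.int(length(pos), replace = TRUE)],
             neg[sample.int(length(neg), replace = TRUE)])
    auc_rank(labels[idx], scores[idx])
  })
}

# Turn a confidence level into percentile cutoffs of the bootstrap AUCs
percentile_ci <- function(aucs, conf.level = 0.95) {
  alpha <- 1 - conf.level
  quantile(aucs, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(123)
labels <- rbinom(400, 1, 0.5)   # simulated labels
scores <- labels + rnorm(400)   # simulated model scores

aucs <- boot_aucs(labels, scores)
ci95 <- percentile_ci(aucs, conf.level = 0.95)
ci90 <- percentile_ci(aucs, conf.level = 0.90)
ci95
ci90
```

Because both intervals are cut from the same set of bootstrap AUCs, the 90% interval always sits inside the 95% one: it drops 5% from each tail instead of 2.5%.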

That’s it for this post!

See here to learn more about the pROC package.

Andrew Treadway


Tags: data science, machine learning, R

5 years ago
