3 recommended books on learning R

R
I sometimes get asked how I got started learning R. In this post I'll go through a few books I read along the way that I found highly useful.

The Art of R Programming

The Art of R Programming: A Tour of Statistical Software Design is one of the first R books I read. A look at its table of contents shows that it doesn't cover much data science-related content. However, the book is great at covering the main data structures you need to actually program in R. You'll learn the ins and outs of vectors, data frames, matrices, lists, and so on. Another point I like about the book is that it's good at explaining the primary structures that you need to…
How is information gain calculated?

Machine Learning, R
This post explores the mathematics behind information gain. We'll start with the basic intuition behind information gain, then explain why it is calculated the way it is.

What is information gain?

Information gain is a measure frequently used in decision trees to determine which variable to split the input dataset on at each step in the tree. Before formally defining this measure, we first need to understand the concept of entropy. Entropy measures the amount of information or uncertainty in a variable's possible values.

How to calculate entropy

The entropy of a random variable X is given by the following formula:

H(X) = -Σi[p(xi) * log2(p(xi))]

Here, xi is the ith possible value of X, and p(xi) is the probability of that value. Why…
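As a rough illustration of the entropy formula above, here is a minimal Python sketch. The function name `entropy` and its `values` argument are hypothetical, and estimating each probability from the observed frequencies in a list is an assumption made for this example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    # p * log2(1 / p) is the same quantity as -p * log2(p) in the formula
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A 50/50 split is maximally uncertain for two outcomes
print(entropy(["yes", "no", "yes", "no"]))  # 1.0

# A variable with a single possible value carries no uncertainty
print(entropy(["yes", "yes", "yes", "yes"]))  # 0.0
```

The two calls show the extremes for a binary variable: an even split gives one bit of entropy, and a constant column gives zero.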
Handling dates with Python’s maya package

Python
Background

In this post we'll discuss Python's maya package for parsing dates from strings. A previous article talked about the dateutil and dateparser libraries for finding dates in strings. maya is well suited to standardizing variations in a field or list of dates. maya can be installed using pip:

pip install maya

Standardizing dates with maya

Let's start with a basic example. First, we just need to import maya. Next, we'll use its parse method to convert the text into a MayaDT object. We can append the datetime method to this to get a datetime from the string.

[code lang="python"]
import maya

maya.parse("march 1 2019").datetime()
[/code]

[code lang="python"]
maya.parse("9th of february 2019").datetime()
[/code]

[code lang="python"]
maya.parse("1/1/2020").datetime()
[/code]

Below are several more examples of different date variations.

[code lang="python"]
maya.parse("1-1-2020").datetime()
maya.parse("1…