How to hide a password in R with the keyring package

R
This post will introduce using the keyring package to hide a password.

Short background

The keyring package is a library designed to let you access your operating system's credential store. In essence, it lets you store and retrieve passwords in your operating system, which allows you to avoid having a password in plaintext in an R script.

Storing a password

Storing a password with keyring is really straightforward. First, we just need to load the keyring package. Then we call a function called key_set_with_value. In this function, we'll input three different parameters - service, username and password.

[code lang="R"]
# load keyring package
library(keyring)

# Store email username with password
key_set_with_value(service = "user_email", username = "your_address@example.com", password = "test password")
[/code]

The username and password stored are just that -…
Read More
Does “Sell in May, Go Away” really work?

R
If you follow the stock market, you've probably heard the expression "Sell in May, Go Away." This expression refers to the belief that the stock market rises between the end of October and the end of April, so one should sell at the beginning of May to avoid losses. The general recommendation according to the theory is to hold money in a money market account during the "short period" of May through October, and then reinvest in the stock market in November. But how does this myth hold up in reality? Let's use R to find out! Our analysis will look strictly at the S&P 500 performance during the years 1970 to the present (so we won't dive into interest rate levels, money market accounts, etc.).

Getting started…
Read More
Four ways to reverse a string in R

R
R offers several ways to reverse a string, including some base R options. We go through a few of those in this post. We'll also compare the computational time for each method. Reversing a string can be especially useful in bioinformatics (e.g. finding the reverse complement of a DNA strand). To get started, let's generate a random string of 10 million DNA bases (we can do this with the stringi package as well, but for our purposes here, let's just use base R functions).

[code lang="R"]
set.seed(1)
dna <- paste(sample(c("A", "T", "C", "G"), 10000000, replace = TRUE), collapse = "")
[/code]

1) Base R with strsplit and paste

One way to reverse a string is to use strsplit with paste. This is the slowest method that will be shown, but…
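The excerpt stops before showing this method, so here is a minimal sketch of the strsplit-and-paste idea. The helper name reverse_string and the short input string are illustrative assumptions, not taken from the post: split the string into single characters, reverse their order, and paste them back together.

```r
# Hypothetical helper: reverse a single string using base R only
reverse_string <- function(x) {
  chars <- strsplit(x, split = "")[[1]]   # split into single characters
  paste(rev(chars), collapse = "")        # reverse the order and rejoin
}

# Small illustrative input (not the 10-million-base string)
short_dna <- "ATCGGA"
reversed <- reverse_string(short_dna)
print(reversed)
```

The same function can be called on the longer dna string generated earlier; only the run time changes.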
Read More

Don’t forget the “utils” package in R

R
With thousands of powerful packages, it's easy to overlook the libraries that come preinstalled with R. Thus, this post will talk about some of the cool functions in the utils package, which comes with a standard installation of R. While utils comes with several familiar functions, like read.csv, write.csv, and help, it also contains over 200 other functions.

readClipboard and writeClipboard

One of my favorite pairs of functions from utils is readClipboard and writeClipboard. If you're doing some manipulation to get a quick answer between R and Excel, these functions can come in handy. readClipboard reads in whatever is currently on the Clipboard. For example, let's copy a column of cells from Excel. We can now run readClipboard() in R. The result of running this command is a vector…
Read More
Speed Test: Sapply vs. Vectorization

R
The apply functions in R are awesome (see this post for some lesser known apply functions). However, if you can use pure vectorization, then you'll probably end up making your code run a lot faster than just depending upon functions like sapply and lapply. This is because apply functions like these still rely on looping through elements in a vector or list behind the scenes - one at a time, calling an R function for each element. Vectorization, on the other hand, moves that loop into compiled code under the hood - allowing much faster computation. This post runs through a couple of such examples involving string substitution and fuzzy matching.

String substitution

For example, let's create a vector that looks like this: test1, test2, test3, test4, ..., test1000000 with one million elements. With sapply, the code to create this would…
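The post's own code is cut off here, so the following is only a sketch of how the two approaches could be written. The variable names are illustrative assumptions; no timings are claimed, since they depend on the machine.

```r
n <- 1000000

# sapply: the anonymous function is called once per element
time_sapply <- system.time(
  via_sapply <- sapply(1:n, function(i) paste0("test", i))
)

# vectorized: paste0 handles the whole index vector in a single call
time_vector <- system.time(
  via_vector <- paste0("test", 1:n)
)

# Both approaches should build the same character vector
same_result <- identical(via_sapply, via_vector)
print(same_result)
print(time_sapply)
print(time_vector)
```

Comparing the two system.time results on your own machine shows how much the per-element function calls cost.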
Read More
Creating a word cloud on R-bloggers posts

R, Web Scraping
This post will go through how to create a word cloud of article titles scraped from the awesome R-bloggers. Our goal will be to use R's rvest package to search through 50 successive pages on the site for article titles. The stringr and tm packages will be used for string cleaning and for creating a term-document matrix (with tm). We will then create a word cloud based on the words comprising these titles. First, we'll load the packages we need.

[code lang="R"]
# load packages
library(rvest)
library(stringr)
library(tm)
library(wordcloud)
[/code]

Let's write a function that will take a webpage as input and return all the scraped article titles.

[code lang="R"]
scrape_post_titles <- function(site) {
  # scrape HTML from input site
  source_html <- read_html(site)
  # grab the title attributes…
Read More

How to change a file’s last modified date with R

File Manipulation, R
This relatively quick post goes through how to change a file's last modified date with base R.

How to change a file's modified time with R

Let's say we have a file, test.txt. What if we want to change the last modified date of the file (let's suppose the file's not that important)? Let's say, for instance, we want to give the file a last modified date back in the 1980s. We can do that with one line of code. First, let's use file.info to check the current modified date of some file called test.txt.

[code lang="R"]
file.info("test.txt")
[/code]

We can see above by looking at mtime that this file was last modified December 4th, 2018. Now, we can use a function called Sys.setFileTime to change the modified date…
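Since the excerpt ends before the call itself, here is a minimal sketch of the step it describes. It assumes a throwaway temporary file standing in for test.txt and an arbitrarily chosen 1985 timestamp; neither comes from the post.

```r
# Create a throwaway file (hypothetical stand-in for test.txt)
tmp <- tempfile(fileext = ".txt")
writeLines("hello", tmp)

# Check the current modified time
print(file.info(tmp)$mtime)

# Set the modified time to an arbitrary moment in the 1980s
new_time <- as.POSIXct("1985-06-15 12:00:00", tz = "UTC")
Sys.setFileTime(tmp, new_time)

# Confirm the change by reading mtime again
new_mtime <- file.info(tmp)$mtime
print(format(new_mtime, "%Y-%m-%d", tz = "UTC"))
```

Sys.setFileTime takes the file path and a date-time, so the change itself is the single line in the middle of the sketch.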
Read More
10 R functions for Linux commands and vice-versa

File Manipulation, R, System Administration
This post will go through 10 different Linux commands and their R alternatives. If you're interested in learning more R functions for working with files like some of those below, also check out this post.

How to list all the files in a directory

Linux | R | What does it do?
ls | list.files() | Lists all the files in a directory
ls -R | list.files(recursive = TRUE) | Recursively lists all the files in a directory and all sub-directories
ls | grep "something" | list.files(pattern = "something") | Lists all the files in a directory whose names match the regex "something"

R

[code lang="R"]
list.files("/path/to/directory")
list.files("/path/to/directory", recursive = TRUE)

# search for files containing "something" in their name
list.files("/path/to/directory", pattern = "something")

# search for all CSV files (pattern is a regex, so escape the dot and anchor the end)
list.files("/path/to/directory", pattern = "\\.csv$")
[/code]

Linux

[code lang="bash"]
ls /path/to/directory…
Read More
Those “other” apply functions…

R
So you know lapply, sapply, and apply...but...what about rapply, vapply, or eapply? These are generally a little less known as far as the apply family of functions in R go, so this post will explore how they work.

rapply

Let's start with rapply. This function has a couple of different purposes. One is to recursively apply a function to a list. We'll get to that in a moment. The other use of rapply is to apply a function to only those elements in a list (or columns in a data frame) that belong to a specified class. For example, let's say we have a data frame with a mix of categorical and numeric variables, but we want to evaluate a function only on the numeric variables. Use rapply to…
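As a small sketch of the class-based use described above, the data frame below is made up for illustration and is not from the post. The classes argument restricts the function to columns of class "numeric", and how = "unlist" collapses the results into a named vector.

```r
# Illustrative data frame: one character column and two numeric columns
df <- data.frame(
  group = c("a", "b", "c"),
  x = c(1, 2, 3),
  y = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Apply mean only to columns whose class is "numeric";
# the character column is skipped and dropped from the result
numeric_means <- rapply(df, mean, classes = "numeric", how = "unlist")
print(numeric_means)
```

One thing to watch for: a column stored as integer has class "integer", so it would need to be listed in classes as well to be included.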
Read More
How to run R from the Task Scheduler

R, System Administration
In a prior post, we covered how to run Python from the Task Scheduler on Windows. This article is similar, but it'll show how to run R from the Task Scheduler, instead. Similar to before, let's first cover how to run R from the command line, as knowing this is useful for running it from the Task Scheduler.

Running R from the Command Line

To open up the command prompt, just press the Windows key and search for cmd. When R is installed, it comes with a utility called Rscript. This allows you to run R commands from the command line. If Rscript is in your PATH, then typing Rscript into the command line, and pressing enter, will not result in an error. Otherwise, you might get a message saying "'Rscript'…
Read More