
Don’t forget the “utils” package in R

With thousands of powerful packages available, it’s easy to overlook the libraries that come preinstalled with R. This post covers some of the handy functions in the utils package, which ships with a standard installation of R. Alongside familiar functions like read.csv, write.csv, and help, utils contains over 200 others.
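
To get a feel for how much is in the package, you can list the names it exports. This is just a quick sketch; the variable name utils_exports is an illustrative choice, and the exact count will vary with your R version and platform.

```r
# List everything the utils package exports
utils_exports <- sort(getNamespaceExports("utils"))

# How many exported objects are there?
length(utils_exports)

# The familiar functions are in the list
c("read.csv", "write.csv", "help") %in% utils_exports

# Peek at the first few names
head(utils_exports)
```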

readClipboard and writeClipboard

One of my favorite pairs of functions from utils is readClipboard and writeClipboard. If you’re moving data between R and Excel to get a quick answer, these functions come in handy. readClipboard reads in whatever is currently on the clipboard. Note that both functions are available only on Windows.

For example, let’s copy a column of cells from Excel.

We can now run readClipboard() in R. The result of running this command is a vector containing the column of cells we just copied. Each cell corresponds to an element in the vector.
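
Since readClipboard gives back text, copied numbers usually need converting before you can do math with them. Here is a minimal sketch of that step, assuming the copied cells hold plain numbers; cells_to_numeric is a hypothetical helper written for this example, not part of utils.

```r
# Hypothetical helper: turn copied cell text into numbers
cells_to_numeric <- function(cells) {
    as.numeric(trimws(cells))
}

# readClipboard only exists on Windows, so guard the call
if (.Platform$OS.type == "windows") {
    values <- cells_to_numeric(readClipboard())
    print(summary(values))
}

# The same helper applied to a stand-in for clipboard contents
cells_to_numeric(c("10", " 20", "30.5"))
```

The platform check keeps the script from failing on macOS or Linux, where readClipboard is not defined.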

Similarly, if we want to write a vector of elements to the clipboard, we can use the writeClipboard command:


test <- c("write", "to", "clipboard")

writeClipboard(test)

Now, the vector test has been copied to the clipboard. If you paste the result in Excel, you’ll see a column of cells corresponding to the vector you just copied.
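
writeClipboard expects a character vector, so other data needs to be converted to text first. One way to copy more than one column, sketched below, is to join each row with tabs, on the assumption that Excel splits pasted text into cells at tab characters. The data frame and the variable names here are made up for illustration.

```r
# A small made-up data frame to copy
fruit_counts <- data.frame(fruit = c("apple", "pear"), count = c(3, 12))

# Build one tab-separated string per row
rows <- paste(fruit_counts$fruit, fruit_counts$count, sep = "\t")
rows

# writeClipboard only exists on Windows, so guard the call
if (.Platform$OS.type == "windows") {
    writeClipboard(rows)
}
```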

combn

The combn function returns the possible combinations of the elements of an input vector. For instance, to get all of the possible 2-element combinations of a vector, we could do this:


food <- c("apple", "grape", "orange", "pear", "peach", "banana")

combn(food, 2)

In general, the first parameter of combn is the vector of elements you want to get possible combinations from. The second parameter is the number of elements you want in each combination. So if you need to get all possible 3-element or 4-element combinations, you would just need to change this number to three or four.


combn(food, 3)

combn(food, 4)
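
The calls above each return a matrix with one column per combination. The short sketch below inspects that shape for the same food vector; the variable name pairs is just an illustrative choice.

```r
food <- c("apple", "grape", "orange", "pear", "peach", "banana")

pairs <- combn(food, 2)

# One row per element in a combination, one column per combination
dim(pairs)

# The first column holds the first pair
pairs[, 1]

# The number of columns matches the binomial coefficient
sapply(2:4, function(m) ncol(combn(food, m)))
choose(length(food), 2:4)
```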

We can also set the simplify parameter to FALSE to make the function return a list with one element per combination, rather than a matrix like the output above.


combn(food, 3, simplify = FALSE)
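
The list form is convenient when you want to loop over the combinations. As a small sketch, here we collapse each 3-element combination into a single label; the names trios and labels are illustrative.

```r
food <- c("apple", "grape", "orange", "pear", "peach", "banana")

# Each list element is a character vector of length 3
trios <- combn(food, 3, simplify = FALSE)
length(trios)
trios[[1]]

# Collapse every combination into one string
labels <- vapply(trios, paste, character(1), collapse = " + ")
head(labels, 3)
```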

fileSnapshot

The fileSnapshot function is one of R’s collection of file manipulation functions. To learn more about file manipulation and getting information on files in R, check out this post.

fileSnapshot will list and provide details about the files in a directory. This function returns a list of objects.


# get file snapshot of current directory
snapshot <- fileSnapshot()

# or file snapshot of another directory
snapshot <- fileSnapshot("C:/some/other/directory")

We stored the list that fileSnapshot returns in a variable called “snapshot”. The most useful information is in its “info” element:


snapshot$info

Here, snapshot$info is a data frame with one row per file in the input folder. Its columns include:

  • size ==> size of file
  • isdir ==> is file a directory? ==> TRUE or FALSE
  • mode ==> the file permissions in octal
  • mtime ==> last modified time stamp
  • ctime ==> time stamp created (on Unix-alikes, the time of the last status change)
  • atime ==> time stamp last accessed
  • exe ==> type of executable (or “no” if not an executable); Windows only
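
To try these columns out without touching your own files, here is a minimal sketch that builds a throwaway folder and takes a snapshot of it. The folder and file names are made up for the example.

```r
# Create a temporary folder holding two small files
demo_dir <- tempfile("snapshot_demo")
dir.create(demo_dir)
writeLines("a,b\n1,2", file.path(demo_dir, "small.csv"))
writeLines(rep("x", 1000), file.path(demo_dir, "big.txt"))

snapshot <- fileSnapshot(demo_dir)
info <- snapshot$info

# The row names are the file names
rownames(info)

# Sort the files from largest to smallest
info[order(info$size, decreasing = TRUE), c("size", "isdir", "mtime")]
```

Because info is an ordinary data frame, the usual subsetting and sorting tools work on it.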

download.file

download.file does just what it sounds like: it downloads a file from the internet to the destination provided in the function’s input. The first parameter is the URL of the file you wish to download. The second parameter is the path, including the file name, where you want to save the downloaded file. Below, we download a file and call it “census_data.csv”.

    
download.file("https://www2.census.gov/programs-surveys/popest/datasets/2010/2010-eval-estimates/cc-est2010-alldata.csv", "census_data.csv")
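
Downloads can fail, so it can help to check the result before reading the file. download.file returns 0 on success, and the sketch below wraps that check in a hypothetical helper called safe_download. To keep the example runnable offline, it downloads from a local file:// URL built from a temporary file; the URL construction assumes a Unix-style path, and you would swap in a real web address in practice.

```r
# Hypothetical helper: TRUE if the download succeeded, FALSE otherwise
safe_download <- function(url, destfile) {
    status <- tryCatch(
        download.file(url, destfile, mode = "wb", quiet = TRUE),
        error = function(e) 1L,
        warning = function(w) 1L
    )
    identical(as.integer(status), 0L) && file.exists(destfile)
}

# Stand-in for a remote file: a small local CSV served via file://
src <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3), src, row.names = FALSE)
src_url <- paste0("file://", normalizePath(src, winslash = "/"))

dest <- tempfile(fileext = ".csv")
ok <- safe_download(src_url, dest)

# Only read the file if the download worked
if (ok) {
    downloaded <- read.csv(dest)
    print(downloaded)
}
```

Setting mode = "wb" is a cautious choice here, since it avoids mangling binary files on Windows.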
    
    

How to modify an object on the fly with the “fix” function

The utils package also lets you modify objects on the fly with the fix function. For instance, let’s say you define a function interactively and then want to modify it.

    
some_func <- function(num) {
    3 * num + 1
}
    
    

Now, let’s modify the function with fix:


fix(some_func)
    
    

When you call fix, it opens an editor where you can modify the definition of the function. When you close the editor, the edited version is saved in your workspace. You can also call fix to modify a vector or data frame.


fix(iris)
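
Because fix opens an editor, it only makes sense in an interactive session. The sketch below guards the call so the same script can also run non-interactively, and notes the edit-based equivalent in a comment.

```r
some_func <- function(num) {
    3 * num + 1
}

# Check the current behavior before editing
some_func(2)

# fix needs an interactive session because it opens an editor
if (interactive()) {
    # Edit the definition and save it back under the same name
    fix(some_func)

    # Roughly the same thing, spelled out with edit:
    # some_func <- edit(some_func)
}
```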
    
    

That’s it for this post! Please check out my other R posts by clicking here. Also, if you live in the NYC area and are interested in in-person open source coding workshops, please check out my meetup here.

Andrew Treadway
