Faster alternatives to pandas
Background

If you've done any type of data analysis in Python, chances are you've used pandas. It is widely used in the data world, but if you've run into memory or computational issues with it, you're not alone. This post discusses several faster alternatives to pandas.

R's data table in Python

If you've used R, you're probably familiar with the data.table package. A port of this library, datatable, is also available in Python. In this example, we show how you can read in a CSV file faster than with standard pandas. For our purposes, we'll be using an open source dataset from the UCI repository.

[code lang="python"]
import time

import datatable

# Time how long datatable takes to read the CSV file
start = time.time()
os_scan_data = datatable.fread("OS Scan_dataset.csv", header=None)
end = time.time()

# Elapsed time in seconds
print(end - start)
[/code]

Using datatable, we can read in…