File Manipulation with Python

Python is great for automating file creation, deletion, and other kinds of file manipulation. Two of the primary modules used for these tasks are os and shutil, both of which ship with the standard library. We’ll be covering a few useful highlights from each of these.

import os
import shutil

Batch Folder Creation

If you want to create a handful of folders / directories, it’s not difficult to manually do so.  But creating a few dozen folders manually gets mundane really fast.

The os module contains a function, os.mkdir, that creates a single directory, which is exactly what we need here.

Before you start, you may want to (though it’s not required) change your working directory to the place where the folders should be created:

os.chdir("C:/Users/USERNAME/Documents")

One task I’ve seen several times in the past is creating a folder for each state in the US. We can do this using the us package (a third-party package, installable with pip) together with os:

from us.states import STATES

# get a list of all US state abbreviations
# (depending on the version of us, this may also include DC)
all_states = [x.abbr for x in STATES]

# create a folder for each state
for state in all_states:
    os.mkdir(state)

Above, the STATES object from us.states contains attributes about each state. We get the state abbreviation using the “.abbr” attribute of each element in STATES.

If we wanted to create a list of folders with full state names, rather than abbreviations, we could use the “.name” attribute:

all_states = [x.name for x in STATES]

A similar task is creating a folder for each letter of the alphabet. This might be used to organize files or data related to last names. Here we’ll use the string module to get the letters of the alphabet.

import string

# create a folder for each letter of the alphabet
for letter in string.ascii_uppercase:
    os.mkdir(letter)
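
One caveat with os.mkdir is that it raises FileExistsError when a folder already exists, so running a creation loop a second time stops at the first duplicate. Here is a minimal sketch of a re-runnable version. The helper name make_folders and the temporary demo directory are my own illustrations, not part of os; the sketch relies on os.makedirs with exist_ok=True:

```python
import os
import string
import tempfile

def make_folders(names, parent="."):
    """Create one folder per name under parent; existing folders are left alone."""
    created = []
    for name in names:
        path = os.path.join(parent, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created

# demo: folders A-Z inside a throwaway temporary directory (hypothetical location)
demo_dir = tempfile.mkdtemp()
letter_folders = make_folders(string.ascii_uppercase, parent=demo_dir)
```

Passing parent explicitly also means you don’t have to change the working directory first.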

Batch Folder Deletion

Next, what if you want to delete all the folders and their contents in a directory? You can do this using the shutil.rmtree function (os.unlink only removes individual files, not folders). Be careful though, because the contents of the folders will be lost without going to the Recycle Bin (if you’re on Windows).

If you run into a “permission denied” error on Windows, you may need to run Python as an administrator. This can generally be done by right-clicking the icon of the IDE you’re using (or the command prompt, if you’re running a script from there) and clicking “Run as administrator.”

# os.path.isdir keeps only folders, since shutil.rmtree cannot remove plain files
for folder in filter(os.path.isdir, os.listdir()):
    shutil.rmtree(folder)

Above, we used the os.listdir method to list all of the files / folders in a directory.
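
Because this kind of deletion can’t be undone, it can help to preview what would be removed before removing it. Here is a rough sketch of that idea. The function name delete_subfolders and its dry_run flag are hypothetical, and the demo runs in a throwaway temporary directory rather than a real one:

```python
import os
import shutil
import tempfile

def delete_subfolders(parent=".", dry_run=True):
    """Return the subfolders of parent; delete them only when dry_run is False."""
    targets = sorted(
        entry.path
        for entry in os.scandir(parent)
        if entry.is_dir(follow_symlinks=False)  # skip files and symlinks
    )
    if not dry_run:
        for path in targets:
            shutil.rmtree(path)
    return targets

# demo: one folder and one file in a temporary directory
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "old_reports"))
open(os.path.join(demo_dir, "notes.txt"), "w").close()

preview = delete_subfolders(demo_dir)                 # nothing is deleted yet
deleted = delete_subfolders(demo_dir, dry_run=False)  # now the folder is removed
```

Defaulting dry_run to True means a careless call only lists folders; you have to opt in to the destructive step.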

It’s a relatively common task to want to clean up files that are old.  To accomplish this, we first need to get the age of the files in the directory we want to clean up.

This may be done by either examining the created dates or modified dates of files in a directory.

import time

# get list of files / folders in the current directory
contents = os.listdir()

# get the created time stamp of each file / folder in the current directory
created_times = [time.ctime(os.path.getctime(elt)) for elt in contents]
created_times = [time.strptime(elt) for elt in created_times]

# delete all files / folders created prior to 2017
for index in range(len(created_times)):
    if created_times[index].tm_year < 2017:
        if os.path.isdir(contents[index]):
            shutil.rmtree(contents[index])
        else:
            os.remove(contents[index])

Let’s break this down.

created_times = [time.ctime(os.path.getctime(elt)) for elt in contents]

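Here, we’re using the os.path.getctime function to get the created time stamp of each file / folder (or “elt”) in the root of the current directory. Note that this is the creation time on Windows; on Unix-like systems, getctime returns the time of the last metadata change instead.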

We’re wrapping this call inside of time.ctime to convert the value returned by os.path.getctime to a date string that looks something like this:

‘Fri Aug  4 20:30:24 2017’

This is done because the value returned by os.path.getctime is a float, which represents the number of seconds since January 1, 1970. To read more about how times are represented in Python, see the documentation for the time module.
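
To make the conversion chain concrete, here is a small sketch using an arbitrary example timestamp of my own choosing (it corresponds to 4 August 2017, 20:30:24 in UTC). It also shows time.localtime, which is not used in this post but goes from the float straight to a struct_time without the intermediate string:

```python
import time

timestamp = 1501878624.0  # example value: seconds since January 1, 1970 (UTC)

as_string = time.ctime(timestamp)      # local-time string; exact text depends on your time zone
via_string = time.strptime(as_string)  # struct_time parsed back from that string
direct = time.localtime(timestamp)     # struct_time built directly from the float

print(as_string)
print(via_string.tm_year, direct.tm_year)
```

Both routes give the same year, month, day, hour, minute, and second, so which one to use is mostly a matter of taste.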

Once we have a time stamp string for each element in the directory, we convert each string to a struct_time, which allows us to easily pull out parts of the created date, e.g. the year or month a particular file was created.

You can do this by calling the time.strptime function on each element in the created_times list.

created_times = [time.strptime(elt) for elt in created_times]

Lastly, we loop over the indices on created_times, and delete the corresponding file / folder if the created year is prior to 2017.  In other words, if the 5th element in created_times has a time stamp with a year before 2017, then the 5th element in the contents of the directory (stored in the variable contents) gets deleted.

You can get the year of a struct_time by using its tm_year attribute:

for index in range(len(created_times)):
    if created_times[index].tm_year < 2017:
        if os.path.isdir(contents[index]):
            shutil.rmtree(contents[index])
        else:
            os.remove(contents[index])

So what if we want to delete everything last modified prior to 2017, rather than just created?

We can do almost exactly the same process, except we change os.path.getctime to os.path.getmtime:

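mod_times = [time.ctime(os.path.getmtime(elt)) for elt in contents]
mod_times = [time.strptime(elt) for elt in mod_times]

# delete all files / folders last modified prior to 2017
for index in range(len(mod_times)):
    if mod_times[index].tm_year < 2017:
        if os.path.isdir(contents[index]):
            shutil.rmtree(contents[index])
        else:
            os.remove(contents[index])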
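
As an alternative to converting every time stamp to a string and back, you can compare the raw floats against a cutoff. The following is only a sketch of that approach, not the method used above. The function name modified_before and the demo files are hypothetical, and os.utime is used purely to back-date a test file:

```python
import os
import tempfile
from datetime import datetime

def modified_before(folder, year):
    """Return paths in folder whose last-modified time is before 1 January of year."""
    cutoff = datetime(year, 1, 1).timestamp()  # local midnight, as seconds since the epoch
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if os.path.getmtime(os.path.join(folder, name)) < cutoff
    )

# demo in a throwaway directory: one file back-dated to mid-2015, one left as new
demo_dir = tempfile.mkdtemp()
old_file = os.path.join(demo_dir, "old.txt")
new_file = os.path.join(demo_dir, "new.txt")
for path in (old_file, new_file):
    open(path, "w").close()
stamp_2015 = datetime(2015, 6, 1).timestamp()
os.utime(old_file, (stamp_2015, stamp_2015))  # set access and modified times

stale = modified_before(demo_dir, 2017)
for path in stale:
    os.remove(path)  # these are plain files; folders would need shutil.rmtree
```

Separating “find” from “delete” also gives you a chance to inspect the list before anything is removed.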

Moving Files Around

To move files from one directory to another, you can use the shutil module. Let’s assume our working directory is still set to C:\Users\USERNAME\Documents, and we want to move every file and sub-directory from this folder to a new destination.

Suppose the new folder is C:\NEW_DEST, and that it already exists. We can do this using the shutil.move function:

for file in os.listdir():
    shutil.move(file, "C:/NEW_DEST/" + file)
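
If you’d rather not depend on the current working directory or on the destination already existing, one possible variation looks like the sketch below. The helper name move_all and the temporary folders standing in for Documents and NEW_DEST are made up for this example; it builds paths with os.path.join and creates the destination with os.makedirs:

```python
import os
import shutil
import tempfile

def move_all(src, dest):
    """Move every file and sub-directory from src into dest, creating dest if needed."""
    os.makedirs(dest, exist_ok=True)
    moved = []
    for name in os.listdir(src):
        # shutil.move returns the path of the moved item
        moved.append(shutil.move(os.path.join(src, name), os.path.join(dest, name)))
    return moved

# demo with temporary stand-ins for the source and destination folders
demo_root = tempfile.mkdtemp()
src = os.path.join(demo_root, "Documents")
dest = os.path.join(demo_root, "NEW_DEST")
os.makedirs(os.path.join(src, "subfolder"))
open(os.path.join(src, "report.txt"), "w").close()

moved = move_all(src, dest)
```

This sketch assumes dest is not located inside src; otherwise the loop would try to move the destination folder into itself.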

So that’s an introduction to manipulating files with Python. There’s a lot more you can do with the os and shutil packages, and I may have future posts extending this one.