See also my tutorials on lists and list comprehensions.
A set in Python is an unordered collection of unique elements. Sets are mutable and iterable (more on these properties later). Sets are useful when dealing with a collection of unique elements, e.g. finding the unique elements within a list to check which values are present. The operations built around sets are also handy when you need to perform mathematical set operations. For example, how would you figure out the common elements between two lists? Or which elements are in one list, but not another? With sets, it’s easy!
We can define a set using curly braces, similar to how we define dictionaries.
nums = {10, 20, 30, 40, 50}
We can confirm that nums is indeed a set by using the type function:
type(nums)
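Here is a minimal sketch of what type reports, along with one pitfall worth knowing: empty curly braces create a dictionary, not a set, so an empty set has to be created with set(). The names empty_dict and empty_set are just for illustration.

```python
nums = {10, 20, 30, 40, 50}
print(type(nums))        # <class 'set'>

# Empty curly braces give a dict, not a set
empty_dict = {}
print(type(empty_dict))  # <class 'dict'>

# Use set() with no arguments to get an empty set
empty_set = set()
print(type(empty_set))   # <class 'set'>
```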
We can also convert a list to a set like this:
some_list = [1, 2, 3, 4, 5]
some_set = set(some_list)
By wrapping our list inside set, we can coerce our list into a set. Also, since sets contain only unique elements (i.e. no duplicates), if we convert a list containing duplicate values to a set, the set will give us just the unique elements.
dup_values = [3, 4, 4, 4, 3, 3, 3, 4, 4, 5]
set(dup_values)
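To make the result concrete, here is a small sketch of the same conversion with the output checked. The name unique_values is only for illustration, and sorted is used at the end because a set has no guaranteed order of its own.

```python
dup_values = [3, 4, 4, 4, 3, 3, 3, 4, 4, 5]
unique_values = set(dup_values)

# Only the distinct values survive the conversion
print(unique_values == {3, 4, 5})           # True
print(len(dup_values), len(unique_values))  # 10 3

# sorted() returns a list, which is handy when a fixed order is needed
print(sorted(unique_values))                # [3, 4, 5]
```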
We can test if two sets are equal using the normal “==” operator.
other_nums = {30, 10, 50, 40, 20}
other_nums == nums # returns True
nums == {20, 30} # returns False
Notice that since sets are unordered, it doesn’t matter that other_nums looks like its elements are in a different order. Set equality only takes into account whether the two sets contain exactly the same elements, regardless of order. And that brings us to the next point: there’s no indexing with sets. This means there’s no such thing as the “first” element or “second” element, etc. in a set (unlike a list). Each set is just a container of unique, unordered elements.
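As a quick sketch of both points (the names a, b, same and indexing_worked are just for illustration): order does not affect equality, and trying to index a set raises a TypeError.

```python
# Two sets with the same elements are equal however they were written
a = {10, 20, 30}
b = {30, 10, 20}
same = (a == b)

# Sets do not support indexing, so a[0] raises a TypeError
try:
    a[0]
    indexing_worked = True
except TypeError:
    indexing_worked = False

print(same, indexing_worked)  # True False
```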
To combine, or take the union of sets, we use the union method:
others = {60, 70, 80}
combined = nums.union(others)
To add an element (or elements) to a set, we use the add method.
# add 60 to nums set
nums.add(60)
# add 70 to nums set
nums.add(70)
To get the intersection, or what common elements exist between two sets, we use the intersection method.
nums.intersection(others)
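The result of this call depends on what nums holds at the time. The sketch below rebuilds the example sets from scratch to show the intersection before and after 60 and 70 are added.

```python
nums = {10, 20, 30, 40, 50}
others = {60, 70, 80}

# Before adding anything, the two sets share no elements
print(nums.intersection(others))  # set()

# After adding 60 and 70 to nums, those two values are common to both
nums.add(60)
nums.add(70)
print(nums.intersection(others) == {60, 70})  # True
```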
Now, what if we want to find the common elements between two lists, rather than sets? All we have to do is convert our lists to sets, and we can use the same intersection method.
list1 = [4, 5, 5, 6, 7]
list2 = [5, 7, 8]
set(list1).intersection(set(list2)) # returns {5, 7}
We can get the difference between two sets (i.e. what elements exist in one set, but not the other) using the difference method.
nums.difference(others)
Just like finding the common elements between lists with the intersection method, we can find which elements belong to one list, but not another, by converting the lists into sets and applying the difference method. For example, the code below will return {1, 2} because 1 and 2 are each present in list1, but not list2.
list1 = [1, 2, 3, 4]
list2 = [3, 4, 5]
set(list1).difference(set(list2)) # returns {1, 2}
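If you compare lists like this often, the two conversions can be wrapped in a small helper. The function below is a hypothetical sketch, not something from the standard library; compare_lists and its return order are my own choices.

```python
def compare_lists(first, second):
    """Return (common, only_in_first, only_in_second) as sets."""
    first_set, second_set = set(first), set(second)
    common = first_set.intersection(second_set)
    only_in_first = first_set.difference(second_set)
    only_in_second = second_set.difference(first_set)
    return common, only_in_first, only_in_second


list1 = [1, 2, 3, 4]
list2 = [3, 4, 5]
common, only1, only2 = compare_lists(list1, list2)

print(common == {3, 4})  # True
print(only1 == {1, 2})   # True
print(only2 == {5})      # True
```

Note that difference is not symmetric: the second and third results differ because the order of the two sets matters.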
The symmetric difference between two sets, A and B, is the set of elements that exist only in A or only in B, but not in both. We can illustrate with our example sets.
nums.symmetric_difference(others)
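Each of the four methods covered so far also has an operator shorthand in Python (|, &, - and ^). The sketch below uses two small illustrative sets, a and b, to show that the two forms agree. One difference to keep in mind: the methods accept any iterable, while the operators require both sides to be sets.

```python
a = {10, 20, 30, 40}
b = {30, 40, 50}

print(a | b == a.union(b))                 # True
print(a & b == a.intersection(b))          # True
print(a - b == a.difference(b))            # True
print(a ^ b == a.symmetric_difference(b))  # True

# The symmetric difference is everything in exactly one of the two sets
print(a ^ b == (a - b) | (b - a))          # True

# Methods accept any iterable, such as a list
print(a.intersection([30, 40, 99]) == {30, 40})  # True

# Operators do not: a & [30, 40] raises a TypeError
try:
    a & [30, 40]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
```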
Using the add method above, we can add one element at a time to a set. But what if we want to add a full set of other elements at once? We can do that using the update method. This method works in place, which means we don’t need to assign the result to a variable. Instead, calling nums.update(other_set) will update nums to include whatever elements are in other_set. We can see this in the example below by updating nums to include the elements in the letters set that we define.
letters = {"a", "b", "c"}
nums.update(letters)
If we want to take out these new elements, we can just use the previously mentioned difference method.
nums = nums.difference(letters)
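A common slip with in-place methods is assigning their return value. The sketch below shows that update returns None, and that difference_update is the in-place counterpart of difference. The names result and combined are just for illustration.

```python
nums = {10, 20, 30}
letters = {"a", "b", "c"}

# update modifies nums directly and returns None
result = nums.update(letters)
print(result)                # None
print(len(nums))             # 6

# union leaves the original set alone and returns a new one
combined = {10, 20, 30}.union(letters)
print(combined == nums)      # True

# difference_update removes the letters in place, no reassignment needed
nums.difference_update(letters)
print(nums == {10, 20, 30})  # True
```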
Python has several Boolean methods available for sets. For instance, we can check if two sets are disjoint (i.e. they do not share any common elements) by using the isdisjoint method.
nums.isdisjoint(letters) # returns True
{"a", "d"}.isdisjoint(letters) # returns False
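We can also check whether one set is a subset of another (i.e. A is a subset of B if every element in A is also in B) using the issubset method.
sample = {"a", "c"}
sample.issubset(letters) # returns True
sample.issubset(nums) # returns False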
Similarly, we can also test if a set is a superset of another (i.e. A is a superset of B if every element in B is also in A, which also means that B must be a subset of A). In this case we use the issuperset method.
sample = {"a", "c"}
letters.issuperset(sample) # returns True
letters.issuperset(nums) # returns False
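A short sketch of how the subset and superset checks relate, rebuilding the example sets so it runs on its own. It also shows the comparison operators Python provides for the same tests: <= for subset and < for a proper subset (a subset that is not equal to the other set).

```python
letters = {"a", "b", "c"}
sample = {"a", "c"}

# If sample is a subset of letters, letters is a superset of sample
print(sample.issubset(letters), letters.issuperset(sample))  # True True

# Every set is a subset of itself
print(letters.issubset(letters))  # True

# Operator forms: <= is subset, < is proper subset
print(sample <= letters)   # True
print(sample < letters)    # True
print(letters < letters)   # False
```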
Removing a single element from a set can be done in a couple of ways. One way is using the discard method. This method will not raise an error if the element to be removed is not actually present in the set. The discard method works in place.
mixture = {"a", "b", 1, 2, 3}
# after running this line
# mixture no longer contains "a"
mixture.discard("a")
# nothing happens to mixture
# since the string "something" is not present
# in the set
mixture.discard("something")
A single element can also be removed from a set using the remove method. The difference here is that the remove method will raise an error (a KeyError) if the element to be removed is not present in the set.
# remove "b" from set
mixture.remove("b")
# try to remove "something"
# from set
mixture.remove("something") # error occurs
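If you want remove's strictness without the script stopping, one option is to catch the KeyError. The sketch below is only one way to handle it; the flag name missing is hypothetical.

```python
mixture = {"a", "b", 1, 2, 3}

# discard is silent when the element is absent
mixture.discard("something")

# remove raises KeyError when the element is absent, so catch it
missing = False
try:
    mixture.remove("something")
except KeyError:
    missing = True

print(missing)  # True

# Alternatively, check membership first with the in operator
if "b" in mixture:
    mixture.remove("b")

print(mixture == {"a", 1, 2, 3})  # True
```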
Mutable means that an object can be changed after it is instantiated. Thus, if we create a set, we can change its contents without formally re-defining the set. We’ve already seen examples of this above. For example, consider again adding an element to a set.
# add 60 to nums set
nums.add(60)
# add 70 to nums set
nums.add(70)
Here, we didn’t have to re-define nums. We’re allowed to add an element to the set we’ve already created because sets are mutable. If sets were not mutable, we would not be able to do this. To learn more about mutability, see here.
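One way to see mutability in action is to check that the object itself stays the same while its contents change. In this sketch, alias and snapshot are illustrative names: alias points at the same set, while snapshot is an independent copy made with the copy method.

```python
nums = {10, 20, 30}
alias = nums           # a second name for the same set object
before = id(nums)

nums.add(60)

# Same object as before, just with different contents
print(id(nums) == before)         # True
# The change is visible through the other name too
print(alias == {10, 20, 30, 60})  # True

# copy() creates a separate set that later changes do not affect
snapshot = nums.copy()
nums.add(70)
print(70 in snapshot)             # False
```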
Sets are also iterable, which means we can loop over the elements of any set.
test = {4, 5, 6, 7, 8}
# print each element in test
for num in test:
    print(num)
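Because sets are unordered, the order in which a loop visits the elements should not be relied on. If a predictable order matters, a simple approach (sketched here) is to loop over sorted(test) instead. The names visited and total are just for illustration.

```python
test = {4, 5, 6, 7, 8}

# sorted() returns a list of the elements in ascending order
visited = []
for num in sorted(test):
    visited.append(num)

print(visited)  # [4, 5, 6, 7, 8]

# Functions that consume any iterable work on sets directly
total = sum(test)
print(total)    # 30
```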
Python also supports set comprehensions, which are very similar to list comprehensions, except they use sets instead of lists for their base construct. Check out the tutorial on list comprehensions, and you’ll be able to apply the same concepts to set comprehensions.
Set comprehensions allow you to create a set by looping through either another set or some other iterable object. For example, below we’re able to create a set where every element is two times the corresponding element in the original set, test. This is done by looping through each element in test (calling each element here “num”), and returning num * 2, i.e. the element times two.
test = {4, 5, 6, 7, 8}
# multiply every element in test by 2
{num * 2 for num in test}
Just like list comprehensions, set comprehensions also support looping and filtering in one step. Below we loop through each element in test and output that element to a new set if the element is less than 6 (“num < 6”).
# get the subset of test where each number is less than 6
{num for num in test if num < 6}
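A set comprehension can also loop over a list, and since the result is a set, any duplicate values it produces collapse automatically. Here is a small sketch with a made-up list of words.

```python
words = ["apple", "Avocado", "banana", "blueberry", "cherry"]

# Five words, but only three distinct first letters
first_letters = {w[0].lower() for w in words}
print(first_letters == {"a", "b", "c"})  # True

# Looping and filtering in one step: lengths of words longer than 5 letters
long_lengths = {len(w) for w in words if len(w) > 5}
print(long_lengths == {6, 7, 9})         # True
```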
That’s it for this post. Please check out my other Python posts here.