Python Set Difference - A Complete Beginner Guide

In last week's article, you learned in depth how Python set union() works. This week we'll explore another set method: difference(). With Python set difference, you can easily find the difference between two or more sets. In plain English, that means you get back only the values from the first set that don't appear in any of the other sets.

You'll get a much more in-depth understanding in this article, so continue reading.


Python Set Difference - The Basics

So, what is Python set difference? That's what we'll answer in this section. You'll get a complete understanding of the definition, syntax, and return values through visual examples.

Definition and usage

The set difference() method returns the element(s) of the first set that aren't found in the second set. You can also find the difference between multiple sets - the same logic applies. For simplicity's sake, we'll work with two in the examples below.

Take a look at the following two sets - A and B:

Image 1 - Two sets with programming languages (image by author)

Calculating a difference between these sets means we'll get a new set with a single element - PHP. Why? Because it's the only element of set A that isn't found in set B:

Image 2 - Set difference in action (image by author)

Similarly, B - A would result in Ruby, as that element is specific to set B. Python set difference is oftentimes represented with a Venn diagram. Here's what it looks like:

Image 3 - Set difference as a Venn diagram (image by author)

Elements Python and JavaScript (JS) are common to both sets. We only care about unique elements from the first set when calculating the set difference - that's why only PHP is returned in the new set.

So how do you actually find the difference between sets in Python? Let's go over the syntax of the difference() method to answer that question.

Syntax

# Difference between two sets
set1.difference(set2)

# Difference between multiple sets
set1.difference(set2, set3, ...)

Where:

  • set1 - The set to find the difference from.
  • set2, set3 - Other sets (or any iterables) used to "disqualify" elements from set1

Return value

The difference() method returns a new set containing the elements of the first set that don't appear in any of the sets or other iterable objects passed as arguments. The original set is left unchanged.

If no arguments were passed into the difference() function, a copy of the set is returned.
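
The examples in this article stick to two sets, so here is a minimal sketch of the multi-argument form. The sets A, B, and C below are hypothetical and chosen only for illustration. The second call passes a list to show that the arguments don't have to be sets:

```python
# Hypothetical example sets, used only for illustration
A = {'Python', 'JavaScript', 'PHP', 'Go'}
B = {'JavaScript', 'Python', 'Ruby'}
C = {'Go', 'Rust'}

# Remove every element of A that appears in B or in C
diff_multiple = A.difference(B, C)
print(diff_multiple)  # {'PHP'}

# The arguments can be any iterable, such as a list
diff_iterable = A.difference(['Python', 'JavaScript'])
print(diff_iterable)  # contains 'PHP' and 'Go' (print order may vary)
```

In both calls, A itself is not modified.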


Python Set Difference Function Example

We'll declare two sets, just as in Image 1:

  • A: Contains Python, JavaScript, and PHP
  • B: Contains Python, JavaScript, and Ruby

As you can see, the first two languages are present in both sets. Calculating the difference as A - B should return a new set with only PHP. Likewise, B - A returns a new set with only Ruby:

A = {'Python', 'JavaScript', 'PHP'}
B = {'JavaScript', 'Python', 'Ruby'}

print(f"A - B = {A.difference(B)}")
print(f"B - A = {B.difference(A)}")

Output:

A - B = {'PHP'}
B - A = {'Ruby'}

If you don't specify any parameters to the difference function, a copy of the set is returned:

print(f"A.difference() = {A.difference()}")

Output:

A.difference() = {'JavaScript', 'PHP', 'Python'}

You can verify it was copied by printing each object's identity with id(), which in CPython is its memory address:

A = {'Python', 'JavaScript', 'PHP'}
A_copy = A.difference()

print(hex(id(A)))
print(hex(id(A_copy)))

Output:

0x1107d3f20
0x1107d3d60

You won't see these exact values on your machine, and that's not the point. The important thing is that the two values differ, indicating the set was copied to a different memory address.
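
If you'd rather not compare addresses by eye, the same check can be sketched with the == and is operators. The variable names same_contents and same_object are made up for this example:

```python
A = {'Python', 'JavaScript', 'PHP'}
A_copy = A.difference()

same_contents = A_copy == A  # True: both sets hold the same elements
same_object = A_copy is A    # False: they are two separate objects

# Changing the copy leaves the original untouched
A_copy.add('Ruby')
print(same_contents, same_object, 'Ruby' in A)  # True False False
```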

Let's now explore a shorter way to get the set difference - by using the minus operator.


Python Set Difference Using The - Operator

You don't have to call the difference() function every time. You can use the minus (-) operator instead:

A = {'Python', 'JavaScript', 'PHP'}
B = {'JavaScript', 'Python', 'Ruby'}

print(f"A - B = {A - B}")

Output:

A - B = {'PHP'}

Everything else remains the same. Just remember that both operands must be sets (or frozensets), whereas the difference() method accepts any iterable as its argument.
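
The minus operator can also be chained when more than two sets are involved. Here's a small sketch, again with hypothetical sets, showing that chaining gives the same result as passing several arguments to difference():

```python
# Hypothetical example sets, used only for illustration
A = {'Python', 'JavaScript', 'PHP', 'Go'}
B = {'JavaScript', 'Python', 'Ruby'}
C = {'Go', 'Rust'}

# Evaluated left to right: (A - B) - C
chained = A - B - C
print(f"A - B - C = {chained}")  # A - B - C = {'PHP'}

# Same result as the multi-argument method call
print(chained == A.difference(B, C))  # True
```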


Python Set Difference Common Errors

You're likely to encounter errors when you first start working with sets. These are common, but usually easy to debug.

AttributeError: 'list' object has no attribute 'difference'

This is the most common type of error, and it occurs when you call difference() on the wrong data type. Only sets (and frozensets) provide this method.

Here's an example - an exception is raised if you use lists:

A = ['Python', 'JavaScript', 'PHP']
B = ['JavaScript', 'Python', 'Ruby']

print(f"A - B = {A.difference(B)}")

Output:

Image 4 - No attribute error (image by author)

Make sure the object you call difference() on is a set and you'll be good to go. The argument can be a set or any other iterable.
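
One possible fix, sketched here as a suggestion, is to convert the first list with set() before calling the method. The name result is used only for illustration:

```python
A = ['Python', 'JavaScript', 'PHP']
B = ['JavaScript', 'Python', 'Ruby']

# Convert the calling object to a set; B can stay a list
result = set(A).difference(B)
print(f"A - B = {result}")  # A - B = {'PHP'}
```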

TypeError: unsupported operand type(s) for -: 'set' and 'list'

This error occurs when you try to use shorthand notation (minus sign) on invalid data types. Both must be sets for the minus sign to work. Here's an example:

A = {'Python', 'JavaScript', 'PHP'}
B = ['JavaScript', 'Python', 'Ruby']

print(f"A - B = {A - B}")

Output:

Image 5 - Unsupported operand types error (image by author)

As you can see, A is a set, and B is a list, so the minus sign doesn't work.
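
There are two simple ways around this, sketched below: convert the list to a set before using the minus sign, or fall back to the difference() method, which accepts a list directly. The variable names are illustrative only:

```python
A = {'Python', 'JavaScript', 'PHP'}
B = ['JavaScript', 'Python', 'Ruby']

# Option 1: convert the list so both operands are sets
via_operator = A - set(B)

# Option 2: use the method, which accepts any iterable
via_method = A.difference(B)

print(via_operator, via_method)  # {'PHP'} {'PHP'}
```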


Python Set Difference FAQ

We'll now go over a couple of frequently asked questions (FAQ) regarding Python sets and the Python set difference() method.

What does Python set() do?

The built-in set() function in Python converts any iterable into a set - a collection of distinct elements. The elements must be hashable, so strings, numbers, and tuples work, but lists and dictionaries don't.
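
As a quick illustration, here's a sketch that converts a list and a string. The list of languages is a made-up example:

```python
# A list with repeated values (hypothetical example data)
languages = ['Python', 'JavaScript', 'Python', 'PHP', 'JavaScript']

unique_languages = set(languages)
print(len(languages), len(unique_languages))  # 5 3

# A string is an iterable of characters, so this works too
letters = set('hello')
print(len(letters))  # 4
```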

Can sets have duplicates?

Sets are collections in which repetition and order are ignored - so no, sets can't have duplicates.
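
You can see both properties in a short sketch. The set S is a hypothetical example:

```python
# The repeated 'Python' is silently dropped
S = {'Python', 'Python', 'PHP'}
print(len(S))  # 2

# Adding an element that already exists changes nothing
S.add('PHP')
print(len(S))  # 2

# Order is ignored when comparing sets
print({1, 2, 3} == {3, 2, 1})  # True
```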

Is the set difference operator in Python commutative?

Set difference is not commutative - A - B is not the same as B - A. Here's an example:

A = {1, 2, 3}
B = {3, 4, 5}

print(f"A - B = {A.difference(B)}")
print(f"B - A = {B.difference(A)}")

Output:

A - B = {1, 2}
B - A = {4, 5}

Conclusion

Python set difference is utterly simple to understand. We went through the intuition and definition and built our way towards understanding more advanced usage and typical errors you're bound to see at some point. You have to admit - it was easier than you expected.

I hope that this article has helped you develop a better understanding of the Python set difference function. As always, if you have any questions or comments, please feel free to ask in the comment section below. Happy coding!
