Sets

TL;DR

  • Use the built-in set type when looking for a mutable set.
  • frozenset objects are hashable and can be used as dictionary or set keys.
  • collections.Counter implements multiset or “bag” data structures.

Overview

A set is an unordered collection of objects that does NOT allow duplicate elements.

Typically, sets are used

  • to quickly test a value for membership in the set
  • to insert or delete new values from a set
  • to compute the union or intersection of two sets

In a “proper” set implementation, membership tests are expected to run in fast O(1) time. Union, intersection, difference, and subset operations should take O(n) time on average.

Sets in Python

In Python, the curly-braces set expression syntax and set comprehensions allow you to conveniently define new set instances:

vowels = {'a', 'e', 'i', 'o', 'u'}
squares = {x * x for x in range(10)}
To create an empty set you’ll need to call the set() constructor. Using empty curly-braces {} is ambiguous and will create an empty dictionary instead.

set – Your Go-To Set

This is the built-in set implementation in Python. The set type is mutable and allows for the dynamic insertion and deletion of elements.

Example

>>> vowels = {'a', 'e', 'i', 'o', 'u'}
>>> 'e' in vowels
True

>>> letters = set('alice')
>>> letters.intersection(vowels)
{'a', 'e', 'i'}

>>> vowels.add('x')
>>> vowels
{'i', 'a', 'u', 'o', 'x', 'e'}

>>> len(vowels)
6

frozenset – Immutable Sets

  • An immutable version of set that cannot be changed after it has been constructed.
  • Frozensets are static and only allow query operations on their elements (no inserts or deletions.)
  • As frozensets are static and hashable, they can be used as dictionary keys or as elements of another set.

Example

>>> vowels = frozenset({'a', 'e', 'i', 'o', 'u'}) 
>>> vowels.add('p')
AttributeError:
"'frozenset' object has no attribute 'add'"

# Frozensets are hashable and can
# be used as dictionary keys:
>>> d = { frozenset({1, 2, 3}): 'hello' }
>>> d[frozenset({1, 2, 3})]
'hello'

collections.Counter – Multisets

The collections.Counter class implements a multiset (or bag) type that allows elements in the set to have more than one occurrence.

$\rightarrow$ This is useful if you need to keep track of not only if an element is part of a set, but also how many times it is included in the set

Example

>>> from collections import Counter
>>> inventory = Counter()

>>> loot = {'sword': 1, 'bread': 3}
>>> inventory.update(loot)
>>> inventory
Counter({'bread': 3, 'sword': 1})

>>> more_loot = {'sword': 1, 'apple': 1}
>>> inventory.update(more_loot)
>>> inventory
Counter({'bread': 3, 'sword': 2, 'apple': 1})

You’ll want to be careful when counting the number of elements in a Counter object.

  • Calling len() returns the number of unique elements in the multiset
  • Calling sum() returns the total number of elements
>>> len(inventory)
3  # Number of unique elements

>>> sum(inventory.values())
6  # Total number of elements