Why are python sets "sorted" in ascending order?

2.2k views Asked by At

Let's run the following code:

st = {3, 1, 2}
st
>>> {1, 2, 3}
st.pop()
>>> 1
st.pop()
>>> 2
st.pop()
>>> 3

Although sets are said to be unordered, this set behaves as if it was sorted in ascending order. The method pop(), that should return an 'arbitrary element', according to the documentation, returns elements in ascending order as well. What is the reason for this?

2

There are 2 answers

1
Bharel On BEST ANSWER

The order correlates to the hash of the object, size of the set, binary representation of the number, insertion order and other implementation parameters. It is completely arbitrary and shouldn't be relied upon:

>>> st = {3, 1, 2,4,9,124124,124124124124,123,12,41,15,}
>>> st
{1, 2, 3, 4, 9, 41, 12, 15, 124124, 123, 124124124124}
>>> st.pop()
1
>>> st.pop()
2
>>> st.pop()
3
>>> st.pop()
4
>>> st.pop()
9
>>> st.pop()
41
>>> st.pop()
12
>>> {1, 41, 12}
{1, 12, 41}
>>> {1, 9, 41, 12}
{1, 12, 9, 41}  # Looks like 9 wants to go after 12.
>>> hash(9)
9
>>> hash(12)
12
>>> hash(41)
41
>>> {1, 2, 3, 4, 9, 41, 12}
{1, 2, 3, 4, 9, 12, 41}  # 12 before 41
>>> {1, 2, 3, 4, 9, 41, 12, 15}  # add 15 at the end
{1, 2, 3, 4, 9, 41, 12, 15}  # 12 after 41
2
Mike Clark On

You are never supposed to rely on a set's ordering. Once you put more than one thing into a set, expect to never be able to rely on a stable ordering between elements. That is the contract that Python needs you to adhere to.

The implementation details may show you behavior that may lead you to believe that you sometimes can predict the ordering of a set. For example, as you have discovered, if you put some low-order descending integers into a set and print the set, they will show up in ascending order. This is probably because the hash of an integer is equal to itself.

hash(1) == 1
hash(2) == 2

(You can make a Python program to confirm this).

An implementation detail of set is that internally it uses a hashing and bucket storage strategy -- the same as Python's dict does. An implementation detail of this is that items are traversed in an ordering based loosely on how the hash codes have been partitioned internally. Thus for some implementations of Python, e.g. CPython, you will sometimes see what looks like deterministic sorting order for integers, because the hash of an integer is so tightly bound to the integer itself (they are the same value), and this interacts with the implementation details of hash traversal.

The contract with set tells you to never rely on this behavior for future-proofing and cross-Python interpreter compatibility.

Another implementation detail is that .pop set will (for some implementations of Python) return the elements of the set in the order which you see the elements. So, if you print a set to see its contents, and then pop all the elements out of that set, they will pop in an order consistent with what you saw when you printed the set.

As an implementation detail, the ordering of a set is "stable" as long as the contents of the set have not changed. You're not supposed to rely on this. Popping is allowed to select an arbitrary element from the set, but not required to.