Python3.7 dictionary is orderly

  python

Python3.7 dictionary is orderly

Old structure

The dictionary structure before python3.7, the classic and crude hash table implementation, in this way every expansion and contraction of the hash table may cause the hash value to change.

Before and after the hash table capacity update, the relative order between its keys will change, so the elements of the dictionary are out of order. The old structure is similar to the following

--+-------------------------------+
  | 哈希值 (hash)  键 (key)  值 (value)
--+-------------------------------+
0 |    hash0      key0    value0
--+-------------------------------+
1 |    hash1      key1    value1
--+-------------------------------+
2 |    hash2      key2    value2
--+-------------------------------+
. |           ...
__+_______________________________+

New structure

Indices
----------------------------------------------------
None | index0 | None | None | index2 | None | index1 ...
----------------------------------------------------

Entries
--------------------
hash0   key0  value0
---------------------
hash1   key1  value1
---------------------
hash2   key2  value2
---------------------
        ...
---------------------

The new structure consists of Indices (index, array implementation) and Entries (entity, PyDictObject type).

  • When inserting a data, the hash value corresponding to the data is first calculated and mapped into a subscript of the indexes array. if there is no conflict, the other value Entries_index (so called temporarily) is filled into the position corresponding to the subscript in the indexes array. And add a line of records after Entries, similar toHash value, key value, value value. If there is a conflict, the basic solution to the conflict can be used, which will not be repeated here.

This method, dictionaryAdd, delete, change and checkThe time complexity of will change from O(1) to O(2) because there is an extra step of searching. Moreover, when expanding and shrinking the dictionary, the dictionary should be kept in order according to the order of indicators.

But there are at least two optimizations.

  • The dictionary takes up less memory. Old dictionaries always reserve hash positions with a capacity greater than 1/ 3 to prevent too many hash collisions from affecting efficiency. There is no need to reserve it now.
  • The dictionary is in order.

See source code:
dictobject.h

dictobject.c

Note: 2019/07/23