10Module · Goodrich Ch. 10

Week 10 — Maps, Hash Tables & Skip Lists

Hash functions, collision handling, rehashing, sorted maps, skip lists, and sets/multisets.

Maps, Hash Tables & Skip Lists

The map ADT, hash functions and collision schemes, load factors, sorted maps, and probabilistic skip lists.

Theory

A map stores (key, value) entries. A hash table achieves expected O(1) lookup by mapping keys to bucket indices.

Collision strategies: separate chaining (each bucket holds a list) and open addressing (linear, quadratic, or double hashing).

The load factor α = n / m controls performance; rehash to a larger table when α exceeds a threshold (~0.5 for open addressing, ~0.75 for chaining).

Skip lists achieve expected O(log n) search using randomized levels — simpler to implement than balanced BSTs.

Key points

  • A good hash function spreads keys uniformly across buckets.
  • Python dict uses open addressing with perturbation probing.
  • Sorted maps support range queries that hash tables cannot.

Code examples

Hash map with separate chaining
Hash map with separate chaining
class ChainHashMap:
    def __init__(self, cap=11):
        self._table = [[] for _ in range(cap)]
        self._n = 0

    def _bucket(self, key):
        return self._table[hash(key) % len(self._table)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self._n += 1

    def __getitem__(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)
Frequency map with collections.Counter
Frequency map with collections.Counter
from collections import Counter

words = "to be or not to be".split()
freq = Counter(words)
print(freq.most_common(2))  # [('to', 2), ('be', 2)]