14Module · Goodrich Ch. 13–14

Week 14 — Text Processing & Graph Algorithms

Pattern matching, tries, dynamic programming on strings; graph ADTs, traversals, shortest paths, and minimum spanning trees.

Text Processing

Pattern-matching algorithms (brute force, KMP, Boyer–Moore), tries, and dynamic-programming approaches like edit distance and LCS.

Theory

Brute-force pattern matching is O(n·m); KMP uses a failure function for O(n + m).

A trie indexes a set of strings by character with O(L) lookup, where L is pattern length — used in spell-checkers and autocomplete.

Edit distance and LCS are textbook dynamic-programming problems with O(n·m) tables.

Key points

  • Suffix tries / suffix arrays answer substring queries in O(m).
  • DP recurrences become straightforward once you define the table dimensions and base cases.

Code examples

Edit distance (Levenshtein)
Edit distance (Levenshtein)
def edit_distance(a, b):
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1): dp[i][0] = i
    for j in range(m + 1): dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i-1] == b[j-1] else 1
            dp[i][j] = min(
                dp[i-1][j] + 1,           # delete
                dp[i][j-1] + 1,           # insert
                dp[i-1][j-1] + cost,      # replace
            )
    return dp[n][m]

Graph Algorithms

Graph ADTs and representations, BFS/DFS, shortest paths (Dijkstra), and minimum spanning trees (Prim, Kruskal).

Theory

A graph is a pair (V, E). Adjacency lists use O(V + E) space and are best for sparse graphs; adjacency matrices use O(V²) and are best for dense graphs.

BFS computes shortest paths in unweighted graphs in O(V + E); DFS classifies edges and discovers topological order on DAGs.

Dijkstra's algorithm computes single-source shortest paths in non-negative weighted graphs in O((V + E) log V) using a min-heap.

Prim and Kruskal each compute a minimum spanning tree in O(E log V).

Key points

  • Dijkstra requires non-negative weights — use Bellman–Ford for negative edges.
  • Kruskal needs a disjoint-set (union–find) structure with path compression.
  • BFS shortest path on an unweighted graph is asymptotically optimal.

Code examples

BFS shortest path
BFS shortest path
from collections import deque

def bfs_distance(adj, src):
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist
Dijkstra with a min-heap
Dijkstra with a min-heap
import heapq

def dijkstra(adj, src):
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]: continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist