Finding the dominators of a node is an important first step for numerous algorithms in program analysis. Dominators are used for building an SSA form of a program, eliminating re-computations of expressions, and determining control dependencies between basic blocks.

Background

Node A dominates node B if all paths from the entry node to node B must pass through node A.

E n t r D y A C B

For example, in the above graph, node A dominates node B but does not dominate nodes C or D. All paths to B must go through A: entry, A, B or entry, D, C, A, B. However, there exist paths to C and D that do not go through A: entry, D and entry, D, C.

Trivially, a node is defined as dominating itself. All paths to a node must at some point go through that node so you can consider a node as dominating itself. That doesn’t really give any useful information though.

A strict dominator is just any dominator of a node that is not the node being considered. Additionally, an immediate dominator is the closest strict dominator of a node. The technical definition is that A is the immediate dominator of B if and only if A strictly dominates B and does not dominate any other node that strictly dominates B.

E n t r y A B C

In the above graph, entry, A, and B all dominator C. Only B is the immediate dominator of C.

Finding Dominators

There is a very simple algorithm for finding all of the dominators of all nodes in a graph. The pseudocode is roughly:

dominators = dict()

for node in graph:
  dominators[node] = set(all nodes)

dominators[entry] = {entry}

changed = True
while changed:
  changed = False
  for node in graph:
    new_dom_set = dominators[node]
    for pred in node:
      new_dom_set = dominators[pred] and new_dom_set
    new_dom_set.add(node)
    if new_dom_set != dominators[node]:
      changed = True
      dominators[node] = new_dom_set

The general idea is that the dominators of a node is the intersection of all the dominators of the nodes predecessor and the node itself. You can iteratively find that by setting the dominators of every node to be every node in the graph and then updating the dominators of a node to be the intersection of the dominators of its predecessor unioned with the node. A very important part of the algorithm to remember is that the entry node needs to be explicitly set to have only itself as its sole dominator or else no changes will ever happen because every intersection will just be the intersection of all nodes.

This algorithm is easy to understand and implement but is pretty poor in terms of runtime performance. I’m not sure of the exact runtime of this algorithm but I believe that it is at least n squared and might be as slow as n^4. A more efficient algorithm needs to more effectively iterate over the graph and be more judicious with what kind of information is tracked.

Cooper Harvey Kennedy

Immediate dominators make a much more space efficient way of storing the dominators of a node. If you know the immediate dominator of every node in the graph, then you can just walk that map from a node to get the set of all dominators of a node. Just start the set out as the single node then keep add immediate dominators until the root node is reached.

Instead of trying to calculate the full dominator set for every node, the algorithm can just focus on finding the immediate dominator of each node which reduces the amount of computation that needs to be done. The full dominator set can be computed afterwards for relatively cheap.

Another optimization is to use a reverse postorder traversal of the graph to iterate through each node. A postorder traversal for a DAG is an order such that for any two nodes A and B, if there is a path from A to B then A comes after B in the ordering. A reverse postordering then has the property that if there is a Path from A to B then A comes before B in the ordering.

Reverse postorder is a more effective way to iterate through nodes because determining the dominators of a node requires knowledge of the dominators of its predecessors. So you always want to look at a nodes predecessors before looking at a node. Reverse postorder will guarantee this for a DAG but for a general graph that is not guaranteed.

Now, we’re ready to look at the pseudocode for the algorithm.

for all nodes, b /* initialize the dominators array */
  doms[b] ← Undefined
doms[start node] ← start node
Changed ← true
while (Changed)
  Changed ← false
  for all nodes, b, in reverse postorder (except start node)
    new idom ← first (processed) predecessor of b /* (pick one) */
    for all other predecessors, p, of b
      if doms[p] != Undefined /* i.e., if doms[p] already calculated */
        new idom ← intersect(p, new idom)
  if doms[b] != new idom
    doms[b] ← new idom
    Changed ← true

function intersect(b1, b2) returns node
  finger1 ← b1
  finger2 ← b2
  while (finger1 != finger2)
    while (finger1 < finger2)
      finger1 = doms[finger1]
    while (finger2 < finger1)
      finger2 = doms[finger2]
  return finger1

Reproduced from Cooper, Harvey, and Kennedy1

This is very similar to the naive algorithm except for the reverse postorder traversal of the nodes and the use of a custom intersection function instead of the standard set intersection.

Instead of initializing dominators to every node in the graph, the immediate dominators are initialized to be an invalid value and just the entry node is initialized to have itself as its immediate dominator. The algorithm will then slowly update the nodes to have valid dominators and make the immediate dominator more accurate.

The custom intersection function is a way to find the intersection of two immediate dominators. In the paper, each node is identified by its postorder index. So increasing the node id is finding a node from earlier in the graph. The intersection function is just iterating through immediate dominators until a common immediate dominator node is found. If b1 has a larger value than b2, then b2 is from later in the graph and its immediate dominators should be iterated until a node equal to or larger than b1 is found.

These optimizations bring down the runtime to O(N + E · D) where N is the number of nodes, E the number of edges, and D is the number of nodes in the largest set of dominators for a node.

Implementation

I’ve implemented this algorithm here if you are interested. It is a very faithful recreation of the algorithm. The only change that I made was to use the reverse postorder index to refer to each of the nodes in the graph. So the comparisons in the intersection function are flipped but have the same meaning.


  1. Cooper, Keith D., Timothy J. Harvey, and Ken Kennedy. “A simple, fast dominance algorithm.” Software Practice & Experience 4.1-10 (2001): 1-8. ↩︎