algorithm graph-theory control-flow-graph

Explanation of in/out Live Variable Calculation Algorithm

This slide shows the algorithm for computing in[n] and out[n] for a control-flow graph node. I'm having difficulty understanding how it works though. I have seen a few other variations and also have difficulty understanding them. I have never dealt with fixed point stuff before.

for each n
  in[n] := {}; out[n] = {}
repeat
  for each n
    in’[n] := in[n]; out’[n] := out[n]
    in[n] := use[n] ∪ (out[n] - def[n])
    out[n] := ∪ {in[s] | s ε succ[n]}
  until in’[n] = in[n] and out’[n] = out[n]
  for all n

The question is, what this algorithm is doing / a more intuitive explanation of it. I don't understand what in' and out' are, and what the end condition means (until in'...). Not following the nested loops. My attempt at a JavaScript implementation shows where I am missing pieces:

var in = {}
var out = {}
var in2 = {}
var out2 = {}
var use = {}
var out = {}
var def = {}

for (var i = 0, n = nodes.length; i < n; i++) {
  var node = nodes[i]
  in[node] = []
  out[node] = []
  // assume these are already filled out:
  use[node] = []
  out[node] = []
  def[node] = []
}

while (true) {
  for (var i = 0, n = nodes.length; i < n; i++) {
    var node = nodes[i]
    in2[node] = in[node]
    out2[node] = out[node]
    // assume ∪ and - work on arrays
    in[node] = use[node] ∪ (out[node] - def[node])
    // ? not sure the ∪
    out[node] = ∪ {in[s] | s ε succ[n]}
  }

  // until in’[n] = in[n] and out’[n] = out[n]
  // for all n
}

Any help would be greatly appreciated. Thank you.

Solution

The least fixed-point algorithm applies to situations where you have a finite number of sets whose members come from a finite universe, and where the membership of each set (possibly) depends on the membership of other sets, specifically by including elements from specific other sets.

If the dependencies form a directed acyclic graph (DAG), then there is no problem; the sets can be computed by topologically ordering the sets on the dependency relationship, and then computing the sets in order. (Because of the topological sort, no set depends on a previous set, so by the time the set needs to be computed, all its dependencies have already been computed.)

But if the dependency graph has cycles, a topological sort is not possible, so we do a least fixed-point algorithm instead. We start by setting all the sets to empty, and then process all of them in some order. When we come to a dependency, we just add the elements which happen to be in the dependency at that moment. If any set was modified during this loop, we process all the sets again. (It's not necessary to process them in the same order, but it's usually the easiest way.) And we keep on doing that over and over again until we get through a complete cycle without adding any new element to any set. At this point, we have achieved a consistent set of membership dependencies (a "fixed point") which has no extraneous members (so it's the "least fixed point").

In theory, this algorithm can take a long time, but it must terminate because every cycle involves a fixed number of set computations and (except for the last cycle) adds at least one element to some set. In the worst case, every set includes every element, so there is only a finite number of cycles possible (at most the number of sets multiplied by the number of elements). In practice, for many problem domains, the algorithm runs a lot faster than that, or the product of sets and elements is not too great (or both).

These problems can usually be solved by computing the transitive closure of a relational equation. Since transitive closure algorithms are typically faster (both in terms of theoretical complexity and in the practical execution time of the inner loop), they would be the solution of choice if speed matters. However, the least fixed-point algorithm is easier to understand and the code is somewhat less mysterious.

In the particular algorithm for liveness determination, the set dependency relationships are listed on the previous slide; you can see that each in and out set is defined by some fixed elements and the union of one or more other sets. As shown, the algorithm saves a copy of all the sets at the beginning of a cycle and compares each copy with its corresponding set at the end of the cycyle. If any set has been changed during the cycle, the algorithm has not yet finished.

In practice, it is more common to just set a boolean flag to false at the beginning of the cycle and to true if a union operation causes a new element to be added to a set (which is easy to add to the union operation, but messy to describe in a formal algorithm). Then the algorithm terminates if the boolean is still false at the end of a cycle.