Suppose I would like to expand a string by replacing placeholders from a dictionary. The replacement strings can also contain placeholders:
$pattern = "#a# #b#";
$dict = array("a" => "foo", "b" => "bar #c#", "c" => "baz");
while($match_count = preg_match_all('/#([^#])+#/', $pattern, $matches)) {
for($i=0; $i<$match_count; $i++) {
$key = $matches[1][$i];
if(!isset($dict[$key])) { throw new Exception("'$key' not found!"); }
$pattern = str_replace($matches[0][$i], $dict[$key], $pattern);
}
}
echo $pattern;
This works fine, as long as there are no circular replacement patterns, for example "c" => "#b#"
. Then the program will be thrown into an endless loop until the memory is exhausted.
Is there an easy way to detect such patterns? I'm looking for a solution where the distance between the replacements can be arbitrarily long, eg. a->b->c->d->f->a
Ideally, the solution would also happen while in the loop and not with a separate analysis.
Thanks to the comment by georg and this post, I came up with a solution that converts the patterns into a graph and uses topological sort to check for circular replacement.
Here is my solution:
$dict = array("a" => "foo", "b" => "bar #c#", "c" => "baz #b#");
# Store incoming and outgoing "connections" for each key => pattern replacement
$nodes = array();
foreach($dict as $patternName => $pattern) {
if (!isset($nodes[$patternName])) {
$nodes[$patternName] = array("in" => array(), "out" => array());
}
$match_count = preg_match_all('/#([^#])+#/', $pattern, $matches);
for ($i=0; $i<$match_count; $i++) {
$key = $matches[1][$i];
if (!isset($dict[$key])) { throw new Exception("'$key' not found!"); }
if (!isset($nodes[$key])) {
$nodes[$key] = array("in" => array(), "out" => array());
}
$nodes[$key]["in"][] = $patternName;
$nodes[$patternName]["out"][] = $key;
}
}
# collect leaf nodes (no incoming connections)
$leafNodes = array();
foreach ($nodes as $key => $connections) {
if (empty($connections["in"])) {
$leafNodes[] = $key;
}
}
# Remove leaf nodes until none are left
while (!empty($leafNodes)) {
$nodeID = array_shift($leafNodes);
foreach ($nodes[$nodeID]["out"] as $outNode) {
$nodes[$outNode]['in'] = array_diff($nodes[$outNode]['in'], array($nodeID));
if (empty($nodes[$outNode]['in'])) {
$leafNodes[] = $outNode;
}
}
$nodes[$nodeID]['out'] = array();
}
# Check for non-leaf nodes. If any are left, there is a circular pattern
foreach ($nodes as $key => $node) {
if (!empty($node["in"]) || !empty($node["out"]) ) {
throw new Exception("Circular replacement pattern for '$key'!");
}
}
# Now we can safely do replacement
$pattern = "#a# #b#";
while ($match_count = preg_match_all('/#([^#])+#/', $pattern, $matches)) {
$key = $matches[1][$i];
$pattern = str_replace($matches[0][$i], $dict[$key], $pattern);
}
echo $pattern;