I have an array of messages with their references to parent message(s), it looks something like this:
array (
'm1' => array ('m9'),
'm5' => array ('m3', 'm4', 'm2'),
'm2' => array ('m1'),
'm3' => array ('m2', 'm1', 'm8'),
'm6' => array ('m7'),
'm4' => array ('m3', 'm2'),
)
Keys of this array are message IDs and values are references to zero or more parent IDs (in any order). Order of IDs can be random and it is not guaranteed that referenced parent ID is in given message set.
What I need to do is to group those messages in a 'threaded view'. So basically I need to transform this array into something like this:
array(
'm1' => array('m1', 'm2', 'm3', 'm4', 'm5'),
'm6' => array('m6')
);
Every message should be assigned to a thread grouped by top-level message. Message is considered to be top-level when it has no reference to parent or when referenced parent not exists in set. Messages 'm1' and 'm6' are top-level because 'm9' and 'm7' are not in the given set. Message 'm3' is in a 'm1' thread despite a reference to non-existing 'm8' - it has other existing parents which links it to 'm1'.
My question is how to do it, and how to do it efficiently? Any help would be appreciated.
UPDATE:
What I came up with is to first reverse those relations, so it becomes:
array (
'm9' => array ('m1'), # this would be rejected
'm3' => array ('m5', 'm4'),
'm4' => array ('m5'),
'm2' => array ('m5', 'm3', 'm4'),
'm1' => array ('m2', 'm3'),
'm8' => array ('m3'), # this would be rejected
'm7' => array ('m6'), # this would be rejected
)
Then I would add keys 'm6' and 'm5' with no children as they exist in input keys but not in the transformed array.
Now I have every relation parent => children that can be found in the input data. After comparing keys of this array with input array I can reject keys 'm9', 'm8' and 'm7' as non-existing.
Finally the array would look like this:
array (
'm3' => array ('m5', 'm4'),
'm4' => array ('m5'),
'm2' => array ('m5', 'm3', 'm4'),
'm1' => array ('m2', 'm3'),
'm6' => array(),
'm5' => array()
)
What I need to do now is to somehow flatten this structure. I need to find every parent p1 that is also a child of another parent p2 and append p1 children to p2 children. I don't know how to do it in another way than iterating over and over again those arrays, but that's not an option here.
This seemed like an interesting challenge for me. What I achieved so far:
First of all, one might get rid of orphans:
$a = array (
'm1' => array ('m9'),
'm5' => array ('m3', 'm4', 'm2'),
'm2' => array ('m1'),
'm3' => array ('m2', 'm1', 'm8'),
'm6' => array ('m7'),
'm4' => array ('m3', 'm2'),
);
$f = array_map(function($v) use (&$a) {
$k = key($a); next($a);
$vals = array_filter($v, function($el) use ($a) {
return isset($a[$el]);
});
return empty($vals) ? [$k] : $vals;
}, $a);
The latter gives an array mapping parents to arrays of children.
Assuming you have your favorite array_flatten
function on hand:
function array_flatten($array, $return) {
for($x = 0; $x <= count($array); $x++) {
if(isset($array[$x]) && is_array($array[$x])) {
$return = array_flatten($array[$x], $return);
} else {
if(isset($array[$x])) {
$return[] = $array[$x];
}
}
}
return $return;
}
now we could use the function below to walk up the tree:
function resolve_parents(&$f) {
array_walk($f, function(&$v, $k) use(&$f) {
if(!is_array($v)) { // great! that’s what we needed
$f[$k] = $v;
} else if(count($v) > 1) { // many children left
$f[$k] = array_unique(
array_flatten(
array_map(function($v) use(&$f) {
return $f[$v];
}, $v),
array())
);
} else { // great, one child left, store it as result
$f[$k] = $v[0];
};
});
}
Well, it gives us the resolution for one step up. By running it as many times as needed (check for that there are no array
as values ⇒ everything is resolved):
function check_result($arr) {
return array_reduce($arr, function($memo, $v) {
return $memo = $memo && !is_array($v); }, true);
}
while(!check_result($f)) resolve_parents($f);
we will finally yield an array:
// Array
// (
// [m1] => m1
// [m5] => m1
// [m2] => m1
// [m3] => m1
// [m6] => m6
// [m4] => m1
// )
Which is apparently the answer to your question.
Unfortunately, the way above is neither elegant nor proven to be efficient. I just left it here in case the code might hint you.
If you have any questions, don’t hesitate to ask.