Recently I've observed a behavior of PHP usort that I would like to share for getting an explanation on why this is happening.
When I tried to usort an array and the key used for sorting is not present in any of the array elements, the order is kept only until the array has less or equal to 16 elements.
If the array has more than 17 elements (including), the usort functions arrays' order is not kept.
I've made a code so you can test it yourself on w3schools.
https://www.w3schools.com/php/phptryit.asp?filename=tryphp_func_usort
<!DOCTYPE html>
<html>
<body>
<?php
function my_sort($a, $b) {
return (int)$a['sort_order'] - (int)$b['sort_order'];
}
//populating test array
$array = array();
for ($i=0; $i < 18; $i++) {
$array[$i] = array(
'title' => $i
);
}
foreach ($array as $key => $value) {
echo $key . "=>".$value['title'];
echo "<br>";
}
echo "<br>";
echo "<br>";
usort($array,"my_sort");
foreach ($array as $key => $value) {
echo $key . "=>".$value['title'];
echo "<br>";
}
echo "<br>";
echo "<br>";
echo "<br>";
echo "LESS THAN 17 Elements order is maintened";
echo "<br>";
echo "<br>";
echo "<br>";
$array = array();
for ($i=0; $i < 16; $i++) {
$array[$i] = array(
'title' => $i
);
}
foreach ($array as $key => $value) {
echo $key . "=>".$value['title'];
echo "<br>";
}
echo "<br>";
echo "<br>";
usort($array,"my_sort");
foreach ($array as $key => $value) {
echo $key . "=>".$value['title'];
echo "<br>";
}
?>
</body>
</html>
Is there any possible explanation for this behaviour?
Tested on PHP 7.3 and 8.4, same behavior.
As explained in the comments, notably by @Progman, if you rely on undefined behavior, which was the case on PHP < 8, any result is the "intended" result (even if it doesn't seem logical to you).
But in fact the documentation is kind enough to repeat it explicitely:
Note:
If two members compare as equal, they retain their original order. Prior to PHP 8.0.0, their relative order in the sorted array was undefined.
In computer sciences, "undefined" rarely means "totally random", rather there is some deterministic algorithm behind its implementation, whose logics is different than yours (for example, it may have determined that memory usage benefited from relocating array cells starting from the center before sorting… or that the developer's time was too precious to devote time to the special case where you ask to sort an unsortable array).
To understand the implementation, no other way than reading the source, where you'll probably find the bits explaining why.
And of course once your curiosity is satisfied you'll not rely at all on this knowledge because you've understood that this implementation can change any time.
(in fact I think that the pre-8 version of the aforementioned doc already stated it explicitely)
OK, let's says that by pure curiosity we want to understand why 16, and that we have time to go through the PHP engine source code.
Let's start by uncompressing two copies of the source, one in 7 and one in 8, if possible the nearest version numbers (last 7 and first 8); I have 7.4.33 and 8.1.31 already downloaded, so go with it.
> find php-* -name "*sort*.c"
php-7.4.33/main/mergesort.c
php-7.4.33/ext/intl/collator/collator_sort.c
php-7.4.33/Zend/zend_sort.c
php-8.1.31/ext/intl/collator/collator_sort.c
php-8.1.31/Zend/zend_sort.c
Apart from the uninteresting collator
(relates to string sorting), notice how PHP 7 has two files for sorting while PHP 8 has one.
In other terms: PHP 8 unified its sorting methods (probably to plug every one on the stable sort that can now be proudly announced on usort
's documentation)
Now let's look at mergesort.c
as it was last seen, notably at line 49 (the 5th line if we skip comments):
#define THRESHOLD 16 /* Best choice for natural merge cut-off. */
There you have your 16
which was chosen as a statistical efficiency limit, between