I'm curious if anyone could explain a branchless binary search implementation to me. I saw it mentioned in a recent question but I can't imagine how it would be implemented. I assume it could be useful to avoid branches if the number of items is quite large.
I'm going to assume you're talking about the sentence "Make a static const
array of all the perfect squares in the domain you want to support, and perform a fast branchless binary search on it." found in this answer.
A "branchless" binary search is basically just an unrolled binary search loop. This only works if you know in advance the number of items in the array you're searching (as you would if it's static const
). You can write a program to write the unrolled code if it's too long to do by hand.
Then, you must benchmark your solution to see whether it really is faster than a loop. If your branchless code is too big, it won't fit inside the CPU's fast instruction cache and will take longer to run than the equivalent loop.