Functional specification
In a sorted array (or vector), find the longest sequences of repeated values. The array is expected to be relatively large (up to a million entries).
Please see the fully functional demo.
Reservation
Just to avoid the first optimization suggestion: the actual code uses my implementation of Incremental upper bound in sorted range instead of std::upper_bound; I just don't want to overcomplicate things here.
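For readers unfamiliar with the idea, here is a minimal sketch of what an incremental upper bound could look like. It is not the implementation mentioned above: the name galloping_upper_bound and the exponential-probing strategy are assumptions made for illustration. It probes at growing distances from the start of a run and then finishes with std::upper_bound on the last bracket, so a run of length k costs roughly O(log k) comparisons rather than O(log n).

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical helper (not the asker's implementation): returns the end of the
// run of values equal to *first in a sorted random-access range.
template <typename RandomIt>
RandomIt galloping_upper_bound(RandomIt first, RandomIt last)
{
    if (first == last) return last;

    using diff_t = typename std::iterator_traits<RandomIt>::difference_type;
    const auto& value = *first;
    const diff_t n = last - first;

    diff_t lo = 0;    // invariant: first[lo] is not greater than value
    diff_t step = 1;
    // Probe at exponentially growing distances while still inside the run.
    while (lo + step < n && !(value < first[lo + step])) {
        lo += step;
        step *= 2;
    }

    // The run ends somewhere in (lo, min(lo + step, n)]; finish with a
    // binary search restricted to that bracket.
    const diff_t hi = std::min(lo + step, n);
    return std::upper_bound(first + lo + 1, first + hi, value);
}
```

Under these assumptions, a call such as galloping_upper_bound(it, last) could stand in for std::upper_bound(it, last, *it) in the loops of the code below.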
Question
Is there a way to express this in terms of C++ Standard Library algorithms, using one of them (or a simple combination), to avoid reinventing the wheel? Something like a simple one-line solution.
I really don’t like this code because of two loops with two passes and the ugly if-else-if, but I don’t see a way to improve it easily.
I still hope some std algorithm, or a combination of them, could do this work much more simply.
Since many people have focused on the code, please note that it is provided only to show in code what the functional specification asks for. Don't focus on it too much. If you do want to suggest improvements, please note that performance is not the key concern, but it matters; so please don't suggest solutions that collect large amounts of data in maps or other dynamic structures. My question is only about whether a simple one-line solution exists.
The code
    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <utility>
    #include <vector>

    // Returns the start offsets of all longest runs of equal values,
    // together with the length of those runs.
    std::pair<std::vector<std::size_t>, std::size_t> get_longest_ranges(auto first, auto last)
    {
        std::ptrdiff_t max_range = 0;
        std::size_t amount_of_ranges = 0;

        // First pass: find the maximum run length and count the runs of that length.
        auto it = first;
        while (it != last) {
            auto it_end = std::upper_bound(it, last, *it);
            if (max_range == it_end - it) {
                ++amount_of_ranges;
            } else if (max_range < it_end - it) {
                max_range = it_end - it;
                amount_of_ranges = 1;
            }
            it = it_end;
        }

        // Second pass: collect the start offset of every run of maximum length.
        std::vector<std::size_t> ranges;
        ranges.reserve(amount_of_ranges);
        it = first;
        while (it != last) {
            auto it_end = std::upper_bound(it, last, *it);
            if (it_end - it == max_range) {
                ranges.push_back(it - first);
            }
            it = it_end;
        }
        return { ranges, static_cast<std::size_t>(max_range) };
    }

    int main() {
        std::vector<int> v = { 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7, 7, 7 };
        auto [starts, length] = get_longest_ranges(v.begin(), v.end());
        std::cout << "Max equal elements range length = " << length;
        for (auto start : starts) {
            std::cout << "\nAt position: " << start << " Values: ";
            for (std::size_t i = start; i < start + length; i++) {
                std::cout << v[i] << " ";
            }
        }
    }
The expected result
For the code above the expected result is:
Max equal elements range length = 4
At position: 3 Values: 2 2 2 2
At position: 15 Values: 7 7 7 7
The algorithm that comes in handy here is std::equal_range.
Note that you do not need to pass over the input vector twice. One pass is sufficient to find all subsequences of the longest size.
    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <iterator>
    #include <vector>

    int main() {
        std::vector<int> v{ 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7, 7, 7 };
        auto begin = v.begin();
        auto end = v.end();
        std::ptrdiff_t longest = 0;
        // Iterators to the first element of every longest run found so far.
        std::vector<std::vector<int>::iterator> m;
        while (begin != end) {
            // [res.first, res.second) is the run of values equal to *begin.
            auto res = std::equal_range(begin, end, *begin);
            auto size = std::distance(res.first, res.second);
            if (size > longest) {
                // A strictly longer run invalidates everything collected so far.
                longest = size;
                m.clear();
                m.push_back(res.first);
            } else if (size == longest) {
                m.push_back(res.first);
            }
            begin = res.second;
        }
        std::cout << longest << "\n";
        // Print the value of each longest run.
        for (const auto& it : m) { std::cout << *it << " "; }
    }
The if-else is just the minimum needed to find the maximum while iterating and keeping track of all runs of maximal length. The rest is standard algorithms, as requested.
One could argue that this is a misuse of std::equal_range, because the code isn't actually using its full capabilities, while std::find or others could do the job as well. However, remember that many algorithms can be customized to the point where their name becomes misleading, and this isn't the case here. I think std::equal_range explains quite well what the loop body is doing.
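To illustrate the remark that other algorithms could do the job as well, here is a minimal sketch that finds run boundaries with std::adjacent_find and std::not_equal_to instead of a binary search. The function name longest_runs is made up for this example, and the return type simply mirrors the one used in the question. Note the trade-off: this is a linear scan that touches every element, whereas the binary-search versions can skip over long runs.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <utility>
#include <vector>

// Illustrative helper (hypothetical name): returns the start offsets of all
// longest runs of equal adjacent values, plus the length of those runs.
template <typename ForwardIt>
std::pair<std::vector<std::size_t>, std::size_t>
longest_runs(ForwardIt first, ForwardIt last)
{
    std::vector<std::size_t> starts;
    std::size_t longest = 0;
    std::size_t offset = 0;  // index of the current run's first element

    while (first != last) {
        // adjacent_find with not_equal_to returns the first element that
        // differs from its successor, i.e. the last element of the current
        // run, or last if the run extends to the end of the range.
        auto run_last = std::adjacent_find(first, last, std::not_equal_to<>{});
        auto run_end = (run_last == last) ? last : std::next(run_last);
        const auto length =
            static_cast<std::size_t>(std::distance(first, run_end));

        if (length > longest) {
            longest = length;
            starts.assign(1, offset);  // drop shorter runs collected so far
        } else if (length == longest) {
            starts.push_back(offset);
        }
        offset += length;
        first = run_end;
    }
    return { std::move(starts), longest };
}
```

Called on the vector from the question, this should report a length of 4 with runs starting at positions 3 and 15, matching the expected result above.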