I'm using std::multimap
with tens of millions of elements. The key is a pointer, and the data is a shared pointer to a class. Sometimes I need to remove a few elements, sometimes many elements, and sometimes all elements. I'm currently using std::erase_if
and I'm wondering about performance for the case where all elements are removed.
The first part of the question is: If all elements are being removed, would clear
be substantially faster than erase_if
?
I'm guessing the answer is "yes," which leads to the next question. A problem is that, at the point of the erase_if
call, I don't know whether or not all elements will be removed. So, I could do something like count_if
first, and if the count equals the map size then do clear
, and otherwise do the erase_if
. When removing a few elements, I'm guessing the use of count_if
followed by erase_if
would roughly double the time, which I think would be ok. But when removing all elements, how much would the count_if
reduce the gain of using clear
?
(I'm using gcc 12.3.0.)
If all elements are being removed, would
clear
be substantially faster thanerase_if
?
The relative speed depends on the implementation of the standard library used by your compiler and on your predicate. It is reasonable (not guaranteed, but reasonable) to assume that clear
will not be slower than any other approach for emptying a container. Maybe faster, maybe not.
One such "other" approach would be using erase_if
and a predicate that always returns truth, as in
std::erase_if(map, [](auto){ return true; })
You know your predicate; is your predicate substantially slower than return true
? When invoked once for each element of your container? If so, then erase_if
with your predicate will be substantially slower than erase_if
with an always-true predicate, hence (by transitivity) substantially slower than clear
. If not, then (since both clear
and erase_if
have linear complexity) the difference depends on implementation details, which cannot be predicted theoretically; the difference must be measured to be known.
But when removing all elements, how much would the
count_if
reduce the gain of usingclear
?
The gain of using clear
is the cost of invoking your predicate N times (where N is the size of your container) plus an implementation-dependent amount. The cost of count_if
includes invoking your predicate N times. That completely wipes out the theoretical gain, leaving only the implementation-dependent portion, which must be measured to be known.
At a guess, don't try to outsmart the standard library. The fact that std::erase_if
has a specialization for std::multimap
tells me that a good implementation will be more efficient than erasing each element individually. This is probably the gain you are looking for, which you already have by using std::erase_if
instead of writing your own loop. Go ahead and measure to be certain, but I would not expect a gain from using count_if
and clear
, certainly not a significant one.