I've made a simple Huffman encoding program to output individual encodings for characters and save the encoded file. This was for an assignment, and I was told that using const_cast
on heap.top()
is considered undefined behavior if we heap.pop()
afterwards, but I'm not sure I understand why.
I've read cppreference regarding the std::pop_heap
which is the underlying function called when we call heap.pop()
and I believe that a nullptr
in the comparison is still defined and understood. It doesn't seem to function abnormally to me when I debugged it.
Here's an example
#include <functional>
#include <queue>
#include <vector>
#include <iostream>
#include <memory>
template<typename T> void print_queue_constcast(T& q) {
while(!q.empty()) {
auto temp = std::move(const_cast<int&>(q.top()));
std::cout << temp << " ";
q.pop();
}
std::cout << '\n';
}
template<typename T> void print_queue(T& q) {
while(!q.empty()) {
std::cout << q.top() << " ";
q.pop();
}
std::cout << '\n';
}
int main() {
std::priority_queue<int> q1;
std::priority_queue<int> q2;
for(int n : {1,8,5,6,3,4,0,9,7,2}){
q1.push(n);
q2.push(n);
}
print_queue(q1);
print_queue_constcast(q2);
}
Could anyone explain what is actually going in the backgroun that'd be undefined behavior or that would cause this to fail under certain circumstances?
tl;dr: Maybe; maybe not.
Like a set, a priority_queue is in charge of ordering its elements. Any modification to an element would potentially "break" the ordering, so the only safe way to do that is via the container's own mutating methods. (In fact, neither one actually provides such a thing.) Directly modifying elements is dangerous. To enforce this, these containers only expose const
access to your elements.
Now, at the language level, the objects won't actually have a static type of const T
; most likely they're just T
s. So modifying them (after a const_cast
to cheat the type system) doesn't have undefined behaviour in that sense.
However, you are potentially breaking a condition of using the container. The rules for priority_queue don't ever actually say this, but since its mutating operations are defined in terms of functions like push_heap
and pop_heap
, your use of such operations will break preconditions of those functions if the container's ordering is no longer satisfied after your direct mutation.
Thus your program will have undefined behaviour if you break the ordering and later mutate the priority_queue in such a way that depends on the ordering being intact. If you don't, technically your program's behaviour is well-defined; however, in general, you'd still be playing with fire. A const_cast
should be a measure of last resort.
The question is: did you break the ordering? What's the state of the element after moving from it, and is the ordering satisfied by having an object in that state at the top of the queue?
Your original example uses shared_ptr
s, and we know from the documentation that a moved-from shared_ptr
turns safely into a null pointer.
The default priority_queue
ordering is defined by std::less
, which yields a strict total order over raw pointers; std::less
on a shared_ptr
will actually invoke its base case of operator<
, but that in turn is defined to invoke std::less
on its raw pointer equivalent.
Unfortunately, that doesn't mean that a null shared_ptr is ordered "first": though std::less
's pointer ordering is strict and total, where null pointers land in this ordering is unspecified.
So, it is unspecified as to whether your mutation will break the ordering, and therefore it is unspecified as to whether your pop()
will have undefined behaviour.
(The MCVE example with int
is safe because std::move
on an int
has no work to do: it'll just copy the int
. So, the ordering remains unaffected.)
I would agree with what was presumably your driving rationale, that it is unfortunate pop()
doesn't return you the popped thing, which you could then move
from. Similar restrictions with sets and maps are why we now have node splicing features for those containers. There is not such a thing for a priority_queue, which is just a wrapper around another container like a vector. If you need more fine-grained control, you can substitute that for your own which has the features you need.
Anyway, for the sake of a shared_ptr
increment (as in your original code), I'd probably just take the hit of the copy, unless you have some really extreme performance requirements. That way, you know everything will be well-defined.
Certainly, for the sake of an int
copy (as in your MCVE), a std::move
is entirely pointless (there are no indirect resources to steal!) and you're doing a copy anyway, so the point is rather moot and all you've done is to create more complex code for no reason.
I would also recommend not writing code where you have to ask whether it's well-defined, even if it turns out it is. That's not ideal for readability or maintainability.