c++c++17priority-queueundefined-behaviorconst-cast

Why does const_casting a heap.top() of priority_queue have undefined behavior?


I've made a simple Huffman encoding program to output individual encodings for characters and save the encoded file. This was for an assignment, and I was told that using const_cast on heap.top() is considered undefined behavior if we heap.pop() afterwards, but I'm not sure I understand why.

I've read cppreference regarding the std::pop_heap which is the underlying function called when we call heap.pop() and I believe that a nullptr in the comparison is still defined and understood. It doesn't seem to function abnormally to me when I debugged it.

Here's an example

#include <functional>
#include <queue>
#include <vector>
#include <iostream>
#include <memory>

template<typename T> void print_queue_constcast(T& q) {
    while(!q.empty()) {
        auto temp = std::move(const_cast<int&>(q.top()));
        std::cout << temp << " ";
        q.pop();
    }
    std::cout << '\n';
}

template<typename T> void print_queue(T& q) {
    while(!q.empty()) {
        std::cout << q.top() << " ";
        q.pop();
    }
    std::cout << '\n';
}

int main() {
    std::priority_queue<int> q1;
    std::priority_queue<int> q2;
 
    for(int n : {1,8,5,6,3,4,0,9,7,2}){
        q1.push(n);
        q2.push(n);
    }
        
    print_queue(q1);
    print_queue_constcast(q2);
    
    
}

Could anyone explain what is actually going in the backgroun that'd be undefined behavior or that would cause this to fail under certain circumstances?


Solution

  • tl;dr: Maybe; maybe not.


    Language-level safety

    Like a set, a priority_queue is in charge of ordering its elements. Any modification to an element would potentially "break" the ordering, so the only safe way to do that is via the container's own mutating methods. (In fact, neither one actually provides such a thing.) Directly modifying elements is dangerous. To enforce this, these containers only expose const access to your elements.

    Now, at the language level, the objects won't actually have a static type of const T; most likely they're just Ts. So modifying them (after a const_cast to cheat the type system) doesn't have undefined behaviour in that sense.


    Library-level safety

    However, you are potentially breaking a condition of using the container. The rules for priority_queue don't ever actually say this, but since its mutating operations are defined in terms of functions like push_heap and pop_heap, your use of such operations will break preconditions of those functions if the container's ordering is no longer satisfied after your direct mutation.

    Thus your program will have undefined behaviour if you break the ordering and later mutate the priority_queue in such a way that depends on the ordering being intact. If you don't, technically your program's behaviour is well-defined; however, in general, you'd still be playing with fire. A const_cast should be a measure of last resort.


    So, where do we stand?

    The question is: did you break the ordering? What's the state of the element after moving from it, and is the ordering satisfied by having an object in that state at the top of the queue?

    (The MCVE example with int is safe because std::move on an int has no work to do: it'll just copy the int. So, the ordering remains unaffected.)


    Conclusion

    I would agree with what was presumably your driving rationale, that it is unfortunate pop() doesn't return you the popped thing, which you could then move from. Similar restrictions with sets and maps are why we now have node splicing features for those containers. There is not such a thing for a priority_queue, which is just a wrapper around another container like a vector. If you need more fine-grained control, you can substitute that for your own which has the features you need.

    Anyway, for the sake of a shared_ptr increment (as in your original code), I'd probably just take the hit of the copy, unless you have some really extreme performance requirements. That way, you know everything will be well-defined.

    Certainly, for the sake of an int copy (as in your MCVE), a std::move is entirely pointless (there are no indirect resources to steal!) and you're doing a copy anyway, so the point is rather moot and all you've done is to create more complex code for no reason.

    I would also recommend not writing code where you have to ask whether it's well-defined, even if it turns out it is. That's not ideal for readability or maintainability.