c++polymorphismdynamic-caststatic-polymorphismcpp-core-guidelines

How to properly use "C++ Core Guidelines: C.146: Use dynamic_cast where class hierarchy navigation is unavoidable"


Motivation

The C++ Core Guidelines recommends using dynamic_cast when "class hierarchy navigation is unavoidable." This triggers clang-tidy to throw the following error: Do not use static_cast to downcast from a base to a derived class; use dynamic_cast instead [cppcoreguidelines-pro-type-static-cast-downcast].

The guidelines go on to say:

Note:

Like other casts, dynamic_cast is overused. Prefer virtual functions to casting. Prefer static polymorphism to hierarchy navigation where it is possible (no run-time resolution necessary) and reasonably convenient.

I have always just used an enum named Kind nested in my base class, and performed a static_cast based on its kind. Reading C++ Core Guidelines, "...Even so, in our experience such "I know what I'm doing" situations are still a known bug source." suggests that I should not be doing this. Often, I don't have any virtual functions so RTTI is not present to use dynamic_cast (e.g. I will get error: 'Base_discr' is not polymorphic). I can always add a virtual function, but that sounds silly. The guideline also says to benchmark before considering using the discriminant approach that I use with Kind.

Benchmark


enum class Kind : unsigned char {
    A,
    B,
};


class Base_virt {
public:
    Base_virt(Kind p_kind) noexcept : m_kind{p_kind}, m_x{} {}

    [[nodiscard]] inline Kind
    get_kind() const noexcept {
        return m_kind;
    }

    [[nodiscard]] inline int
    get_x() const noexcept {
        return m_x;
    }

    [[nodiscard]] virtual inline int get_y() const noexcept = 0;

private:
    Kind const m_kind;
    int m_x;
};


class A_virt final : public Base_virt {
public:
    A_virt() noexcept : Base_virt{Kind::A}, m_y{} {}

    [[nodiscard]] inline int
    get_y() const noexcept final {
        return m_y;
    }

private:
    int m_y;
};


class B_virt : public Base_virt {
  public:
    B_virt() noexcept : Base_virt{Kind::B}, m_y{} {}

  private:
    int m_y;
};


static void
virt_static_cast(benchmark::State& p_state) noexcept {
    auto const a = A_virt();
    Base_virt const* ptr = &a;

    for (auto _ : p_state) {
        benchmark::DoNotOptimize(static_cast<A_virt const*>(ptr)->get_y());
    }
}
BENCHMARK(virt_static_cast);


static void
virt_static_cast_check(benchmark::State& p_state) noexcept {
    auto const a = A_virt();
    Base_virt const* ptr = &a;

    for (auto _ : p_state) {
    if (ptr->get_kind() == Kind::A) {
        benchmark::DoNotOptimize(static_cast<A_virt const*>(ptr)->get_y());
        } else {
            int temp = 0;
        }       
    }
}
BENCHMARK(virt_static_cast_check);


static void
virt_dynamic_cast_ref(benchmark::State& p_state) {
    auto const a = A_virt();
    Base_virt const& reff = a;

    for (auto _ : p_state) {
        benchmark::DoNotOptimize(dynamic_cast<A_virt const&>(reff).get_y());
    }
}
BENCHMARK(virt_dynamic_cast_ref);


static void
virt_dynamic_cast_ptr(benchmark::State& p_state) noexcept {
    auto const a = A_virt();
    Base_virt const& reff = a;

    for (auto _ : p_state) {
        benchmark::DoNotOptimize(dynamic_cast<A_virt const*>(&reff)->get_y());
    }
}
BENCHMARK(virt_dynamic_cast_ptr);


static void
virt_dynamic_cast_ptr_check(benchmark::State& p_state) noexcept {
    auto const a = A_virt();
    Base_virt const& reff = a;

    for (auto _ : p_state) {
        if (auto ptr = dynamic_cast<A_virt const*>(&reff)) {
            benchmark::DoNotOptimize(ptr->get_y());
        } else {
            int temp = 0;
        }
    }
}
BENCHMARK(virt_dynamic_cast_ptr_check);


class Base_discr {
public:
    Base_discr(Kind p_kind) noexcept : m_kind{p_kind}, m_x{} {}

    [[nodiscard]] inline Kind
    get_kind() const noexcept {
        return m_kind;
    }

    [[nodiscard]] inline int
    get_x() const noexcept {
        return m_x;
    }

private:
    Kind const m_kind;
    int m_x;
};


class A_discr final : public Base_discr {
public:
    A_discr() noexcept : Base_discr{Kind::A}, m_y{} {}

    [[nodiscard]] inline int
    get_y() const noexcept {
        return m_y;
    }

private:
    int m_y;
};


class B_discr : public Base_discr {
public:
    B_discr() noexcept : Base_discr{Kind::B}, m_y{} {}

private:
    int m_y;
};


static void
discr_static_cast(benchmark::State& p_state) noexcept {
    auto const a = A_discr();
    Base_discr const* ptr = &a;

    for (auto _ : p_state) {
        benchmark::DoNotOptimize(static_cast<A_discr const*>(ptr)->get_y());
    }
}
BENCHMARK(discr_static_cast);


static void
discr_static_cast_check(benchmark::State& p_state) noexcept {
    auto const a = A_discr();
    Base_discr const* ptr = &a;

    for (auto _ : p_state) {
        if (ptr->get_kind() == Kind::A) {
            benchmark::DoNotOptimize(static_cast<A_discr const*>(ptr)->get_y());
        } else {
            int temp = 0;
        }
    }
}
BENCHMARK(discr_static_cast_check);

I am new to benchmarking, so I don't really know what I am doing. I took care to make sure that virtual and discriminant versions have the same memory layout and tried my best to prevent optimizations. I went with optimization level O1 since anything higher didn't seem representative. discr stands for discriminated or tagged. virt stands for virtual Here are my results:

Benchmark Results

Questions

So, my questions are: How should I cast from a base to a derived type when (1) I know the derived type because I checked it before entering the function and (2) when I do not know the derived type yet. Additionally, (3) Should I even be worried about this guideline, or should I disable the warning? Performance matters here, but sometimes it does not. What should I be using?

EDIT:

Using dynamic_cast seems to be the correct answer for downcasting. However, you still need to know what you are downcasting to and have a virtual function. In many cases, you do not know without a discriminate such as kind or tag what the derived class is. (4) In the case where I already have to check what the kind of object I am looking at, should I still be using dynamic_cast? Is this not checking the same thing twice? (5) Is there a reasonable way to do this without a tag?

Example

Consider the class hierarchy:

class Expr {
public:
    enum class Kind : unsigned char {
        Int_lit_expr,
        Neg_expr,
        Add_expr,
        Sub_expr,
    };

    [[nodiscard]] Kind
    get_kind() const noexcept {
        return m_kind;
    }

    [[nodiscard]] bool
    is_unary() const noexcept {
        switch(get_kind()) {
            case Kind::Int_lit_expr:
            case Kind::Neg_expr:
                return true;
            default:
                return false;
        }
    }

    [[nodiscard]] bool
    is_binary() const noexcept {
        switch(get_kind()) {
            case Kind::Add_expr:
            case Kind::Sub_expr:
                return true;
            default:
                return false;
        }
    }

protected:
    explicit Expr(Kind p_kind) noexcept : m_kind{p_kind} {}

private:
    Kind const m_kind;
};


class Unary_expr : public Expr {
public:
    [[nodiscard]] Expr const*
    get_expr() const noexcept {
        return m_expr;
    }

protected:
    Unary_expr(Kind p_kind, Expr const* p_expr) noexcept :
        Expr{p_kind},
        m_expr{p_expr} {}

private:
    Expr const* const m_expr;
};


class Binary_expr : public Expr {
public:
    [[nodiscard]] Expr const*
    get_lhs() const noexcept {
        return m_lhs;
    }

    [[nodiscard]] Expr const*
    get_rhs() const noexcept {
        return m_rhs;
    }

protected:
    Binary_expr(Kind p_kind, Expr const* p_lhs, Expr const* p_rhs) noexcept :
        Expr{p_kind},
        m_lhs{p_lhs},
        m_rhs{p_rhs} {}

private:
    Expr const* const m_lhs;
    Expr const* const m_rhs;
};


class Add_expr : public Binary_expr {
public:
    Add_expr(Expr const* p_lhs, Expr const* p_rhs) noexcept : 
        Binary_expr{Kind::Add_expr, p_lhs, p_rhs} {}
};

Now in main():

int main() {
    auto const add = Add_expr{nullptr, nullptr};
    Expr const* const expr_ptr = &add;

    if (expr_ptr->is_unary()) {
        auto const* const expr = static_cast<Unary_expr const* const>(expr_ptr)->get_expr();
    } else if (expr_ptr->is_binary()) {
        // Here I use a static down cast after checking it is valid
        auto const* const lhs = static_cast<Binary_expr const* const>(expr_ptr)->get_lhs();
    
        // error: cannot 'dynamic_cast' 'expr_ptr' (of type 'const class Expr* const') to type 'const class Binary_expr* const' (source type is not polymorphic)
        // auto const* const rhs = dynamic_cast<Binary_expr const* const>(expr_ptr)->get_lhs();
    }
}
<source>:99:34: warning: do not use static_cast to downcast from a base to a derived class [cppcoreguidelines-pro-type-static-cast-downcast]

        auto const* const expr = static_cast<Unary_expr const* const>(expr_ptr)->get_expr();

                                 ^

Not always will I need to cast to an Add_expr. For example, I could have a function that prints out any Binary_expr. It only need to cast it to Binary_expr to get the lhs and rhs. To get the symbol of the operator (e.g. '-' or '+' ...) it can switch on the kind. I don't see how dynamic_cast will help me here and I also have no virtual functions to use dynamic_cast on.

EDIT 2:

I have posted an answer making get_kind() virtual, this seems to be a good solution in general. However, I am now carrying around 8 bytes for a vtbl_ptr instead of a byte for a tag. Object instantiated from classes derived from Expr will far exceed any other object types. (6) Is this a good time to skip the vtbl_ptr or should I prefer the safety of dynamic_cast?


Solution

  • I think the important part of this guideline is the part about "where class hierarchy navigation is unavoidable". The basic point here being that, if you're wanting to do this kind of casting a lot, then odds are good that there is something wrong with your design. Either you picked the wrong way to do something or you designed yourself into a corner.

    Overuse of OOP is one example of such a thing. Let's take your example of Expr, which is a node in an expression tree. And you can ask it questions like whether it is a binary operation, unary operation, or a nullary operation (FYI: literal values are nullary, not unary. They take no arguments).

    Where you've overused OOP was in trying to give each operator its own class type. What is the difference between an addition operator and a multiplication operator? Precedence? That's a matter for the grammar; it's irrelevant once you've built the expression tree. The only operation that really cares about the specific binary operator is when you evaluate it. And even when doing evaluation, the only part that's special is when you take the results of the evaluation of the operands and feed it into the code that's going to produce the result of this operation. Everything else is the same for all binary operations.

    So you have one function that is different for various binary operations. If there's just one function that changes, you really don't need different types just for that. It's much more reasonable for the different binary operators to be different values within a general BinaryOp class. The same goes for UnaryOp and NullaryOps.

    So within this example, there are only 3 possible types for any given node. And that's very reasonable to deal with as a variant<NullaryOp, UnaryOp, BinaryOp>. So an Expr can just contain one of those, with each operand type having zero or more pointers to its child Expr elements. There could be a generic interface on Expr for getting the number of children, iterating through the children, etc. And the different Op types can provide the implementations for these through simple visitors.

    Most cases when you start to want to do downcasting and such things are cases that can be solved better and more cleanly using other mechanisms. If you're building hierarchies without virtual functions, where code receiving base classes already knows most or all of the possible derived classes, odds are good that you're really writing a crude form of variant.