Tags: c++, templates, if-statement, c++11, type-traits

What do compilers do with compile-time branching?


EDIT: I took the "if/else" case as an example that can sometimes be resolved at compile time (e.g. when static values are involved, cf. <type_traits>). Adapting the answers below to other types of static branching (e.g. multiple branches or multi-criteria branches) should be straightforward. Note that compile-time branching using template metaprogramming is not the topic here.


In typical code like this:

#include <type_traits>

template <class T>
T numeric_procedure( const T& x )
{
    if ( std::is_integral<T>::value )
    {
        // Integral types
    }
    else
    {
        // Floating point numeric types
    }
}

will the compiler optimize the if/else statement out when I instantiate the template with specific types later on in my code?

A simple alternative would be to write something like this:

#include <type_traits>

// The helper overloads come first: for built-in types such as int, they
// would not be found by lookup if they were declared after the dispatcher.

template <class T>
T numeric_procedure_impl( const T& x, std::true_type )
{
    // Integral types
}

template <class T>
T numeric_procedure_impl( const T& x, std::false_type )
{
    // Floating point numeric types
}

// ------------------------------------------------------------------------

template <class T>
inline T numeric_procedure( const T& x )
{
    return numeric_procedure_impl( x, std::is_integral<T>() );
}

Is there a difference in terms of performance between these solutions? Are there any objective grounds for saying that one is better than the other? Are there other (possibly better) solutions to deal with compile-time branching?


Solution

  • TL;DR

    There are several ways to get different run-time behavior dependent on a template parameter. Performance is usually equal, so flexibility and maintainability are the main concerns. In all cases, the thin wrappers and constant conditional expressions will be optimized away by any decent compiler in release builds. Below is a small summary of the various tradeoffs (inspired by this answer by @AndyProwl).

    Run-time if

    Your first solution is the simple run-time if:

    template<class T>
    T numeric_procedure(const T& x)
    {
        if (std::is_integral<T>::value) {
            // valid code for integral types
        } else {
            // valid code for non-integral types,
            // must ALSO compile for integral types
        }
    }
    

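    It is simple and effective: any decent compiler will optimize away the dead branch.

    The main disadvantage is that both branches must compile for every T, even the branch that is known not to be taken. If one branch uses an operation that exists only for some types, instantiating the template with any other type is a compile error. Some compilers may also warn about the constant conditional expression (for example, MSVC warning C4127 at high warning levels).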
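To make the constraint in the comments above concrete, here is a minimal sketch using a hypothetical function `halve` (the name and the operation are assumptions for illustration, not part of the question). Both branches happen to be valid for every arithmetic T, so the plain if compiles; the commented-out line shows the kind of code that would break it.

```cpp
#include <type_traits>

// Hypothetical example: halve a value, truncating for integral types.
template <class T>
T halve(const T& x)
{
    if (std::is_integral<T>::value) {
        // return x >> 1;  // would NOT compile for T = double, even though
        //                 // this branch is never taken for double
        return x / 2;      // taken for integral T, but valid for any T
    } else {
        // Taken for non-integral T, but must also compile for integral T.
        return static_cast<T>(x * 0.5);
    }
}
```

Calling `halve(7)` takes the integral branch and `halve(7.0)` the other one; in both instantiations the untaken branch is still compiled and then removed as dead code.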

    Tag dispatching

    Your second approach is known as tag-dispatching:

    template<class T>
    T numeric_procedure_impl(const T& x, std::false_type)
    {
        // valid code for non-integral types,
        // CAN contain code that is invalid for integral types
    }    
    
    template<class T>
    T numeric_procedure_impl(const T& x, std::true_type)
    {
        // valid code for integral types
    }
    
    template<class T>
    T numeric_procedure(const T& x)
    {
        return numeric_procedure_impl(x, std::is_integral<T>());
    }
    

    It works fine, without run-time overhead: the temporary std::is_integral<T>() and the call to the one-line helper function will both be optimized away by any decent compiler.

    The main (minor IMO) disadvantage is the boilerplate: you need three functions instead of one.
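As a concrete illustration of the pattern above, here is a small sketch with a hypothetical function `mod2` (remainder of division by 2; the names `mod2` and `mod2_impl` are assumptions for this example). The integral overload uses `%`, which would not compile for floating-point operands, but it is never instantiated for them.

```cpp
#include <cmath>
#include <type_traits>

// Integral overload: operator% is only valid for integral operands.
template <class T>
T mod2_impl(const T& x, std::true_type)
{
    return x % 2;
}

// Non-integral overload: only instantiated when T is not integral.
template <class T>
T mod2_impl(const T& x, std::false_type)
{
    return std::fmod(x, T(2));
}

// Dispatcher: the type of the tag object selects the overload at
// compile time through ordinary overload resolution.
template <class T>
T mod2(const T& x)
{
    return mod2_impl(x, std::is_integral<T>());
}
```

The helpers are declared before the dispatcher so that the call inside `mod2` can find them for built-in types.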

    SFINAE

    Closely related to tag-dispatching is SFINAE (Substitution Failure Is Not An Error):

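    // The condition is encoded in the TYPE of a non-type template parameter.
    // Two overloads that differ only in a default template argument would
    // be redeclarations of the same template and fail to compile.
    template<class T, typename std::enable_if<!std::is_integral<T>::value, int>::type = 0>
    T numeric_procedure(const T& x)
    {
        // valid code for non-integral types,
        // CAN contain code that is invalid for integral types
    }    
    
    template<class T, typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
    T numeric_procedure(const T& x)
    {
        // valid code for integral types
    }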
    

    This has the same effect as tag-dispatching but works slightly differently. Instead of using overload resolution on a tag argument to select the proper helper overload, it directly manipulates the overload set for your main function.

    The disadvantage is that it can be fragile and tricky if you don't know exactly what the entire overload set is (e.g. in template-heavy code, ADL could pull in more overloads from associated namespaces you didn't think of). And compared to tag-dispatching, selection based on anything other than a binary decision is a lot more involved.
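A minimal runnable sketch of the SFINAE approach, again with a hypothetical function `mod2` (an assumption for illustration). The condition is placed in the type of a non-type template parameter: two overloads that differed only in a default template argument would declare the same template twice, which is an error.

```cpp
#include <cmath>
#include <type_traits>

// Participates in overload resolution only when T is integral.
template <class T,
          typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
T mod2(const T& x)
{
    return x % 2;
}

// Participates in overload resolution only when T is NOT integral.
template <class T,
          typename std::enable_if<!std::is_integral<T>::value, int>::type = 0>
T mod2(const T& x)
{
    return std::fmod(x, T(2));
}
```

For any given T exactly one of the two conditions holds, so exactly one overload remains in the overload set and the call is unambiguous.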

    Partial specialization

    Another approach is to use a class template helper with a function call operator and partially specialize it:

    template<class T, bool> 
    struct numeric_functor;
    
    template<class T>
    struct numeric_functor<T, false>
    {
        T operator()(T const& x) const
        {
            // valid code for non-integral types,
            // CAN contain code that is invalid for integral types
        }
    };
    
    template<class T>
    struct numeric_functor<T, true>
    {
        T operator()(T const& x) const
        {
            // valid code for integral types
        }
    };
    
    template<class T>
    T numeric_procedure(T const& x)
    {
        return numeric_functor<T, std::is_integral<T>::value>()(x);
    }
    

    This is probably the most flexible approach if you want to have fine-grained control and minimal code duplication (e.g. if you also want to specialize on size and/or alignment, but say only for floating point types). The pattern matching given by partial template specialization is ideally suited for such advanced problems. As with tag-dispatching, the helper functors are optimized away by any decent compiler.

    The main disadvantage is the slightly larger amount of boilerplate if you only want to specialize on a single binary condition.
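The same hypothetical `mod2` operation can be sketched with a partially specialized helper class (the names `mod2_functor` and `mod2` are assumptions for this example). The bool parameter is the pattern that partial specialization matches on; more parameters, such as a size, could be added in the same way.

```cpp
#include <cmath>
#include <type_traits>

// Primary template: declared only, every use goes through a specialization.
template <class T, bool IsIntegral>
struct mod2_functor;

// Chosen when the second argument is true (integral types).
template <class T>
struct mod2_functor<T, true>
{
    T operator()(const T& x) const { return x % 2; }
};

// Chosen when the second argument is false (non-integral types).
template <class T>
struct mod2_functor<T, false>
{
    T operator()(const T& x) const { return std::fmod(x, T(2)); }
};

// Thin wrapper that computes the bool and forwards to the functor.
template <class T>
T mod2(const T& x)
{
    return mod2_functor<T, std::is_integral<T>::value>()(x);
}
```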

    if constexpr (proposed for C++1z, adopted in C++17)

    This is a reboot of failed earlier proposals for static if (which is used in the D programming language):

    template<class T>
    T numeric_procedure(const T& x)
    {
        if constexpr (std::is_integral<T>::value) {
            // valid code for integral types
        } else {
            // valid code for non-integral types,
            // CAN contain code that is invalid for integral types
        }
    }
    

    As with your run-time if, everything is in one place, but the branch that is not taken is discarded at compile time and never instantiated for that T, so it may contain code that would not compile for it. You keep all code local and do not need the small helper functions required by tag dispatching or partial template specialization.
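A short sketch of this form, assuming a compiler in C++17 mode or later and using a hypothetical function `mod2` (an assumption for illustration). Unlike the plain run-time if, the integral branch may use `%` even though the template is also instantiated for floating-point types.

```cpp
#include <cmath>
#include <type_traits>

template <class T>
T mod2(const T& x)
{
    if constexpr (std::is_integral<T>::value) {
        // Discarded (not instantiated) when T is not integral,
        // so operator% is never checked against double or float.
        return x % 2;
    } else {
        // Discarded when T is integral.
        return std::fmod(x, T(2));
    }
}
```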

    Concepts (Concepts-Lite Technical Specification, adopted in C++20)

    Concepts-Lite was published as a Technical Specification and, in revised form, became part of C++20 (it did not make it into C++17).

    // Integral and Non_integral stand for concepts assumed to be defined elsewhere
    template<Non_integral T>
    T numeric_procedure(const T& x)
    {
        // valid code for non-integral types,
        // CAN contain code that is invalid for integral types
    }    
    
    template<Integral T>
    T numeric_procedure(const T& x)
    {
        // valid code for integral types
    }
    

    This approach replaces the class or typename keyword inside the template< > brackets with a concept name describing the family of types that the code is supposed to work for. It can be seen as a generalization of the tag-dispatching and SFINAE techniques. Current versions of GCC, Clang and MSVC support the C++20 form of this feature. The Lite adjective distinguishes it from the more ambitious Concepts proposal that was removed from C++11.