
In CMake, how do I check whether a string contains an integral number?


I have a string in CMake which I got somehow, in a variable MYVAR. I want to check whether that string is an integral number - possibly with whitespace. I have an ugly way to do it:

string(REGEX MATCH "^[ \t\r\n]*[0-9]+[ \t\r\n]*$" MYVAR_PARSED "${MYVAR}")
if ("${MYVAR_PARSED}" STREQUAL "")
    message(FATAL_ERROR "Oh no!" )
endif()
# Now I can work with MYVAR as a number

Is there a better way? Or should I just wrap this in a function?

Note: I'm using the CMake regex syntax as documented in the "Regex Specification" section of the string() command documentation.


Solution

  • Option 1: Exact match

    You wrote: "I want to check whether that string is an integral number - possibly with whitespace."

    If that is the exact spec you need, then I would check it like so:

    string(STRIP "${MYVAR}" MYVAR_PARSED)
    if (NOT MYVAR_PARSED MATCHES "^[0-9]+$")
        message(FATAL_ERROR "Expected number, got '${MYVAR}'")
    endif ()
    

    This first strips leading and trailing whitespace from MYVAR, storing the result in MYVAR_PARSED. It then checks that MYVAR_PARSED is a non-empty sequence of decimal digits and errors out if it is not. Note that a sign is not accepted, so -5 and +5 are rejected.

    I think doing this ad-hoc is fine, but if you want a function:

    function(ensure_int VAR VALUE)
      string(STRIP "${VALUE}" parsed)
      if (NOT parsed MATCHES "^[0-9]+$")
        message(FATAL_ERROR "Expected number, got '${VALUE}'")
      endif()
      set(${VAR} "${parsed}" PARENT_SCOPE)
    endfunction()
    
    ensure_int(MYVAR_PARSED "${MYVAR}")
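
    If you would rather branch on the result than abort, the same check can be wrapped so that it reports a boolean. This is only a sketch: the name is_integral and its TRUE/FALSE output convention are my own assumptions, not anything built into CMake.

    ```cmake
    # Hypothetical helper (name and convention are assumptions):
    # sets OUTVAR to TRUE if VALUE is a run of decimal digits, optionally
    # surrounded by whitespace, and to FALSE otherwise. Never aborts.
    function(is_integral OUTVAR VALUE)
      string(STRIP "${VALUE}" stripped)
      if (stripped MATCHES "^[0-9]+$")
        set(${OUTVAR} TRUE PARENT_SCOPE)
      else ()
        set(${OUTVAR} FALSE PARENT_SCOPE)
      endif ()
    endfunction()

    is_integral(ok_padded " 42 ")   # surrounding whitespace is stripped
    is_integral(ok_inner "4 2")     # inner whitespace is not allowed
    is_integral(ok_signed "-7")     # a sign is not a digit
    message(STATUS "${ok_padded} ${ok_inner} ${ok_signed}")
    # prints: -- TRUE FALSE FALSE
    ```

    The caller can then write if (NOT ok_padded) and decide for itself whether a bad value is fatal.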
    

  • Option 2: Looser match

    However, the following solution might in some cases be more robust, depending on your requirements:

    math(EXPR MYVAR_PARSED "${MYVAR}")
    

    This interprets the value of MYVAR as a simple mathematical expression over 64-bit signed C integers. Numbers prefixed with 0x are read as hexadecimal, and the supported operators are +, -, *, /, %, |, &, ^, ~, <<, >>, and parentheses.

    Documentation for the math command can be found here: https://cmake.org/cmake/help/latest/command/math.html
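
    To make the behaviour concrete, here is a small sketch of what math(EXPR) does with a few inputs. The variable names a, b and c are just placeholders, and OUTPUT_FORMAT needs CMake 3.13 or newer.

    ```cmake
    # Hex literals and ordinary C precedence: 16 + (2 * 3)
    math(EXPR a "0x10 + 2 * 3")

    # Bit shifts are supported as well
    math(EXPR b "1 << 4")

    # The result can also be rendered in hexadecimal (CMake 3.13+)
    math(EXPR c "0x10 + 2 * 3" OUTPUT_FORMAT HEXADECIMAL)

    message(STATUS "a = ${a}, b = ${b}, c = ${c}")
    # prints: -- a = 22, b = 16, c = 0x16
    ```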

    On the other hand, it might be too permissive: this solution accepts expressions such as 0 + 0x3. That may not matter if you subsequently validate the result, for example by range-checking it. You could, for instance, test if (MYVAR_PARSED LESS_EQUAL 0) and error out if it holds.
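
    A range check after the math(EXPR) call could look roughly like this. The input value and the upper bound of 65535 are made-up assumptions for illustration, so substitute whatever limits make sense for you. LESS_EQUAL requires CMake 3.7 or newer.

    ```cmake
    set(MYVAR "0 + 0x3")   # hypothetical input, accepted by math(EXPR)

    # Normalise the expression to a plain decimal integer
    math(EXPR MYVAR_PARSED "${MYVAR}")

    # Reject anything outside the (assumed) valid range 1..65535
    if (MYVAR_PARSED LESS_EQUAL 0 OR MYVAR_PARSED GREATER 65535)
      message(FATAL_ERROR "Value out of range: '${MYVAR}' = ${MYVAR_PARSED}")
    endif ()

    message(STATUS "MYVAR_PARSED = ${MYVAR_PARSED}")
    # prints: -- MYVAR_PARSED = 3
    ```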

    Safely checking expression validity

    Unfortunately, math(EXPR) issues a fatal error if it cannot parse the expression. To get around this, you can run the command in a subprocess (ugly, but hey). Note that file(CONFIGURE), used below, requires CMake 3.18 or newer:

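    function(value_if_numeric OUTVAR EXPR)
      # Build a run of "=" that is longer than any run of "=" in EXPR, so the
      # bracket argument in the generated script cannot be closed early by
      # the expression's own text.
      string(REGEX MATCHALL "=+" barrier "${EXPR}")
      string(REGEX REPLACE "." "=" barrier "=${barrier}")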
    
      file(
        CONFIGURE OUTPUT _value_if_numeric.cmake
        CONTENT [[
    cmake_minimum_required(VERSION 3.13)
    math(EXPR value [@barrier@[@EXPR@]@barrier@])
    message(STATUS "${value}")
    ]]
        @ONLY
      )
    
      execute_process(
        COMMAND "${CMAKE_COMMAND}" -P "${CMAKE_CURRENT_BINARY_DIR}/_value_if_numeric.cmake"
        OUTPUT_VARIABLE output OUTPUT_STRIP_TRAILING_WHITESPACE
        RESULT_VARIABLE exit_code
        ERROR_QUIET
      )
    
      file(REMOVE "${CMAKE_CURRENT_BINARY_DIR}/_value_if_numeric.cmake")
    
      if (exit_code EQUAL 0 AND output MATCHES "^-- (.+)$")
        set("${OUTVAR}" "${CMAKE_MATCH_1}" PARENT_SCOPE)
      else ()
        set("${OUTVAR}" "NO" PARENT_SCOPE)
      endif ()
    endfunction()
    

    A few test cases:

    value_if_numeric(value "0xF00F00")
    message(STATUS "value = ${value}")
    
    value_if_numeric(value "3 + / 4")
    message(STATUS "value = ${value}")
    
    value_if_numeric(value "[=[foo]=]")
    message(STATUS "value = ${value}")
    

    Produces:

    -- value = 15732480
    -- value = NO
    -- value = NO
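
    One thing to watch when consuming the result: a plain if (value) cannot tell the failure marker apart from a legitimate zero, because CMake treats both 0 and NO as false. Comparing against the marker explicitly avoids that. The sketch below uses hard-coded stand-ins for results of value_if_numeric rather than calling it.

    ```cmake
    # Stand-ins for values that value_if_numeric might have produced
    foreach (result IN ITEMS "0" "NO" "42")
      # if (result) would be false for both "0" and "NO", so compare
      # against the failure marker explicitly instead.
      if (result STREQUAL "NO")
        message(STATUS "'${result}': not numeric")
      else ()
        message(STATUS "'${result}': numeric")
      endif ()
    endforeach ()
    # prints:
    # -- '0': numeric
    # -- 'NO': not numeric
    # -- '42': numeric
    ```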