cunit-testingsegmentation-faultc-criterion

Segfault trying to access a static string, but only sometimes depending on build environment


Wait, come back, I promise this isn't about uninitialized pointers!

The problem

I have written some unit tests using Criterion. The code under test doesn't matter; the problem crops up in the test itself. Here's a simplified version of the test:

#include <stdio.h>
#include <criterion/criterion.h>
#include <criterion/parameterized.h>

typedef struct {
    char *input;
} paramspec;

TestSuite(Example);

ParameterizedTestParameters(Example, test_example) {
    static paramspec params[] = {
        {"this is a test"},
    };

    size_t nb_params = sizeof (params) / sizeof (paramspec);
    return cr_make_param_array(paramspec, params, nb_params);
}

ParameterizedTest(paramspec *param, Example, test_example) {
    printf("input value is: %s\n", param->input);
}

When compiled in an Ubuntu 22.04 container, using either gcc-11 (11.4.0) or gcc-10 (10.5.0), running this test produces:

[====] Running 1 test from Example:
[RUN ] Example::test_example
[----] test_example.c:20: Unexpected signal caught below this line!
[FAIL] Example::test_example: CRASH!
[====] Synthesis: Tested: 1 | Passing: 0 | Failing: 1 | Crashing: 1

It doesn't say it in the output, but that's a SIGSEGV. If I attach to the test with gdb and print the value of *param, I see error: Cannot access memory at address ...:

Thread 1 "test_example" hit Breakpoint 1, Example_test_example_impl (param=0x7ffff7fa1330)
    at test_example.c:21
21          printf("input value is: %s\n", param->input);
(gdb) p *param
$1 = {input = 0x55abdb8e81ec <error: Cannot access memory at address 0x55abdb8e81ec>}

But!

The mystery thickens

If I build the code under Fedora 34 (which I selected because that includes gcc 11.3.1, which is a close match for 11.4.0), the code works just fine:

[====] Running 1 test from Example:
[RUN ] Example::test_example
input value is: this is a test
[PASS] Example::test_example: (0.00s)
[====] Synthesis: Tested: 1 | Passing: 1 | Failing: 0 | Crashing: 0

And that code runs fine not only in the Fedora environment in which it was built -- it also runs without error in the Ubuntu environment!

In both environments, gdb is able to see the string value:

(gdb) p *param
$1 = {input = 0x41aac9 "this is a test"}

The question

What aspect of the build environment is resulting in segfault? That's just a static string being accessed by code in the same file; there's no pointer allocation to go awry or anything like that.

On the Ubuntu side I've built this with gcc-{9,10,11} and the behavior is identical in all cases.

Building with sanitizers

Building the code with -fsanitize=undefined,address results in the following:

==16==ERROR: AddressSanitizer: SEGV on unknown address 0x5594e31d7400 (pc 0x7f1491d4e086 bp 0x7ffdfe1765b0 sp 0x7ffdfe175cf8 T0)
==16==The signal is caused by a READ memory access.
    #0 0x7f1491d4e086 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167
    #1 0x7f1491cdf2ed in printf_common ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors_format.inc:551
    #2 0x7f1491cdf6cc in __interceptor_vprintf ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:1660
    #3 0x7f1491cdf7c6 in __interceptor_printf ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:1718
    #4 0x555fd432c365 in Example_test_example_impl /src/test_example.c:21
    #5 0x7f1491c18298 in criterion_internal_test_main ../src/core/test.c:97
    #6 0x555fd432c2e7 in Example_test_example_jmp /src/test_example.c:20
    #7 0x7f1491c16849 in run_test_child ../src/core/runner_coroutine.c:230
    #8 0x7f1491c28a92 in bxfi_main ../subprojects/boxfort/src/sandbox.c:57
    #9 0x7f14913ced8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #10 0x7f14913cee3f in __libc_start_main_impl ../csu/libc-start.c:392
    #11 0x555fd432c1a4 in _start (/src/test_example+0x21a4)

AddressSanitizer can not provide additional info.

...but that's pretty much what gdb was telling us earlier. This error only crops up when building on Ubuntu; I don't see a similar error on the Fedora build.

A complete reproducer

If anyone is interested in taking a closer look, I've put together a complete reproducer here, that includes the test code, Makefile, Dockerfile, and a README.


Solution

  • OK, after having the testcase on hand, this seems easy.

    First, I am testing on Debian and noticed that it does crash there.

    Second, quite astonishingly it didn't crash when i ran the test binary under GDB. But I noticed this line:

    [Detaching after fork from child process 3071548]
    

    So there are multiple processes there. And with strace I found it indeed fork()s + exec()s a child and it's the child which crashes.

    So I added a debug printf (don't know why you hadn't done that yourself, from your comment I got the impression that you checked this):

    const char* const test_msg = "this is a test";
    ParameterizedTestParameters(Example, test_example) {
        static paramspec params[] = {
            {test_msg},
        };
    
        printf("param is %p, %llx, %p\n", params, *(intptr_t*)params, test_msg);
        size_t nb_params = sizeof (params) / sizeof (paramspec);
        return cr_make_param_array(paramspec, params, nb_params);
    }
    
    ParameterizedTest(paramspec *param, Example, test_example) {
        printf("param is %p, %llx, %p\n", param, *(intptr_t*)param, test_msg);
    

    and the penny's beginning to drop:

    param is 0x557fdad5e040, 557fdad5c004, 0x557fdad5c004
    param is 0x7f764ba45330, 557fdad5c004, 0x5632b4277004
    

    See how the pointer in the param is the same, but the actual address of the string differs. This is caused by ASLR. So the child has static data on a different address than the parent, but the parent passes (through shared memory, it seems) the pointer, which isn't of much use to the child, verbatim.

    Your best way to fix it is to use, as the Criterion documentation suggests, a dynamically allocated parameter and the function cr_malloc.

    static paramspec params[] = {
        {0},
    };
    params[0].input = cr_malloc(strlen(test_msg)+1);
    strcpy(params[0].input, test_msg);