Secure way to forbid a function from writing to stdout

(if you suspect an XY problem, see details of what I want to do below)

I need a way to automatically test functions (written by students during an exam) to check that they behave correctly. The current system basically runs the function and writes OK at the end, as in:

if (student_function(42) == 150) {
  printf("OK");
}

and an external tool checks whether the output contains OK (our tool can only run binaries and check their stdout). Of course, this is not secure, as a student can write int student_function(int foo) { printf("OK"); exit(1); } and pass all tests. So I wrote some code that redirects stdout to a local pipe using dup2 (see below), but students can still guess the file descriptor that holds the saved copy of stdout and write to it (I tested, and in practice it is a small, easy-to-guess number).

Hence my question: what would be a secure way to redirect all output of a given function inside a C file? Would it make sense to fork the process and use some magic based on seccomp, etc.? (Note that in some cases I may still want to allow the users to create and write new files.) Another solution is to use tokens, but this would not be resilient against students who can inspect the binary (which they can, since they can run any code during execution).

Here is what I wrote so far, but it is insecure:

#include <stdio.h>
#include <unistd.h>

void foo() {
    printf("Hello from foo!\n");
    // Also test "low level" writes
    char buffer[4] = {65, 65, 65, 65};
    int r = write(STDOUT_FILENO, buffer, sizeof(buffer));
    // Attack:
    int r2 = write(5, buffer, sizeof(buffer)); // the important part, the value of saved_stdout is always 5, and anyway easy to guess.
}

int main() {
    int pipefd[2];
    if (pipe(pipefd) == -1) {
        perror("pipe");
        return 1;
    }

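    printf("About to run foo()\n");
    fflush(stdout); // if stdout is not a terminal, this text is still buffered and would otherwise end up in the pipe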
    
    int saved_stdout = dup(STDOUT_FILENO); // Save stdout
    dup2(pipefd[1], STDOUT_FILENO); // redirects stdout to the pipe
    close(pipefd[1]);

    foo(); // foo()'s write(1, ...) and printf go to the pipe instead of stdout… but foo() can still cheat by calling write(5, ...) instead.

    // Needed, otherwise unflushed content would be written to stdout
    // instead of the pipe
    fflush(stdout);
    
    
    // Restore stdout
    dup2(saved_stdout, STDOUT_FILENO);
    close(saved_stdout);

    printf("Finished running foo()\n");

    // Read (part of the) output of foo written in the pipe
    char buffer[256];
    ssize_t n = read(pipefd[0], buffer, sizeof(buffer) - 1);
    if (n > 0) {
        buffer[n] = '\0';
        printf("Captured: %s", buffer);
    }
    close(pipefd[0]);

    return 0;
}

EDIT: after I asked my question, people mentioned that I might be running into an XY problem… So what I'm trying to do, basically, is to write an automated system that securely (i.e. students can't pass tests without writing valid code) checks whether code written by students passes some tests. Importantly, I want to write the tests as unit tests rather than as "run the student's program, check if the output is correct", for multiple reasons:

Not only is it much more practical to write unit tests (otherwise I need to ask students to parse inputs, run different functions based on this input, and serialize the output (data types can be complex, and verification is not trivial), and then I need to deserialize it and test whether the result is correct… not worth the effort and time!),
but most importantly, the verification is quite complex, and I don't want to tell my students to write one test for each verification I want to do (also because this reveals what I'm testing, and I don't want to share this with students). For instance, I may want to check that function X is not deterministic, that function X composed with function Y and function Z gives True, etc., and this is much easier to code in a unit test.

Yet, simply running a unit test of mine that calls the student's code and checking whether the output is Test passed is not secure: a student can just write printf("Test passed"); exit(0); and the test will always pass for no good reason.

Hence my generic question: how can I make verification of code secure and efficient?

NB: even if one can solve my original problem without forbidding a function to write to stdout, I'm still curious to know whether one can do this. I guess it must somehow be possible since this is what shells do all day long…

EDIT2: It seems people want more details on what exactly is given to the students… More precisely, our university uses the Moodle plugin VPL https://vpl.dis.ulpgc.es/documentation/vpl-3.4.3+. This way, the students can execute their own code and run a test suite written by the teacher. In practice, VPL works as follows:

The student submits code in arbitrary files. They can submit it in different modes (buttons on the top line): run mode to run their own code, debug mode to debug their own code, or test mode to run the tests written by the teacher. All these modes can be customized at will by the teacher with arbitrary shell scripts.
Some code is compiled into a binary B. We have lots of freedom here to compile anything we want, since we write the scripts ourselves: our own unit tests importing code from the student, just the student code itself, etc.
The source code is removed if needed (e.g. if the test code contains the solution, this helps to hide it).
The final binary B is executed, and its stdout and stderr are saved.
A trusted script S is then run to check whether the stdout/stderr of B has some properties, typically whether it contains the right string. If B runs a unit test, for instance, S can check whether B printed "Test passed"… but that would not be secure.
The student can read the output of B and S, see their grade (if auto-grading is enabled and the code is run in grading mode), and re-submit their code if they want.

This means that the students effectively have access to the binary file B (e.g. their code can just print all files in the current folder), so they can adapt their code to B. In particular, if B outputs random tokens when a test passes and S checks that B output the correct tokens, a student can simply run strings B to reveal the tokens used in B, and then change their code to printf("SECRETTOKEN"); exit(0);.

The good thing with VPL is that I have lots of freedom to do anything I want, since every step is basically a bash script that I can edit myself as a teacher.

Solution

Currently, there's no trust boundary between the student's function and the verification code: they're in the same process, they're in the same address space, they can call the same APIs, and they have the same access to the file system.

With this arrangement, it's impossible to guarantee a student can't defeat the checker. Anything you could do programmatically to disable or redirect stdout before calling the student function could be undone or bypassed by the student function.

(You might be able to make it difficult enough to be impractical, but that would be treating this like an X-Y problem.)

"I guess it must somehow be possible since this is what shells do all day long…"

The shell accomplishes this by spinning up a new process for each program.

This is what web browsers do, too. Untrusted or unreliable code runs in a separate process with limited access.
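
As a minimal sketch of that idea in C (not a hardened sandbox): the hypothetical helper run_captured() below forks, points the child's stdout and stderr at a pipe, points stdin at /dev/null, and closes every other inherited descriptor before calling the untrusted function. The names untrusted() and run_captured() and the limit of 1024 descriptors are assumptions made for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_FD 1024 /* assumption: the parent holds no descriptor above this */

/* Hypothetical untrusted code: prints, then tries to guess a saved stdout. */
static void untrusted(void) {
    const char msg[] = "from child\n";
    ssize_t r = write(STDOUT_FILENO, msg, sizeof msg - 1);
    for (int fd = 3; fd < 64; fd++)
        r = write(fd, "OK", 2); /* fails with EBADF: nothing is open there */
    (void)r;
}

/* Runs fn() in a child process whose stdout and stderr are a pipe.
 * Stores up to cap-1 bytes of its output in out (NUL-terminated) and
 * returns the number of bytes stored, or -1 on error. */
static ssize_t run_captured(void (*fn)(void), char *out, size_t cap) {
    int p[2];
    if (cap == 0 || pipe(p) == -1)
        return -1;
    fflush(NULL); /* keep pending stdio data out of the child */
    pid_t pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) { /* child: the only place where untrusted code runs */
        int devnull = open("/dev/null", O_RDONLY);
        if (devnull == -1 || dup2(devnull, STDIN_FILENO) == -1 ||
            dup2(p[1], STDOUT_FILENO) == -1 || dup2(p[1], STDERR_FILENO) == -1)
            _exit(127);
        for (int fd = 3; fd < MAX_FD; fd++)
            close(fd); /* drops the pipe originals and anything inherited */
        fn();
        _exit(0);
    }
    close(p[1]); /* parent keeps only the read end */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(p[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(p[0]); /* a child that is still writing now gets SIGPIPE */
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Calling run_captured(untrusted, buf, sizeof buf) leaves only the child's own text in buf: the write(5, ...) style guesses fail because the child no longer holds a descriptor for the real stdout. This does not stop the child from reopening files by path (for example /dev/tty or entries under /proc), so a real grader would likely also need a separate user, a container, or a seccomp filter.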

"How can I make verification of code secure and efficient?"

Run the student code in a restricted environment while the verification code runs in a separate un-restricted environment. I can think of two ways to do that:

Option 1: Run the student's code with a C interpreter that provides limited functionality. You would need to marshal arguments and return values between the verification code and the interpreter.

Option 2: Compile the student's code into a standalone program that enables remote procedure calls. Run the verification program in one process and have it spawn the student code in another. (I believe this is how those online coding challenge sites work.)
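
A rough sketch of Option 2, under simplifying assumptions: the checker forks a child, the child calls the student's function and sends the raw return value back over a dedicated pipe, and the checker alone decides the verdict. The helper names call_isolated() and test_passes(), the sample functions honest() and cheater(), and the use of descriptor 3 for the reply are all hypothetical. A real setup would exec a separately compiled student binary instead of using a bare fork(), so that the checker's expected values are not in the student's address space.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REPLY_FD 3  /* assumption: fixed descriptor for the child's reply */
#define MAX_FD 1024 /* assumption: the parent holds no descriptor above this */

/* Hypothetical submissions: one honest, one that forges the verdict. */
static int honest(int x) { return 2 * x; }
static int cheater(int x) {
    (void)x;
    printf("OK");
    fflush(stdout);
    _exit(0);
}

/* Calls fn(arg) in a child process. Returns 0 and stores the return value
 * in *result, or -1 if the child died, exited early, or sent no reply. */
static int call_isolated(int (*fn)(int), int arg, int *result) {
    int reply[2];
    if (pipe(reply) == -1)
        return -1;
    fflush(NULL); /* keep pending stdio data out of the child */
    pid_t pid = fork();
    if (pid == -1) {
        close(reply[0]);
        close(reply[1]);
        return -1;
    }
    if (pid == 0) { /* child: standard streams go to /dev/null */
        int devnull = open("/dev/null", O_RDWR);
        if (devnull == -1 || dup2(devnull, 0) == -1 || dup2(devnull, 1) == -1 ||
            dup2(devnull, 2) == -1 || dup2(reply[1], REPLY_FD) == -1)
            _exit(127);
        for (int fd = REPLY_FD + 1; fd < MAX_FD; fd++)
            close(fd);
        int value = fn(arg);
        ssize_t w = write(REPLY_FD, &value, sizeof value);
        _exit(w == (ssize_t)sizeof value ? 0 : 1);
    }
    close(reply[1]);
    int value = 0;
    ssize_t n = read(reply[0], &value, sizeof value);
    close(reply[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (n != (ssize_t)sizeof value || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        return -1;
    *result = value;
    return 0;
}

/* The verdict is computed here, in the checker's own process. */
static int test_passes(int (*fn)(int), int arg, int expected) {
    int got;
    return call_isolated(fn, arg, &got) == 0 && got == expected;
}
```

The checker would print OK itself only when test_passes() returns 1; anything the student's code prints goes to /dev/null, and exiting early just produces a missing reply. Forging the reply on descriptor 3 gains nothing beyond returning that value normally, as long as the expected value is not readable from the child. A timeout for submissions that never return is left out of this sketch.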