Multiple embedded Perl instances in a multithreaded environment

Currently I am developing an ASP.NET application that for legacy reasons needs to execute some Perl scripts. For this I wrote a small C++ library that uses the embedded Perl API. This library has one single (C) entry point that allows the C# code to execute a script while passing command line arguments and an environment. This environment allows the C# code to mimic a CGI call for the Perl script.

Now I am seeing something strange. The C# application allows a limited number of Perl scripts to run concurrently. For each call a different environment is created and passed to my C++ library, which in turn passes it to the perl_parse function. What I notice is that the environment seen by the Perl script does not match the environment that was passed in, but seems to be an old copy. So my question is: am I overlooking something? Is there a special way to run multiple embedded Perl interpreters concurrently? Note that the issue is still there when I limit the number of concurrent threads to 1.

The relevant parts of the C++ library are as follows:

extern "C" __declspec(dllexport) BOOL ExecutePerlScript(PCSTR* environmentVariables,
                                                        PCSTR path)
{
    BOOL result(FALSE);

    // Create the Perl interpreter
    PerlInterpreter* my_perl(perl_alloc());
    if (NULL != my_perl)
    {
        PERL_SET_CONTEXT(my_perl);
        PL_perl_destruct_level = 1;
        perl_construct(my_perl);
        PL_origalen = 1;
        PL_exit_flags |= PERL_EXIT_DESTRUCT_END;

        // Initialize the Perl interpreter
        result = (perl_parse(my_perl,
                             XsInit,
                             NR_DEFAULT_ARGUMENTS,
                             DEFAULT_ARGUMENTS,
                             const_cast<char**>(environmentVariables)) == 0) ? TRUE : FALSE;

        // Run the interpreter
        if (result)
        {
            result = (perl_run(my_perl) == 0) ? TRUE : FALSE;
        }

        if (result)
        {
            result = LoadFile(path,
                              my_perl);
        }

        if (result)
        {
            // Execute the Perl script
            eval_pv("eval \"$" SCRIPT_TO_EVALUATE_VARIABLE_NAME "; 1\" or do { $" SCRIPT_EXECUTION_ERROR_VARIABLE_NAME " = $@; }",
                    TRUE);
        }

        // Destruct the interpreter
        PL_perl_destruct_level = 1;
        perl_destruct(my_perl);
        perl_free(my_perl);
    }

    return result;
}

extern "C" BOOL WINAPI DllMain(HINSTANCE hinstDLL,
                               DWORD fdwReason,
                               LPVOID lpvReserved)
{
    BOOL result(FALSE);

    switch (fdwReason)
    {
    case DLL_PROCESS_ATTACH:
        if (0 == g_initCount)
        {
            PERL_SYS_INIT3(0,
                           NULL,
                           NULL);
        }
        g_initCount++;
        result = TRUE;
        break;
    case DLL_PROCESS_DETACH:
        if (g_initCount > 0)
        {
            g_initCount--;
            if (0 == g_initCount)
            {
                PERL_SYS_TERM();
            }
        }
        result = TRUE;
        break;
    }

    return result;
}

The format of environmentVariables in the above snippet is an array of char* where each element is in the form <variable name>=<variable value> and the last element of the array is NULL.
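As an illustration of that layout, here is a minimal sketch of how such a NULL-terminated array could be built on the native side. EnvironmentBlock is a hypothetical helper that is not part of the library shown above, and the assumption is that variable names never contain '='.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original library): owns the
// "NAME=value" strings and exposes them as a NULL-terminated array.
class EnvironmentBlock
{
public:
    explicit EnvironmentBlock(const std::map<std::string, std::string>& variables)
    {
        m_entries.reserve(variables.size());
        for (const auto& variable : variables)
        {
            // Assumption: a name containing '=' could not be split back unambiguously.
            assert(variable.first.find('=') == std::string::npos);
            m_entries.push_back(variable.first + "=" + variable.second);
        }

        // Take the pointers only after all strings exist, so that no later
        // growth of m_entries can invalidate them.
        m_pointers.reserve(m_entries.size() + 1);
        for (const auto& entry : m_entries)
        {
            m_pointers.push_back(entry.c_str());
        }
        m_pointers.push_back(nullptr); // terminating NULL element
    }

    // Copying would leave m_pointers referring to the other object's strings.
    EnvironmentBlock(const EnvironmentBlock&) = delete;
    EnvironmentBlock& operator=(const EnvironmentBlock&) = delete;

    // The returned array is valid for as long as this object is alive.
    const char** data() { return m_pointers.data(); }

    // Number of variables, not counting the terminating NULL.
    std::size_t size() const { return m_entries.size(); }

private:
    std::vector<std::string> m_entries;
    std::vector<const char*> m_pointers;
};
```

The result of data() has the same shape as the environmentVariables parameter, so under these assumptions it could be handed to ExecutePerlScript as long as the EnvironmentBlock outlives the call.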

The Perl script I run is as follows:

use strict;

use CGI qw/:standard/;

print "---- ENVIRONMENT ----\n";
for my $env (sort keys %ENV)
{
    print "$env = $ENV{$env}\n";
}
print "\n";

For example, one of the executions (in a loop) passes the following environment to the C++ function:

- AUTH_TYPE =
- CONTENT_LENGTH = 47
- CONTENT_TYPE = application/x-www-form-urlencoded
- GATEWAY_INTERFACE = CGI/1.1
- PATH_INFO = /test.pl
- PATH_TRANSLATED = E:\Perl\PerlTestApplication\test.pl
- QUERY_STRING = lang=nl
- REMOTE_ADDR = 1.2.3.4
- REMOTE_HOST = remote.host
- REMOTE_USER =
- REQUEST_METHOD = POST
- SCRIPT_NAME = /test.pl
- SERVER_NAME = example.domain
- SERVER_PORT = 443
- SERVER_PROTOCOL = HTTP/1.1
- SERVER_SOFTWARE = Microsoft-IIS/10.0

and then the script prints the following environment:

---- ENVIRONMENT ----
AUTH_TYPE =
CONTENT_LENGTH = 45
CONTENT_TYPE = application/x-www-form-urlencoded
GATEWAY_INTERFACE = CGI/1.1
PATH_INFO = /test.pl
PATH_TRANSLATED = E:\Perl\PerlTestApplication\test.pl
QUERY_STRING = lang=nl
REMOTE_ADDR = 1.2.3.4
REMOTE_HOST = remote.host
REMOTE_USER =
REQUEST_METHOD = POST
SCRIPT_NAME = /test.pl
SERVER_NAME = example.domain
SERVER_PORT = 443
SERVER_PROTOCOL = HTTP/1.1
SERVER_SOFTWARE = Microsoft-IIS/10.0

As can be seen, the value of the CONTENT_LENGTH variable differs: the script sees 45, which is the value from an environment passed in an earlier call, instead of the 47 that was passed this time. So somehow the environment that I pass to the new Perl interpreter instance is not applied, and an older environment is still used. I already call PERL_SET_CONTEXT to set the context in the current thread right after allocating the interpreter, but that doesn't seem to be enough.

I have tried this with both an ActivePerl installation of Perl 5.24 and a Strawberry Perl installation of Perl 5.30, and both give the same erroneous result.

What am I doing wrong?

Solution

In the Perl source code, specifically in perl_parse, the env parameter doesn't appear to be used. So my guess is that it reuses the current process's environment. To get around this, you could try something like this:

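// Needs <string>. Run this after perl_parse and before evaluating the script.
HV* envHV = get_hv("main::ENV", 0);

if (envHV) {
    for (int i = 0; environmentVariables[i]; i++) {
        const std::string entry(environmentVariables[i]);
        const std::string::size_type pos = entry.find('=');
        if (pos == std::string::npos) {
            continue; // not in <variable name>=<variable value> form, skip it
        }
        const std::string key = entry.substr(0, pos);
        const std::string value = entry.substr(pos + 1);

        SV* nsv = newSVpvn(value.c_str(), value.length());
        // Passing 0 as the hash lets hv_store compute it for us.
        // If the store fails, we still own nsv and have to release it.
        if (!hv_store(envHV, key.c_str(), static_cast<I32>(key.length()), nsv, 0)) {
            SvREFCNT_dec(nsv);
        }
    }
}

eval_pv("eval \"$" SCRIPT_TO_EVALUATE_VARIABLE_NAME "; 1\" or do { $" SCRIPT_EXECUTION_ERROR_VARIABLE_NAME " = $@; }", TRUE);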

Here we get the %ENV hash, parse out the environmentVariables array, and insert them into the hash. Then, we execute the script as usual. Note: This does make the assumption that environmentVariables ends with a nullptr, so adjust as needed.
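The splitting step can be tried out without a Perl interpreter. The following is a minimal sketch, assuming each entry is split at its first '=' so that values such as lang=nl in QUERY_STRING keep their own '='. SplitEnvironment is a hypothetical helper name, and skipping entries that have no '=' or an empty name is a choice made here, not something the Perl API requires.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: turns a NULL-terminated array of "NAME=value"
// strings into (name, value) pairs, splitting at the FIRST '='.
std::vector<std::pair<std::string, std::string>>
SplitEnvironment(const char* const* environmentVariables)
{
    std::vector<std::pair<std::string, std::string>> result;
    if (environmentVariables == nullptr)
    {
        return result;
    }

    for (std::size_t i = 0; environmentVariables[i] != nullptr; ++i)
    {
        const std::string entry(environmentVariables[i]);
        const std::size_t separator = entry.find('=');

        // Assumption: entries without '=' or with an empty name are ignored.
        if (separator == std::string::npos || separator == 0)
        {
            continue;
        }

        result.emplace_back(entry.substr(0, separator),
                            entry.substr(separator + 1));
    }

    return result;
}
```

Each resulting pair corresponds to one hv_store call: first is the key and second is the string for the new SV.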

Another option, if you don't want to overwrite existing values in %ENV, is to store these variables in a separate hash such as main::CGI_ENV. The script would then read them as $CGI_ENV{key}.