
Win32 child process not flushing stdout until exit


I'm starting to write an LSP extension for VSCode for my language in C (on Windows). I'm having trouble with getting my extremely simple C program to flush stdout when spawned as a child process (VSCode does this). When I run the same program directly from the command line, it works as expected. I have a simple Python script that simulates the LSP protocol and I see the same behavior here as I do when running under VSCode.

Here is the simplest C program I can use to simulate the error. (I had a C++ program that behaved the same way as well with std::cout and friends):

#include <windows.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    char buffer[1024];
    // Infinite loop to continuously listen for LSP requests
    // without the loop it works fine. 
    while (1) {

        if (fgets(buffer, 1024, stdin) == NULL) {
            break;
        }

        const char* contentOffset = buffer + 16; // skip the header name (16 is the length of "Content-Length: ")

        char* pEnd;
        size_t len = strtoll(contentOffset, &pEnd, 10) + 1;

        fgets(buffer, 1024, stdin); // read empty line

        // Read the JSON body based on Content-Length
        fgets(buffer, (int)len, stdin);

        // Process the incoming JSON body (simple echo back for testing)
        // Echo the same JSON body back with the appropriate LSP header
        printf("Content-Length: %d\r\n\r\n%s", (int32_t)len - 1, buffer); // I've also tried fprintf and the windows apis as well, same effect
        // Flush stdout to ensure the message is sent immediately (this seems to have no effect)
        fflush(stdout);
    }
    return 0;
}

And here is a simple Python script to simulate the LSP:

import subprocess
import time

# Start the C program
process = subprocess.Popen(
    ["path/to/exe"],  
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,  # Ensure text mode
)

# Function to send LSP request and read response
def send_lsp_request(json_request):
    content_length = len(json_request)

    # Send the LSP-style request
    lsp_message = f"Content-Length: {content_length}\r\n\r\n{json_request}"
    process.stdin.write(lsp_message)
    process.stdin.flush()

    # Allow some time for the C program to respond
    time.sleep(0.1)

    # Read and print the response from stdout
    response = process.stdout.read(1024)
    return response

# Example LSP request
json_request = '{"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}'
print('sent')
# Send a request to the C LSP server and print the response
response = send_lsp_request(json_request)
print("Response from C program:\n", response)

I expect the program to behave the same under both conditions. Instead, when run with the Python script, it prints sent and then hangs forever waiting for the process to put something in stdout, which never happens.

If I remove the while loop in the C program, I get the expected output, but I can't actually do that because the program needs to continue listening for more messages.

I have no idea why this happens. I've read something about handles being inherited, but I don't control what VSCode does with that, and I'd be surprised if the stdio transport supplied by VSCode didn't do the right thing for this. I don't have other hardware to tell if this is a Windows-only issue.


Solution

  • The bug is in your Python program. process.stdout.read(1024) waits for 1024 bytes to arrive (or EOF), performing multiple calls to the OS if needed.

    I believe this is the relevant documentation.

    read(size=-1, /)

    Read and return up to size bytes. If the argument is omitted, None, or negative, data is read and returned until EOF is reached. An empty bytes object is returned if the stream is already at EOF.

    If the argument is positive, and the underlying raw stream is not interactive, multiple raw reads may be issued to satisfy the byte count (unless EOF is reached first). But for interactive raw streams, at most one raw read will be issued, and a short result does not imply that EOF is imminent.

    A BlockingIOError is raised if the underlying raw stream is in nonblocking-mode, and has no data available at the moment.

    (Emphasis mine; the relevant part is "multiple raw reads may be issued to satisfy the byte count".)

    And here's a demonstration:

    1. Python3's stdout is block-buffered unless connected to a terminal.

      $ python3 -c 'import time; print( "hi" ); time.sleep( 3 )' |
         cat
      <3 second pause due to lack of flushing>
      hi
      
    2. But we can flush a handle (or disable its buffering).

      $ python3 -c 'import time; import sys; print( "hi" ); sys.stdout.flush(); time.sleep( 3 )' |
         cat
      <no pause>
      hi
      <3 second pause while cat waits for program to finish>
      
    3. Since the output isn't buffered, the pause from the following means read received the text and it's waiting for more.

      $ python3 -c 'import time; import sys; print( "hi" ); sys.stdout.flush(); time.sleep( 3 )' |
         python3 -c 'import sys; print( sys.stdin.read( 10 ), end="" );'
      <3 second pause due to read waiting for more bytes>
      hi
      
    4. We can also observe the pause going away if we reduce the requested number of chars to what's actually produced.

      $ python3 -c 'import time; import sys; print( "hi" ); sys.stdout.flush(); time.sleep( 3 )' |
         python3 -c 'import sys; print( sys.stdin.read( 3 ), end="" );'
      <no pause>
      hi
      <3 second pause while shell waits for pipeline to finish>
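
The same effect can be sketched inside a single Python process. This is only an illustration and not part of the demonstration above: an OS pipe stands in for the child's stdout, and the writer deliberately keeps its end open. On a buffered binary reader, read1() issues at most one raw read and returns whatever is available, while read(10) on the same pipe would keep waiting for more bytes or EOF.

```python
import os

# An OS pipe stands in for the child's stdout (an assumption for illustration).
read_fd, write_fd = os.pipe()
reader = os.fdopen(read_fd, "rb")  # buffered binary reader

# The "child" writes 3 bytes and keeps its end of the pipe open.
os.write(write_fd, b"hi\n")

# read1() performs at most one raw read, so it returns the 3 available bytes.
# reader.read(10) at this point would block until 7 more bytes or EOF arrive.
chunk = reader.read1(10)
print(chunk)  # prints b'hi\n'

os.close(write_fd)
reader.close()
```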
      

    Sorry, I don't know Python well enough to provide a solution.
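
That said, one possible direction is to stop asking for a fixed 1024 bytes and instead read the header first, then request exactly the number of bytes the header announces. The following is a minimal sketch, not a tested fix for the script in the question. It assumes the pipe is opened in binary mode (that is, Popen without universal_newlines=True), since Content-Length counts bytes. The name read_lsp_message is hypothetical, and an io.BytesIO object stands in for process.stdout.

```python
import io


def read_lsp_message(stream):
    """Read one Content-Length framed message from a binary stream.

    Hypothetical helper: `stream` is assumed to behave like process.stdout
    opened in binary mode.
    """
    content_length = None
    # Header section: one header per line, terminated by an empty line.
    while True:
        line = stream.readline()
        if not line:
            raise EOFError("stream closed before the header was complete")
        line = line.strip()
        if not line:
            break  # blank line: end of headers
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip())
    if content_length is None:
        raise ValueError("missing Content-Length header")
    # Request exactly the announced size, so read() has no reason to wait
    # for bytes the server is never going to send.
    body = stream.read(content_length)
    if len(body) != content_length:
        raise EOFError("stream closed before the body was complete")
    return body


# Stand-in for process.stdout, holding two framed messages back to back.
body = b'{"jsonrpc": "2.0", "id": 1, "result": {}}'
framed = b"Content-Length: %d\r\n\r\n" % len(body) + body
fake_stdout = io.BytesIO(framed + framed)

first = read_lsp_message(fake_stdout)
second = read_lsp_message(fake_stdout)
print(first.decode("utf-8"))
```

With a real child process the call would presumably be read_lsp_message(process.stdout). Because each call consumes exactly one message, it can be repeated in a loop without the time.sleep() guess.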