python memory subprocess stdout

Measuring maximum memory while capturing stdout in Python using subprocess


Is there a clean way to measure the maximum memory consumption of a subprocess while still capturing stdout (and ideally setting a timeout) using subprocess in Python?

Capturing output and setting a timeout can easily be done using functions of subprocess:

output = subprocess.run(cmd, capture_output=True, timeout=100)

Measuring the maximum allocated memory seems to require polling, e.g. using psutil as in this example: Subprocess memory usage in python. But then capturing stdout would also have to be implemented by hand, which gets messy quickly (subprocess starts a new thread for reading stdout).


Solution

  • You can measure the maximum memory consumption of a subprocess and capture stdout by combining subprocess and psutil.

    Use subprocess.Popen to start the subprocess. This way you can interact with the subprocess as it runs. Periodically check the memory usage with psutil.

    Here's an example implementation I have from a previous project:

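    import subprocess
    import threading
    import time

    import psutil

    def run_command_with_memory_tracking(cmd, timeout=None, poll_interval=0.1):
        process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        try:
            psutil_process = psutil.Process(process.pid)
        except psutil.NoSuchProcess:
            psutil_process = None  # the child is already gone; nothing to measure

        captured = {}

        # Drain both pipes in a background thread. Without this, a child that
        # writes more than the pipe buffer holds would block forever while we poll.
        def drain():
            captured["stdout"], captured["stderr"] = process.communicate()

        reader = threading.Thread(target=drain, daemon=True)
        reader.start()

        max_memory = 0
        start_time = time.monotonic()

        # Polling loop: runs until communicate() returns, i.e. the child has exited
        while reader.is_alive():
            try:
                # Resident set size of the child itself (not its descendants), in bytes
                if psutil_process is not None:
                    max_memory = max(max_memory, psutil_process.memory_info().rss)
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                pass  # the child exited between the liveness check and the measurement

            if timeout is not None and (time.monotonic() - start_time) > timeout:
                process.kill()
                process.wait()  # reap the killed child
                raise TimeoutError(f"Command '{' '.join(cmd)}' timed out after {timeout} seconds")

            reader.join(poll_interval)  # sleep, but wake early once the child finishes

        return {
            "max_memory": max_memory,
            "stdout": captured.get("stdout", ""),
            "stderr": captured.get("stderr", ""),
        }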
    
    # Test
    cmd = ["echo", "Hello, World!"]
    result = run_command_with_memory_tracking(cmd, timeout=10)
    
    print(f"Max memory used: {result['max_memory']} bytes")
    print(f"Output:\n{result['stdout']}")
    print(f"Errors:\n{result['stderr']}")
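
On Unix, a non-polling variant may be worth considering. The sketch below is not part of the answer above: it assumes a Unix platform (the standard-library resource module is unavailable on Windows) and uses a hypothetical helper name, run_with_peak_rss. It lets subprocess.run handle output capture and the timeout, then reads the peak resident set size that the kernel recorded for waited-for children. One caveat: ru_maxrss for RUSAGE_CHILDREN is a high-water mark over every child the calling process has waited for so far, so the value is only specific to this command if no larger child ran earlier in the same parent.

```python
import resource
import subprocess
import sys


def run_with_peak_rss(cmd, timeout=None):
    """Run cmd, capture its output, and report the peak RSS of waited-for children.

    Hypothetical helper for illustration; Unix only.
    """
    # subprocess.run captures stdout/stderr and raises subprocess.TimeoutExpired
    # (after killing the child) if the timeout is exceeded.
    completed = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)

    # High-water mark over all children this process has waited for so far,
    # not only the one that just finished.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss

    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    peak_bytes = peak if sys.platform == "darwin" else peak * 1024

    return {
        "max_memory": peak_bytes,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "returncode": completed.returncode,
    }


# Usage: run a short Python child so the example does not depend on a shell.
result = run_with_peak_rss([sys.executable, "-c", "print('Hello, World!')"], timeout=10)
print(f"Peak RSS of children so far: {result['max_memory']} bytes")
print(f"Output:\n{result['stdout']}")
```

Compared with polling, this cannot miss a short memory spike between samples, but it gives no per-command isolation within one parent process. If that matters, running each measurement from a fresh parent process is one way around it.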