c++ python-3.x popen

Why does a command run with popen() in C++ stop after downloading only about 11 GB of a 20 GB file?


I am trying to run an external command from C++ using popen(). The command downloads a file from a server using Python sockets. After it has downloaded approximately 11640016384 bytes, the child process terminates and exits.

The original data is approximately 20 GB in size.

Please see the C++ code below:

#include <cstdio>   // popen()
#include <cstdlib>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    cout<< popen("getdata -g <dataName>", "r") << endl;
} 

Upon Execution

userID@Hostname$ g++ execute_system_popen.cpp -o execute_system_popen
userID@Hostname$ ./execute_system_popen 
0x15aeeb0
userID@Hostname$ ls -Traceback (most recent call last):
  File "<string>", line 1, in <module>
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='ANSI_X3.4-1968'>
BrokenPipeError: [Errno 32] Broken pipe
l fileName.tar 
-rw-r-----. 1 userID xxx 11640016384 Jan 26 23:04 fileName.tar
userID@Hostname$ 

However, if I do the same using system(), the file is downloaded completely.

#include <cstdlib>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    cout<< system("getdata -g <dataName>") << endl;
}

Solution

  • During the initial analysis of the automation scripts, we found multiple places where getdata is triggered from C++, some using system() and some using popen().

    While system() worked as expected, with popen() the child process terminated at around 11 GB.

    Further analysis pointed to the pipe that popen() creates between the two processes: its buffer has a limited size, and with mode "r" everything getdata prints to standard output goes into it.

    Root Cause:

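    The frequent print statements (one every 50 MB) from getdata were being written to that pipe, and nothing on the C++ side ever read from it. A full pipe on its own only blocks the writer. The BrokenPipeError in the output shows that the write failed because the read end of the pipe was already closed, which happens once the calling program exits without reading the stream or calling pclose(). With system(), the child inherits the caller's standard output and the caller waits for it to finish, so the problem does not occur.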

    Solution:

    To resolve the issue, the number of print statements inside getdata was reduced. With far fewer print statements, much less output is written to the pipe, and the download completes successfully.
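
An alternative on the C++ side, instead of changing getdata, would be to keep the read end of the pipe open and drained until the child exits. The following is only a minimal sketch under some assumptions: a POSIX system, a caller that is allowed to block until the command finishes, and a helper name (run_and_drain) that is hypothetical and not part of the original scripts.

```cpp
#include <cstdio>     // popen(), pclose() (POSIX), std::fread
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original scripts): runs `command`
// through popen(), reads everything the child writes to its standard output
// so the read end of the pipe stays open, then waits for the child
// with pclose(). Returns the raw status value from pclose().
int run_and_drain(const std::string& command, std::string* output = nullptr)
{
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) {
        throw std::runtime_error("popen() failed");
    }

    char buffer[4096];
    std::size_t n;
    // fread() returns 0 at end of file, i.e. once the child has closed
    // its end of the pipe (normally when it exits).
    while ((n = std::fread(buffer, 1, sizeof buffer, pipe)) > 0) {
        if (output != nullptr) {
            output->append(buffer, n);  // keep the output only if requested
        }
    }

    // pclose() blocks until the child terminates.
    return pclose(pipe);
}
```

Calling run_and_drain("getdata -g <dataName>") in place of the bare popen() call should keep the parent alive until the download has finished, much as system() does, while still giving the caller access to the command's output if it passes a string to collect it.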