php python-3.x pandas matplotlib

Possible import error for pandas and matplotlib when calling Python3 via PHP shell_exec


I am experiencing strange behaviour when trying to execute two different Python 3 scripts via PHP's shell_exec() function. The first Python script is called from PHP as follows:

$jsondatash = escapeshellarg($jsondata);
// Execute the python script with the JSON data
$resultpy = shell_exec("/usr/bin/python3 /var/www/testsite/py/script1.py 2>&1 $jsondatash");

I escape the JSON data (the result is $jsondatash) and run a Python script (script1.py) on it via shell_exec(). The output of the Python script is stored in the PHP variable $resultpy. The Python test script looks like this:

from collections import OrderedDict
import sys, json
import scipy
import scipy.cluster.hierarchy as sch
import pandas as pd
import matplotlib


# Load the data that PHP sent
try:
     data = json.loads(sys.argv[1], object_pairs_hook=OrderedDict) 

except (ValueError, TypeError, IndexError, KeyError) as e:
     print (json.dumps({'error': str(e)}))
     sys.exit(1)

print (json.dumps(data))

It just reads the data and sends it back to PHP via the shell. However, I get an empty object back, which suggests that the Python 3 code is not running to completion. This is the Apache2 error.log:

PHP Fatal error: Uncaught TypeError: array_keys(): Argument #1 ($array) must be of type array, null given 

There is no Python-specific error information, only the PHP error caused by the empty result. When I comment out both pandas and matplotlib from the Python test script

from collections import OrderedDict
import sys, json
import scipy
import scipy.cluster.hierarchy as sch
#import pandas as pd
#import matplotlib


# Load the data that PHP sent
try:
     data = json.loads(sys.argv[1], object_pairs_hook=OrderedDict) 

except (ValueError, TypeError, IndexError, KeyError) as e:
     print (json.dumps({'error': str(e)}))
     sys.exit(1)

print (json.dumps(data))

I can successfully run the script via shell_exec(). So there must be some sort of loading error for both Python 3 libraries when the Python script is executed via PHP.
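A failed import raises before the try/except around json.loads is ever reached, so the script never prints its own JSON error. One way to make such failures visible to the caller is to import the suspect modules by name and report any exception as JSON. The following is only a minimal sketch: the function name import_report and the output keys are hypothetical, not part of the original scripts.

```python
import importlib
import json


def import_report(module_names):
    """Try to import each module; return the names that loaded and the errors."""
    loaded, failed = [], {}
    for name in module_names:
        try:
            importlib.import_module(name)
            loaded.append(name)
        except Exception as exc:  # usually ImportError, but report anything
            failed[name] = f"{type(exc).__name__}: {exc}"
    return {"loaded": loaded, "failed": failed}


if __name__ == "__main__":
    # Always prints valid JSON, so the calling side can decode the result
    # and inspect "failed" instead of receiving an unparseable traceback.
    print(json.dumps(import_report(["scipy", "pandas", "matplotlib"])))
```

Run through shell_exec(), this should return a JSON object whose "failed" entry names the modules that could not be imported, together with the exception text.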

I see the same behaviour when I write the JSON data to a file in PHP (inputfile.json), read that file in Python, write the data to another file in Python (outputfile.json), and read this file back into PHP. This is the PHP code:

$jsondata = fopen('/var/www/testsite/files/inputfile.json', 'w');
fwrite($jsondata, $data);
fclose($jsondata);    

shell_exec("/usr/bin/python3 /var/www/testsite/py/script2.py 2>&1");
$resultpy = file_get_contents('/var/www/testsite/files/outputfile.json');

This is the corresponding script2.py Python code:

from collections import OrderedDict
import sys, json
import scipy
import scipy.cluster.hierarchy as sch
import pandas as pd
import matplotlib

# Load the data from json file
try:
    with open('/var/www/testsite/files/inputfile.json', 'r') as inputfile:
        data = json.load(inputfile, object_pairs_hook=OrderedDict)
except (ValueError, TypeError, IndexError, KeyError) as e:
    print(json.dumps({'error': str(e)}))
    sys.exit(1)

# write data to json outputfile
with open('/var/www/testsite/files/outputfile.json', 'w') as outputfile:
    json.dump(data, outputfile)

When executing the Python code directly in shell

/usr/bin/python3 /var/www/testsite/py/script2.py 2>&1

I get the expected output, but when executing the Python code from PHP using shell_exec(), the output file outputfile.json is empty. Again, if I remove both pandas and matplotlib from the Python script, it also runs successfully via PHP shell_exec().

These tests show the following: (1) Both Python scripts can be executed via PHP shell_exec(), but only if I remove the pandas and matplotlib imports from the Python scripts. (2) Loading pandas and matplotlib is not a problem per se, since I can execute script2.py including both libraries directly in the shell and get the expected output.

Why can I run the exact same Python code on the Apache2 server directly in the shell, but not via shell_exec() in PHP? From my tests, it looks like pandas and matplotlib fail to load and terminate the Python scripts.

Edit: Python3 version is 3.8.10, PHP version is 8.2.8, and this is the output after installing pandas and matplotlib:

pip install pandas
> Successfully installed pandas-2.0.3 python-dateutil-2.8.2 pytz-2023.3 tzdata-2023.3
pip install matplotlib
> Successfully installed contourpy-1.1.0 cycler-0.11.0 fonttools-4.40.0 importlib-resources-6.0.0
> kiwisolver-1.4.4 matplotlib-3.7.2 packaging-23.1 pillow-10.0.0 pyparsing-3.0.9 zipp-3.15.0

Edit2: I was wondering whether scipy, which can be loaded, and pandas, which fails to load, are located in the same path, and indeed they are:

Name: scipy
Version: 1.10.1
Location: /home/ubuntu/.local/lib/python3.8/site-packages

Name: pandas
Version: 1.5.3
Location: /home/ubuntu/.local/lib/python3.8/site-packages

I tested an older version of pandas (v.1.5.3 - see above), and have now installed the current version v.2.0.3. Both terminate the python script when imported. All additional packages required by pandas are installed.

Name: pandas
Version: 2.0.3
Location: /home/ubuntu/.local/lib/python3.8/site-packages

Python 3 sys.path is:

['', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/home/ubuntu/.local/lib/python3.8/site-packages', '/usr/local/lib/python3.8/dist-packages', '/usr/lib/python3/dist-packages']
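Since sys.path, the user site-packages directory, and the effective user can all differ between an interactive shell and a process started by the web server, it may help to print the same report from both contexts and compare them. The sketch below is only an illustration: the helper names locate and environment_report are hypothetical, and it uses importlib.util.find_spec to show where a module would be loaded from without actually importing it.

```python
import getpass
import importlib.util
import json
import site
import sys


def locate(module_name):
    """Return the file a top-level module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def current_user():
    """Best-effort user name; None if it cannot be determined."""
    try:
        return getpass.getuser()
    except Exception:
        return None


def environment_report(module_names):
    """Collect the facts that decide which copy of a package gets imported."""
    return {
        "executable": sys.executable,
        "user": current_user(),
        "user_site_enabled": site.ENABLE_USER_SITE,
        "user_site": site.getusersitepackages(),
        "sys_path": list(sys.path),
        "modules": {name: locate(name) for name in module_names},
    }


if __name__ == "__main__":
    print(json.dumps(environment_report(["scipy", "pandas", "matplotlib"]), indent=2))
```

If the "modules" entries or "sys_path" differ between the shell run and the PHP run, the two contexts are resolving the packages from different directories.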

Edit3: I found that the following method (https://stackoverflow.com/a/44128762/8008652) allows me to retrieve the Python 3 error in PHP and print it in the html output:

exec('/usr/bin/python3 /var/www/testsite/py/script2.py 2>&1', $output, $return_var);
if ($return_var > 0) {
    var_dump($output);
}

This is the Python 3 error:

array(14) {
  [0]=> string(34) "Traceback (most recent call last):"
  [1]=> string(77) "  File "/var/www/testsite/py/script2.py", line 6, in <module>"
  [2]=> string(23) "    import pandas as pd"
  [3]=> string(80) "  File "/usr/lib/python3/dist-packages/pandas/__init__.py", line 55, in <module>"
  [4]=> string(33) "    from pandas.core.api import ("
  [5]=> string(79) "  File "/usr/lib/python3/dist-packages/pandas/core/api.py", line 5, in <module>"
  [6]=> string(44) "    from pandas.core.arrays.integer import ("
  [7]=> string(92) "  File "/usr/lib/python3/dist-packages/pandas/core/arrays/__init__.py", line 10, in <module>"
  [8]=> string(53) "    from .interval import IntervalArray  # noqa: F401"
  [9]=> string(92) "  File "/usr/lib/python3/dist-packages/pandas/core/arrays/interval.py", line 38, in <module>"
  [10]=> string(60) "    from pandas.core.indexes.base import Index, ensure_index"
  [11]=> string(89) "  File "/usr/lib/python3/dist-packages/pandas/core/indexes/base.py", line 74, in <module>"
  [12]=> string(49) "    from pandas.core.strings import StringMethods"
  [13]=> string(139) "ImportError: cannot import name 'StringMethods' from 'pandas.core.strings' (/usr/lib/python3/dist-packages/pandas/core/strings/__init__.py)"
}

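As I suspected from my previous tests, it's an import error related to the pandas library (ImportError: cannot import name 'StringMethods' from 'pandas.core.strings'). Note that the traceback shows pandas being loaded from /usr/lib/python3/dist-packages, not from the /home/ubuntu/.local/lib/python3.8/site-packages location reported above.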


Solution

  • I found that Python libraries were installed in two different paths on the server: one was '/usr/local/lib/python3.8/dist-packages', and the other was '/usr/lib/python3/dist-packages'.

    When I executed the scripts directly from the command line on the server, Python loaded the pandas package from '/usr/local/lib/python3.8/dist-packages', whereas when called from PHP via shell_exec() it loaded pandas from '/usr/lib/python3/dist-packages'. The same Python 3 script therefore imported different versions of pandas depending on whether it was run from the shell or from PHP.

    I removed Python 3 and both package directories, then reinstalled Python 3 along with the pandas and matplotlib packages, and that solved the problem for me.
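Before reinstalling, a situation like this can be confirmed by checking whether more than one directory on the search path contains a copy of the package. The following is a rough sketch under simplifying assumptions: the function name find_installations is hypothetical, and it only looks for regular packages and single-file modules in plain directories (it ignores zip archives, namespace packages, and custom import hooks).

```python
import os
import sys


def find_installations(package_name, search_paths=None):
    """List every directory on the search path that contains the package."""
    if search_paths is None:
        search_paths = sys.path
    hits = []
    for entry in search_paths:
        base = entry or os.getcwd()  # '' in sys.path means the current directory
        candidates = (
            os.path.join(base, package_name, "__init__.py"),  # regular package
            os.path.join(base, package_name + ".py"),  # single-file module
        )
        if any(os.path.isfile(c) for c in candidates) and base not in hits:
            hits.append(base)
    return hits


if __name__ == "__main__":
    for name in ("pandas", "matplotlib"):
        print(name, find_installations(name))
```

With the default path-based import machinery, directories are searched in order, so the first hit is normally the copy that gets imported. More than one hit for pandas, with a different order or a different set of directories between the shell and the PHP context, would match the behaviour described above.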