My goal is to have a Python script ("s1.py"), running locally, initiate a second Python script ("s2.py") via ssh on a remote machine. As part of starting s2.py, s1.py would deliver a Python dictionary which s2.py would read into memory and subsequently refer to as part of the work it has to do. The dictionary comes from a local file which I want to remain local for security reasons, hence the idea of sending the data over the wire and straight into memory. (I could, for example, copy the file to a tmpfs space on the remote machine, read it, and immediately shred the tmpfs; this would all happen pretty quickly, but I still see it as bad practice if I can send the data without having to send the file.)
The approach I've been taking is based on the assumption that I can pass this dictionary, via ssh, as a Python object from s1.py to s2.py in the form of a command-line argument, alongside some other command-line flags that are required. Something like:
ssh user@remote python3 s2.py python-dict-obj flag1 flag2
S2.py would then read in the object (preferably as a dictionary) in some way or other, e.g.
useful_dict = sys.argv[1]
var1_flag1 = sys.argv[2]
var2_flag2 = sys.argv[3]
As I mentioned, the dictionary comes from a local file which can equally be stored as YAML or JSON (YAML is preferable). Therefore s1.py can create the dictionary using the standard json module, or PyYAML:
with open("local_file", "r", encoding="utf-8") as read_file:
dict_from_file = json.load(read_file)
with open("local_file", 'r') as stream:
dict_from_file = yaml.safe_load(stream)
S1.py then attempts, via ssh, to initiate s2.py and send it the dictionary, for example using the subprocess module:
subprocess.run(['ssh', 'user@remote', 'python3', '/user/scripts/s2.py', 'dict_from_file', 'flag1', 'flag2'])
Does the problem arise because I'm trying to send a dictionary over ssh? Can "dict_from_file" survive as a Python object passing through the subprocess module and being handed to the environment/shell of the remote machine? Is the same true when using Paramiko or Fabric?
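For what it's worth, a stripped-down local test (no ssh involved) suggests the object itself can never make it through, because subprocess insists on string arguments:

import json
import subprocess

d = {"key": "value"}

# Passing the dict object itself fails:
# subprocess.run(['echo', d])
# TypeError: expected str, bytes or os.PathLike object, not dict

# Serialized first, it goes through fine; the child only ever sees a string.
subprocess.run(['echo', json.dumps(d)])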
To date, anything I have tried seems to have s2.py rejecting the data. This would seem to suggest that it may simply not be possible to pass a dictionary in the way I want to. To be fair, I haven't tried anything more adventurous than having s2.py accept inputs at the command line using the sys module so perhaps there is a way to ingest the data in a different way.
Even if it's not possible to transmit an actual dictionary object over ssh, surely the raw data can be sent in JSON format? A JSON string (json.dumps?) perhaps, although I don't have experience with this? Then, presumably, s2.py can read the string and reconstruct a dictionary?
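In other words, something like this round trip is what I imagine, with json.dumps on my side and json.loads inside s2.py:

import json

dict_from_file = {"flag": True, "items": [1, 2, 3]}  # stand-in for the real file contents

wire = json.dumps(dict_from_file)   # s1.py: dict -> JSON string
rebuilt = json.loads(wire)          # s2.py: JSON string -> dict again
assert rebuilt == dict_from_file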
Or is my approach just too naive? Do I need to explore sockets/pipes/multiprocessing to achieve what I'm trying to do? On a basic level I'm just trying to perform some simple (I think) data transmission, so I'm hoping that there's a brief, elegant way of doing this. Any and all suggestions are welcome.
This works for me. However, there is a caveat: this will NOT work if the JSON contains any strings with single quotes. Fixing that, unfortunately, leads to a maze of twisty passages.
import sys
import json
import subprocess

if len(sys.argv) > 1:
    # Remote side: the JSON text arrives as the first argument.
    x = json.loads(sys.argv[1])
    print('SUBPROCESS')
    print(sys.argv[1], flush=True)
    print(x)
else:
    # Local side: load the dict and ship it as a single quoted argument.
    # The surrounding single quotes keep the remote shell from splitting
    # the JSON on spaces; they are also why embedded single quotes break it.
    with open('x.json') as f:
        s = json.load(f)
    cmd = ['ssh', 'timr@localhost', 'python', __file__, "'" + json.dumps(s) + "'"]
    print("MAIN")
    subprocess.run(cmd)
Output:
MAIN
SUBPROCESS
{"id": "123ab", "operation": "foo", "metadata": {"source": "ddb", "destination": "s3"}, "data": "02-02-2024"}
{'id': '123ab', 'operation': 'foo', 'metadata': {'source': 'ddb', 'destination': 's3'}, 'data': '02-02-2024'}
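As for the single-quote caveat: the hand-rolled quoting above is what shlex.quote in the standard library does properly, so one exit from the maze (a sketch I haven't run end-to-end, assuming the remote login shell is POSIX) is to let it do the escaping:

import sys
import json
import shlex
import subprocess

if len(sys.argv) > 1:
    print(json.loads(sys.argv[1]))
else:
    with open('x.json') as f:
        s = json.load(f)
    # shlex.quote wraps the string for a POSIX shell and escapes any
    # embedded single quotes, which the manual "'"+...+"'" trick cannot.
    cmd = ['ssh', 'timr@localhost', 'python', __file__, shlex.quote(json.dumps(s))]
    subprocess.run(cmd)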
If you DO have more complicated input, you can feed your text in via stdin:
import sys
import json
import subprocess

if len(sys.argv) > 1:
    # Remote side: read the JSON from stdin instead of argv,
    # so no shell quoting is involved at all.
    print('SUBPROCESS')
    x = sys.stdin.read()
    print('file', x)
    y = json.loads(x)
    print('json', y)
else:
    # Local side: pipe the JSON into ssh via subprocess's input= argument.
    # The dummy 'x' argument just tells the remote copy it is the subprocess.
    with open('x.json') as f:
        s = json.load(f)
    cmd = ['ssh', 'timr@localhost', 'python', __file__, 'x']
    print("MAIN")
    subprocess.run(cmd, input=json.dumps(s).encode())
Output:
SUBPROCESS
file {"id": "123ab", "operation": "foo", "metadata": {"source": "ddb", "destination": "s3"}, "data": "02-02-2024"}
json {'id': '123ab', 'operation': 'foo', 'metadata': {'source': 'ddb', 'destination': 's3'}, 'data': '02-02-2024'}
MAIN
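Since the question also mentions Paramiko: the stdin variant carries over directly. A minimal sketch, untested here, assuming key-based authentication is already set up and reusing the host and path placeholders from the question:

import json
import paramiko

with open('x.json') as f:
    s = json.load(f)

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect('remote', username='user')

# exec_command exposes the remote process's stdin/stdout as file-like objects.
stdin, stdout, stderr = client.exec_command('python3 /user/scripts/s2.py flag1 flag2')
stdin.write(json.dumps(s))
stdin.channel.shutdown_write()  # send EOF so the remote sys.stdin.read() returns

print(stdout.read().decode())
client.close()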