python-3.x postgresql stdin csvkit

How to pass string via STDIN into terminal command being executed within python script?


I need to generate a Postgres schema from a dataframe. I found that the csvkit library comes closest to matching datatypes. I can run csvkit from the terminal and generate a Postgres schema from a CSV file on my desktop with this command, found in the docs:

csvsql -i postgresql myFile.csv 

csvkit docs - https://csvkit.readthedocs.io/en/stable/scripts/csvsql.html

And I can run the terminal command in my script via this code:

import os
a=os.popen("csvsql -i postgresql Desktop/myFile.csv").read()

However, I have a dataframe that I have converted to a CSV string, and I need to generate the schema from that string:

csvstr = df.to_csv()

In the docs it says that under positional arguments:

The CSV file(s) to operate on. If omitted, will accept
                        input on STDIN

How do I pass my variable csvstr into the line of code a=os.popen("csvsql -i postgresql csvstr").read() as a variable?

I tried the line below but got the error OSError: [Errno 7] Argument list too long: '/bin/sh':

a=os.popen("csvsql -i postgresql {}".format(csvstr)).read()

Thank you in advance


Solution

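  • You can't pass such a big string as a command-line argument: the operating system limits the total size of the argument list, which is what Errno 7 reports. One option is to save the data to a file and pass its path to csvsql.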

    csvstr = df.to_csv()
    # csvstr is already CSV text, so write it to the file as-is.
    # (csv.writer.writerows(csvstr) would treat each character as a row.)
    with open('my_cool_df.csv', 'w', newline='') as csvfile:
        csvfile.write(csvstr)
    

    And later:

    import os

    # .read() collects the schema that csvsql prints to STDOUT
    a = os.popen("csvsql -i postgresql my_cool_df.csv").read()
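
Since the docs quoted in the question say csvsql accepts input on STDIN when no file is given, another route is to skip the temporary file and feed the string to the process directly. The following is a minimal sketch, assuming csvsql is installed and on the PATH; run_with_stdin and postgres_schema are hypothetical helper names used for illustration, not part of csvkit.

```python
import subprocess


def run_with_stdin(cmd, text):
    """Run cmd (a list of arguments), send text to its STDIN, return its STDOUT."""
    result = subprocess.run(
        cmd,
        input=text,           # written to the child process's STDIN
        capture_output=True,  # collect STDOUT and STDERR
        text=True,            # work with str instead of bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


def postgres_schema(csv_text):
    # Hypothetical wrapper: no file argument is passed, so csvsql is
    # expected to read the CSV from STDIN, as the quoted docs describe.
    return run_with_stdin(["csvsql", "-i", "postgresql"], csv_text)
```

With the dataframe from the question, this would be called as a = postgres_schema(csvstr). Passing the command as a list avoids going through /bin/sh, and the data never appears in the argument list, so its size should not trigger Errno 7.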