python compression zip python-2.6

Why can't Python execute a zip archive passed via stdin?


I have a zip archive, archive.zip, containing a __main__.py file.

I can execute it with

python archive.zip
=> OK !

but not with

cat archive.zip | python
=> File "<stdin>", line 1
SyntaxError: Non-ASCII character '\x9e' in file <stdin> on line 2,
but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Why is there a difference between the two modes, and is there a way to make the pipe work without unzipping outside of Python?

I receive this archive over the network and want to execute it as soon as I receive it, as fast as possible, so I thought that piping the zip into Python would work.


Solution

  • The reason you can run 'python file.zip' but not 'cat file.zip | python' is that Python has 'zipimport' built in: when you run Python against a file (or try to import one), zipimport takes a crack at it as part of the import process. (See the zipimport module documentation for details.)

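    With stdin, however, Python makes no attempt to inspect the streamed data, because it could be anything: user input to be handled by code, or code itself. There is no way to know, so Python doesn't really try; it simply treats stdin as Python source, which is why the zip bytes produce the SyntaxError above. A zip archive also keeps its central directory at the end of the file, so reading one requires seeking, which a pipe does not allow.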
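To make the file-path case concrete, here is a minimal sketch of roughly what `python archive.zip` does, using `runpy.run_path` from the standard library. Note the assumptions: `run_path` exists only in Python 2.7 and 3.2+ (so it is newer than the 2.6 in the question), the code is written for Python 3, and `make_archive`, `run_archive`, and the archive contents are made up for illustration.

```python
import os
import runpy
import tempfile
import zipfile


def make_archive(directory):
    """Create a tiny archive.zip whose __main__.py sets one variable."""
    path = os.path.join(directory, "archive.zip")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("__main__.py", "result = 6 * 7\n")
    return path


def run_archive(path):
    """Run the archive's __main__.py, much as `python archive.zip` would."""
    # run_path() sees that `path` is a zip file, puts it on sys.path for the
    # duration of the call, and executes the __main__ module found inside.
    return runpy.run_path(path, run_name="__main__")


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    namespace = run_archive(make_archive(workdir))
    print(namespace["result"])
```

The key point is that everything here hinges on a path: the archive has to exist as a seekable file before the import machinery can look inside it, which is exactly what a pipe cannot provide.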

    Edit:

    Occasionally, when you're answering questions, you think 'I really shouldn't tell someone the answer'. Not because you wish to be secretive or hold some power over them, but because the path they're going down isn't the right one and you want to help them out of the hole they're digging. This is one of those situations. However, against my better judgement, here's an extremely hacky way of accomplishing something similar to what you want. It's not the best way; in fact, it's probably the worst way to do it.

    I played around with the zipimporter for a while and tried all the tricks I could think of. I looked at 'imp' and 'compile' as well. As far as I can see, nothing can import a zipped module (or egg) from memory. So an interim step is needed.

    I'll say this up front: I'm embarrassed to even be posting this. Don't show it to people you work with or people you respect, because they will laugh at this terrible solution.

    Here's what I did:

    mkdir foo
    echo "print 'this is foo!'" >>foo/__init__.py
    zip foo.zip -r foo
    rm -rf foo                   # to ensure it doesn't get loaded from the filesystem
    mv foo.zip somethingelse.zip # To ensure it doesn't get zipimported from the filesystem
    

    And then I ran the script below (saved as script.py) with:

    cat somethingelse.zip | python script.py

    where script.py is:

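    #!/usr/bin/python
    # Python 2, as in the question.

    import os
    import sys
    import tempfile
    import zipimport


    class SinEater(object):
        def __init__(self):
            # zipimport needs a real file on disk, so spool stdin to a temp file.
            fd, tmp = tempfile.mkstemp(suffix='.zip')
            f = os.fdopen(fd, 'wb')    # binary mode: a zip archive is not text
            f.write(sys.stdin.read())  # read to EOF instead of capping at 64 KB
            f.close()
            try:
                z = zipimport.zipimporter(tmp)
                z.load_module('foo')   # importing foo runs foo/__init__.py
            except zipimport.ZipImportError as e:
                # Report the failure instead of silently swallowing it.
                sys.stderr.write('could not import foo: %s\n' % e)


    if __name__ == '__main__':
        print 'herp derp'
        s = SinEater()
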

    Produces:

    herp derp
    this is foo!
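A variant of the same hack that skips the temporary file, offered only as a sketch: unlike zipimport, the `zipfile` module accepts any seekable file object, so the piped bytes can be wrapped in `io.BytesIO` and `__main__.py` executed directly. This is not an import, so other modules inside the archive will not be importable from it. The code assumes Python 3, and `run_main_from_bytes` is a made-up name.

```python
import io
import zipfile


def run_main_from_bytes(data):
    """Execute __main__.py from a zip archive held entirely in memory."""
    # ZipFile needs a seekable file object; BytesIO supplies one for the bytes.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        source = archive.read("__main__.py")
    namespace = {"__name__": "__main__"}
    # compile() gives tracebacks a recognisable (fake) file name.
    exec(compile(source, "<stdin-zip>/__main__.py", "exec"), namespace)
    return namespace


if __name__ == "__main__":
    # Demo with an archive built in memory. For the pipe case, pass
    # sys.stdin.buffer.read() instead of buf.getvalue().
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("__main__.py", "print('hello from the archive')\n")
    run_main_from_bytes(buf.getvalue())
```

The whole archive still has to be read before anything runs, because the zip directory sits at the end of the data, so this does not start executing any sooner than the temp-file version.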
    

    A solution about a million times better than this would be a filesystem notification (inotify, kevent, whatever Windows uses) that watches a directory for new zip files. When a new zip file is dropped into that directory, you could automatically zipimport it. But I cannot stress enough that even that solution is terrible. I don't know much about Ansible (anything, really), but I cannot imagine any engineer thinking it would be a good way to handle code updates or remote control.
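The directory-watching idea could be sketched as follows. This uses plain polling with `os.listdir` as a stand-in for inotify or kevent, which need platform-specific or third-party bindings. The code assumes Python 3, the function names and the `job.zip` archive are hypothetical, and `runpy.run_path` is used to run each archive's `__main__.py`.

```python
import os
import runpy
import tempfile
import zipfile


def find_new_archives(directory, seen):
    """Return paths of .zip files in `directory` not yet recorded in `seen`."""
    fresh = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.endswith(".zip") and path not in seen:
            seen.add(path)
            fresh.append(path)
    return fresh


def run_new_archives(directory, seen):
    """Run __main__.py from each newly arrived archive; return their globals."""
    return [runpy.run_path(path, run_name="__main__")
            for path in find_new_archives(directory, seen)]


if __name__ == "__main__":
    inbox = tempfile.mkdtemp()  # hypothetical drop directory
    seen = set()
    with zipfile.ZipFile(os.path.join(inbox, "job.zip"), "w") as zf:
        zf.writestr("__main__.py", "print('job ran')\n")
    run_new_archives(inbox, seen)  # first pass runs job.zip
    run_new_archives(inbox, seen)  # second pass finds nothing new
```

A real watcher would call `run_new_archives` in a loop with a sleep between passes, and the sender would need to write each archive under a temporary name and rename it into place, so that a half-written file is never picked up.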