
What is the difference between pickle and shelve?


When is it appropriate to use pickle, and when is it appropriate to use shelve? That is to say, what do they do differently from each other?

From my research, I understand that pickle can turn a Python object into a stream of bytes that can be persisted to a file. Then why do we need shelve as well? Isn't pickle faster?


Solution

  • pickle is for serializing some object (or objects) as a single bytestream in a file.

    shelve builds on top of pickle and implements a persistent, dictionary-like store: each object is pickled and associated with a key (a string), so you can open your shelf file and retrieve individual pickled objects by key. This is more convenient when you are serializing many objects.

    Here is an example of each (these should work in recent versions of Python 2.7 and Python 3.x).

    pickle Example

    import pickle
    
    integers = [1, 2, 3, 4, 5]
    
    with open('pickle-example.p', 'wb') as pfile:
        pickle.dump(integers, pfile)
    

    This will dump the integers list to a binary file called pickle-example.p.

    Now try reading the pickled file back.

    import pickle
    
    with open('pickle-example.p', 'rb') as pfile:
        integers = pickle.load(pfile)
        print(integers)
    

    The above should output [1, 2, 3, 4, 5].

    shelve Example

    import shelve
    
    integers = [1, 2, 3, 4, 5]
    
    # If you're using Python 2.7, import contextlib and use
    # the line:
    # with contextlib.closing(shelve.open('shelf-example', 'c')) as shelf:
    with shelve.open('shelf-example', 'c') as shelf:
        shelf['ints'] = integers
    

    Notice how you add objects to the shelf via dictionary-like access.

    Read the object back in with code like the following:

    import shelve
    
    # If you're using Python 2.7, import contextlib and use
    # the line:
    # with contextlib.closing(shelve.open('shelf-example', 'r')) as shelf:
    with shelve.open('shelf-example', 'r') as shelf:
        for key in shelf.keys():
            print(repr(key), repr(shelf[key]))
    

    In Python 3, the output will be 'ints' [1, 2, 3, 4, 5].
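To make the practical difference concrete, here is a minimal sketch (Python 3 only) that stores the same three objects both ways. The `records` dictionary, the file names, and the use of a temporary directory are assumptions made for illustration; they are not part of the examples above. With pickle, the whole dictionary is one bytestream, so you load all of it to read one entry. With shelve, each value is pickled under its own key, so a single entry can be read or replaced on its own.

```python
import os
import pickle
import shelve
import tempfile

# Hypothetical sample data, used only for this illustration.
records = {"ints": [1, 2, 3, 4, 5], "name": "example", "point": (1.5, 2.5)}

with tempfile.TemporaryDirectory() as tmpdir:
    pickle_path = os.path.join(tmpdir, "records.p")
    shelf_path = os.path.join(tmpdir, "records-shelf")

    # pickle: the whole dictionary is written as one bytestream.
    with open(pickle_path, "wb") as pfile:
        pickle.dump(records, pfile)

    # To read a single entry, the entire object has to be unpickled first.
    with open(pickle_path, "rb") as pfile:
        loaded = pickle.load(pfile)
    point_from_pickle = loaded["point"]

    # shelve: each value is pickled separately and stored under its key.
    with shelve.open(shelf_path, "c") as shelf:
        for key, value in records.items():
            shelf[key] = value

    # Replace one entry without touching the others.
    with shelve.open(shelf_path, "w") as shelf:
        shelf["name"] = "renamed"

    # Look up individual entries by key.
    with shelve.open(shelf_path, "r") as shelf:
        point_from_shelf = shelf["point"]
        name_from_shelf = shelf["name"]
        shelf_keys = sorted(shelf.keys())

print(point_from_pickle, point_from_shelf, name_from_shelf, shelf_keys)
```

As a rough rule of thumb, pickle is the simpler choice for a single object or a small group saved and loaded together, while shelve fits better when you have many objects and want to fetch or update them one at a time.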