pythonstringappend

How do I append one string to another in Python?


How do I efficiently append one string to another? Are there any faster alternatives to:

var1 = "foo"
var2 = "bar"
var3 = var1 + var2

For handling multiple strings in a list, see How to concatenate (join) items in a list to a single string.

See How do I put a variable’s value inside a string (interpolate it into the string)? if some inputs are not strings, but the result should still be a string.


Solution

  • If you only have one reference to a string and you concatenate another string to the end, CPython now special cases this and tries to extend the string in place.

    The end result is that the operation is amortized O(n).

    e.g.

    s = ""
    for i in range(n):
        s += str(i)
    

    used to be O(n^2), but now it is O(n).

    More information

    From the source (bytesobject.c):

    void
    PyBytes_ConcatAndDel(register PyObject **pv, register PyObject *w)
    {
        PyBytes_Concat(pv, w);
        Py_XDECREF(w);
    }
    
    
    /* The following function breaks the notion that strings are immutable:
       it changes the size of a string.  We get away with this only if there
       is only one module referencing the object.  You can also think of it
       as creating a new string object and destroying the old one, only
       more efficiently.  In any case, don't use this if the string may
       already be known to some other part of the code...
       Note that if there's not enough memory to resize the string, the original
       string object at *pv is deallocated, *pv is set to NULL, an "out of
       memory" exception is set, and -1 is returned.  Else (on success) 0 is
       returned, and the value in *pv may or may not be the same as on input.
       As always, an extra byte is allocated for a trailing \0 byte (newsize
       does *not* include that), and a trailing \0 byte is stored.
    */
    
    int
    _PyBytes_Resize(PyObject **pv, Py_ssize_t newsize)
    {
        register PyObject *v;
        register PyBytesObject *sv;
        v = *pv;
        if (!PyBytes_Check(v) || Py_REFCNT(v) != 1 || newsize < 0) {
            *pv = 0;
            Py_DECREF(v);
            PyErr_BadInternalCall();
            return -1;
        }
        /* XXX UNREF/NEWREF interface should be more symmetrical */
        _Py_DEC_REFTOTAL;
        _Py_ForgetReference(v);
        *pv = (PyObject *)
            PyObject_REALLOC((char *)v, PyBytesObject_SIZE + newsize);
        if (*pv == NULL) {
            PyObject_Del(v);
            PyErr_NoMemory();
            return -1;
        }
        _Py_NewReference(*pv);
        sv = (PyBytesObject *) *pv;
        Py_SIZE(sv) = newsize;
        sv->ob_sval[newsize] = '\0';
        sv->ob_shash = -1;          /* invalidate cached hash value */
        return 0;
    }
    

    It's easy enough to verify empirically.

    $ python -m timeit -s"s=''" "for i in xrange(10):s+='a'"
    1000000 loops, best of 3: 1.85 usec per loop
    $ python -m timeit -s"s=''" "for i in xrange(100):s+='a'"
    10000 loops, best of 3: 16.8 usec per loop
    $ python -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
    10000 loops, best of 3: 158 usec per loop
    $ python -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
    1000 loops, best of 3: 1.71 msec per loop
    $ python -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
    10 loops, best of 3: 14.6 msec per loop
    $ python -m timeit -s"s=''" "for i in xrange(1000000):s+='a'"
    10 loops, best of 3: 173 msec per loop
    

    It's important however to note that this optimisation isn't part of the Python spec. It's only in the cPython implementation as far as I know. The same empirical testing on pypy or jython for example might show the older O(n**2) performance.

    $ pypy -m timeit -s"s=''" "for i in xrange(10):s+='a'"
    10000 loops, best of 3: 90.8 usec per loop
    $ pypy -m timeit -s"s=''" "for i in xrange(100):s+='a'"
    1000 loops, best of 3: 896 usec per loop
    $ pypy -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
    100 loops, best of 3: 9.03 msec per loop
    $ pypy -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
    10 loops, best of 3: 89.5 msec per loop
    

    So far so good, but then,

    $ pypy -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
    10 loops, best of 3: 12.8 sec per loop
    

    ouch even worse than quadratic. So pypy is doing something that works well with short strings, but performs poorly for larger strings.