python python-2.7 dictionary

How to split a dictionary into multiple dictionaries fast


I have found a solution but it is really slow:

def chunks(self,data, SIZE=10000):
    for i in xrange(0, len(data), SIZE):
        yield dict(data.items()[i:i+SIZE])

Do you have any ideas that don't use external modules (numpy, etc.)?


Solution

  • Since the dictionary is so big, it is better to keep everything lazy with iterators and generators instead of building a full list of items for every chunk (which is what data.items()[i:i+SIZE] does in Python 2), like this:

    from itertools import islice
    
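    def chunks(data, SIZE=10000):
        # One shared iterator over the keys; each islice() call resumes where the previous one stopped.
        it = iter(data)
        for i in range(0, len(data), SIZE):
            # Build one chunk from the next SIZE keys (fewer for the last chunk).
            yield {k:data[k] for k in islice(it, SIZE)}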
    

    Sample run:

    for item in chunks({i:i for i in range(10)}, 3):
        print(item)
    

    Output (key order within each dict is arbitrary in Python 2.7):

    {0: 0, 1: 1, 2: 2}
    {3: 3, 4: 4, 5: 5}
    {8: 8, 6: 6, 7: 7}
    {9: 9}
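
A possible variant of the same idea, offered only as a sketch: iterate over the (key, value) pairs directly so that each chunk is built without a second lookup per key. The name `chunks_items` and the `hasattr` check are illustrative assumptions, not part of the answer above; the check is there so the same code runs on Python 2 (lazy `iteritems()`) and Python 3 (`items()` view). Whether it is actually faster would need to be measured on the real data.

```python
from itertools import islice

def chunks_items(data, size=10000):
    """Yield successive dicts holding at most `size` items of `data`."""
    # Assumption: use the lazy iteritems() on Python 2, items() on Python 3.
    pairs = data.iteritems() if hasattr(data, "iteritems") else data.items()
    it = iter(pairs)
    while True:
        # dict() consumes the next `size` (key, value) pairs from the shared iterator.
        chunk = dict(islice(it, size))
        if not chunk:
            # Iterator exhausted: stop the generator.
            return
        yield chunk

if __name__ == "__main__":
    for item in chunks_items({i: i for i in range(10)}, 3):
        print(item)
```

Like the generator above, this never materialises the full list of items, so memory use per step stays proportional to the chunk size.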