python lmdb

How to correctly insert duplicate keys in lmdb?


According to the documentation (https://lmdb.readthedocs.org/en/release/), passing dupsort=True to open_db() should allow duplicate keys to be stored in an lmdb database. That doesn't seem to be the case: each put still overwrites the previous value, unless I'm reading the documentation wrong.

import lmdb
from os import path

env = lmdb.open(path.join(directory, 'lmdb'), map_size=map_size)
db = env.open_db(dupsort=True)

with env.begin(db=db, write=True) as transaction:
    transaction.put(b'mykey', b'value1')
    transaction.put(b'mykey', b'value2')
    transaction.put(b'mykey', b'value3')

However, when I iterate over the key/value pairs, only the last value, "value3", is shown.

cursor = transaction.cursor()
for key, value in cursor.iternext(True, True):
    print(key, value)

iternext_dup() doesn't print the expected values either. I also tried cursor.next(), which returns True only once, and transaction.stat() shows entries: 1.


Solution

  • I found out what was wrong. The documentation wasn't very clear, but it seems dupsort has no effect on the default (unnamed) database; you need to create a named database by passing a name to open_db().

    The flags of the default database are all false, and there is no way to change their persistent state, so dupsort cannot be enabled for the default database. Named databases also require max_dbs to be set when opening the environment.

    E.g.:

    env = lmdb.open(path, max_dbs=2)
    # env.open_db(dupsort=True) alone targets the default database, where dupsort doesn't work
    db = env.open_db(b'db2', dupsort=True)  # the database name must be bytes on Python 3
    ...