Tags: python, python-2.7, locking, distributed-lock

Distributed lock manager for Python


I have a bunch of servers with multiple instances accessing a resource that has a hard limit on requests per second.

I need a mechanism to lock the access on this resource for all servers and instances that are running.

There is a restful distributed lock manager I found on github: https://github.com/thefab/restful-distributed-lock-manager

Unfortunately there seems to be a minimum lock time of 1 second, and it's relatively unreliable: in several tests it took between 1 and 3 seconds to release a 1-second lock.

Is there something well tested with a python interface I can use for this purpose?

Edit: I need something that auto unlocks in under 1 second. The lock will never be released in my code.


Solution

  • My first idea was to use Redis. But there are other great tools, some even lighter, so my solution builds on zmq. With it you do not have to run Redis; running a small Python script is enough.

    Requirements Review

    Let me review your requirements before describing the solution.

    Concept

    Limit number of requests within timeslot

    A timeslot can be a second, several seconds, or a shorter interval. The only limitation is the precision of time measurement in Python.

    If your resource has a hard limit defined per second, you should use a timeslot of 1.0.

    Monitoring number of requests per timeslot until next one starts

    With the first request for accessing your resource, set the start time for the next timeslot and initialize the request counter.

    With each request, increase the request counter (for the current timeslot) and allow the request unless you have reached the maximum number of allowed requests in the current timeslot.

    Serve using zmq with REQ/REP

    Your consuming servers could be spread across multiple computers. To provide access to the LockerServer, you will use zmq.

    Sample code

    zmqlocker.py:

    import time
    import zmq

    class Locker():
        def __init__(self, max_requests=1, in_seconds=1.0):
            self.max_requests = max_requests
            self.in_seconds = in_seconds
            self.requests = 0
            now = time.time()
            self.next_slot = now + in_seconds

        def __iter__(self):
            return self

        def next(self):
            now = time.time()
            # if the current timeslot is over, start a new one
            if now > self.next_slot:
                self.requests = 0
                self.next_slot = now + self.in_seconds
            # grant the request only if the slot's quota is not exhausted
            if self.requests < self.max_requests:
                self.requests += 1
                return "go"
            else:
                return "sorry"


    class LockerServer():
        def __init__(self, max_requests=1, in_seconds=1.0, url="tcp://*:7777"):
            locker = Locker(max_requests, in_seconds)
            cnt = zmq.Context()
            sck = cnt.socket(zmq.REP)
            sck.bind(url)
            while True:
                msg = sck.recv()  # request content is ignored, only its arrival matters
                sck.send(locker.next())

    class LockerClient():
        def __init__(self, url="tcp://localhost:7777"):
            cnt = zmq.Context()
            self.sck = cnt.socket(zmq.REQ)
            self.sck.connect(url)

        def next(self):
            self.sck.send("let me go")
            return self.sck.recv()
    

    Run your server:

    run_server.py:

    from zmqlocker import LockerServer
    
    svr = LockerServer(max_requests=5, in_seconds=0.8)
    

    From command line:

    $ python run_server.py
    

    This will start serving the locker service on the default port 7777.

    Run your clients

    run_client.py:

    from zmqlocker import LockerClient
    import time
    
    locker_cli = LockerClient()
    
    for i in xrange(100):
        print time.time(), locker_cli.next()
        time.sleep(0.1)
    

    From command line:

    $ python run_client.py
    

    You should see "go", "go", "sorry"... responses printed.

    Try running more clients.
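If you want a client to block until it gets access instead of handling "sorry" itself, you can wrap the call in a small polling loop. Here is a sketch of such a helper; `wait_for_go` and `retry_interval` are names I made up, and it works with anything that has a `next()` method returning "go" or "sorry", such as the LockerClient above:

```python
import time

def wait_for_go(locker_cli, retry_interval=0.05):
    """Block until the locker grants access, polling again on "sorry"."""
    while True:
        if locker_cli.next() == "go":
            return
        time.sleep(retry_interval)
```

With this, consuming code can simply call `wait_for_go(locker_cli)` before each access to the limited resource.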

    A bit of stress testing

    You may start the clients first and the server later on. The clients will block until the server is up, and then will happily run.

    Conclusions

    You may also find that the limits of your resource are not as predictable as you assume, so be prepared to play with the parameters to find the proper balance, and always be prepared for exceptions from that side.

    There is also some room for optimizing how "locks" are provided - e.g. if the locker runs out of allowed requests but the current timeslot is almost over, you might consider waiting a bit before answering "sorry", and after a fraction of a second provide "go" instead.
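That optimization could look like the following sketch of a modified Locker (the class name and the `grace_window` parameter are my invention, not part of the code above): when the quota is exhausted but the slot ends within `grace_window` seconds, it sleeps across the boundary and grants the request in the new slot.

```python
import time

class GracefulLocker(object):
    """Variant of Locker: instead of refusing when the current slot is
    almost over, wait out the remaining time and grant in the new slot."""

    def __init__(self, max_requests=1, in_seconds=1.0, grace_window=0.1):
        self.max_requests = max_requests
        self.in_seconds = in_seconds
        self.grace_window = grace_window
        self.requests = 0
        self.next_slot = time.time() + in_seconds

    def next(self):
        now = time.time()
        if now > self.next_slot:
            self.requests = 0
            self.next_slot = now + self.in_seconds
        if self.requests < self.max_requests:
            self.requests += 1
            return "go"
        # quota exhausted: if the slot ends soon enough, wait it out
        remaining = self.next_slot - time.time()
        if remaining <= self.grace_window:
            time.sleep(max(remaining, 0))
            self.requests = 1
            self.next_slot = time.time() + self.in_seconds
            return "go"
        return "sorry"
```

Note that sleeping inside the server's REP loop delays all other clients for that fraction of a second, so keep `grace_window` small relative to the timeslot.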

    Extending it to real distributed lock manager

    By "distributed" we might also understand multiple locker servers running together. This is more difficult to do, but it is possible. zmq makes it very easy to connect to multiple urls, so clients could quite easily connect to multiple locker servers. The question is how to coordinate the locker servers so that they do not allow too many requests to your resource. zmq allows inter-server communication. One model could be that each locker server publishes each provided "go" over PUB/SUB. All other locker servers would be subscribed, and would use each received "go" to increase their local request counter (with slightly modified logic).
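The counter side of that model could be sketched like this (the class and method names are hypothetical). In a real deployment, each server would publish on a zmq PUB socket every time its own `next()` returns "go", and a SUB socket listening to the peers would call `note_peer_go()` for every message received:

```python
import time

class CoordinatedLocker(object):
    """Locker variant whose per-slot counter also accounts for "go"
    grants made by peer locker servers (sketch only; in practice
    note_peer_go() would be driven by zmq PUB/SUB messages)."""

    def __init__(self, max_requests=1, in_seconds=1.0):
        self.max_requests = max_requests
        self.in_seconds = in_seconds
        self.requests = 0
        self.next_slot = time.time() + in_seconds

    def _roll_slot(self):
        # start a fresh timeslot if the current one has expired
        now = time.time()
        if now > self.next_slot:
            self.requests = 0
            self.next_slot = now + self.in_seconds

    def note_peer_go(self):
        # a peer server granted a request in the current slot
        self._roll_slot()
        self.requests += 1

    def next(self):
        self._roll_slot()
        if self.requests < self.max_requests:
            self.requests += 1
            return "go"   # in practice: also publish this grant to peers
        return "sorry"
```

Since PUB/SUB delivery takes nonzero time, two servers can still briefly over-grant within the same slot, so `max_requests` should be chosen with a little headroom.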