python serialization redis child-process valkey

An object with a Redis client as one of its attributes can't be serialized. How can I modify it for serialization without delaying the connection?


When writing a multi-process program for web scraping, I passed an instance of my custom RedisQueue class as an argument to the task function that runs in a child process. This class holds a Redis client as one of its attributes, and when I ran the program, the following error occurred: "TypeError: can't pickle StrictRedis objects".

The RedisQueue class I wrote is as follows (in Python):

from typing import Optional

from redis import StrictRedis


class RedisQueue:
    def __init__(self, host: str = "localhost", port: int = 6379,
                 password: Optional[str] = None, db: int = 0,
                 client: Optional[StrictRedis] = None, name_in_redis: str = "rq"):
        # Use the supplied client if there is one; otherwise create a new one.
        if client is None:
            self.client = StrictRedis(host=host, port=port, password=password, db=db)
        else:
            self.client = client
        self.queue_name = name_in_redis

    def __len__(self):
        return self.client.llen(self.queue_name)

    def clear(self):
        self.client.delete(self.queue_name)

    def push(self, *values: str):
        # Push on the left, pop from the right: first in, first out.
        return self.client.lpush(self.queue_name, *values)

    def pop(self):
        return self.client.rpop(self.queue_name)

    def close(self):
        if self.client is not None:
            self.client.close()

What I tried

I have tried creating a new instance of the RedisQueue class inside the task function to work around this, but that complicates parameter passing: I have to pass nearly every parameter needed to create the Redis object.

What I expected

I hope to rewrite the RedisQueue class so that its instances can be directly passed as parameters to the task function, eliminating the need to retrieve the required parameters and recreate the instance in the child process.


Solution

  • Clearly, self.client is the trouble here.

    You need to define something new, perhaps RedisProtoQueue, which knows how to create a Valkey client but does not itself hold an open TCP connection. Give it the connection settings (user, password, host), but not the open socket. Once it has been deserialized in the destination process or on some destination host, it will be able to connect to Valkey as needed.

    If you wish to serialize an object that has already opened a TCP connection, set that attribute to None before serializing, since None is easily serialized. Then re-create the TCP connection on the far side, after deserializing.
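Both suggestions above can be combined in one class. What follows is only a minimal sketch, not a definitive implementation. It assumes the connection settings are plain picklable values, and it keeps the question's class name rather than introducing RedisProtoQueue. The `_conn_kwargs` and `_client` attributes are names made up for this illustration. The `client` constructor parameter from the question is left out, because an externally supplied client carries no settings to rebuild it from.

```python
import pickle
from typing import Any, Dict, Optional


class RedisQueue:
    """Queue wrapper that pickles its connection settings, never the client."""

    def __init__(self, host: str = "localhost", port: int = 6379,
                 password: Optional[str] = None, db: int = 0,
                 name_in_redis: str = "rq"):
        # Plain, picklable settings: enough to rebuild the client anywhere.
        self._conn_kwargs: Dict[str, Any] = {
            "host": host, "port": port, "password": password, "db": db,
        }
        self.queue_name = name_in_redis
        self._client = None  # created on first use, separately in each process

    @property
    def client(self):
        if self._client is None:
            # Imported here only so the pickling demo below runs without
            # redis-py installed; a module-level import works just as well.
            from redis import StrictRedis
            self._client = StrictRedis(**self._conn_kwargs)
        return self._client

    def __getstate__(self) -> Dict[str, Any]:
        # Copy the instance dict and drop the client before pickling.
        state = self.__dict__.copy()
        state["_client"] = None
        return state

    def __setstate__(self, state: Dict[str, Any]) -> None:
        # Restore the settings; the client is re-created on first use.
        self.__dict__.update(state)

    def __len__(self) -> int:
        return self.client.llen(self.queue_name)

    def clear(self) -> None:
        self.client.delete(self.queue_name)

    def push(self, *values: str):
        return self.client.lpush(self.queue_name, *values)

    def pop(self):
        return self.client.rpop(self.queue_name)

    def close(self) -> None:
        if self._client is not None:
            self._client.close()
            self._client = None


if __name__ == "__main__":
    # The round trip touches no Redis server: only the settings travel.
    queue = RedisQueue(name_in_redis="jobs")
    clone = pickle.loads(pickle.dumps(queue))
    print(clone.queue_name, clone._conn_kwargs)
```

With this in place, an instance can be passed straight to a child process, for example `multiprocessing.Process(target=worker, args=(queue,))`, where `worker` is a hypothetical task function. Each process then opens its own connection the first time it touches `queue.client`.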