
Redis-py Read LOWEST_LATENCY or NEAREST


Redis-py Version: 4.2.0

How can we configure redis-py to read from the LOWEST_LATENCY node? We are using AWS Global Datastore, so we want redis-py to read from the nearest of the geo-distributed nodes.


Solution

  • I think you can't.

    According to this Amazon article, the Python library integrated the cluster support originally implemented in this project, so users no longer need a third-party library for cluster support. However, at the time of writing there is no support for a lowest-latency mode. The supported modes are PRIMARIES, REPLICAS, ALL_NODES and RANDOM. Some of these have counterparts in the Java library (Lettuce); REPLICAS, for example, is, I think, ANY_REPLICA in Lettuce.

    The LOWEST_LATENCY mode is implemented in the Java library. It does not seem to be a parameter supported or provided by AWS and, from what I could see, it is not present in the Python library.

    /**
     * Setting to read from the node with the lowest latency during topology discovery. Note that latency measurements are
     * momentary snapshots that can change in rapid succession. Requires dynamic refresh sources to obtain topologies and
     * latencies from all nodes in the cluster.
     * @since 6.1.7
     */
    public static final ReadFrom LOWEST_LATENCY = new ReadFromImpl.ReadFromLowestCommandLatency();
    

    This setting was apparently called NEAREST before, which might mean it was originally linked to geospatial proximity (but this is just a guess):

    /**
     * Setting to read from the node with the lowest latency during topology discovery. Note that latency measurements are
     * momentary snapshots that can change in rapid succession. Requires dynamic refresh sources to obtain topologies and
     * latencies from all nodes in the cluster.
     * @deprecated since 6.1.7 as we're renaming this setting to {@link #LOWEST_LATENCY} for more clarity what this setting
     *             actually represents.
     */
    @Deprecated
    public static final ReadFrom NEAREST = LOWEST_LATENCY;
    

    Here is the definition of ReadFromLowestCommandLatency. Its comment has important information about the latency measurements:

    /**
     * Read from the node with the lowest latency during topology discovery. Note that latency measurements are momentary
     * snapshots that can change in rapid succession. Requires dynamic refresh sources to obtain topologies and latencies from
     * all nodes in the cluster.
     */
    static final class ReadFromLowestCommandLatency
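
    Because those measurements are momentary snapshots, a Python client that did its own selection would have to re-measure from time to time. This is only a rough sketch of that idea, not something redis-py provides: LowestLatencyPicker, measure, max_age and clock are all hypothetical names, and measure is assumed to be a callable you supply that returns a latency in seconds for a node.

```python
import time
from typing import Callable, Hashable, Optional, Sequence


class LowestLatencyPicker:
    """Hypothetical helper: cache the lowest-latency node, re-measure when stale."""

    def __init__(
        self,
        nodes: Sequence[Hashable],
        measure: Callable[[Hashable], float],  # assumed: returns seconds for a node
        max_age: float = 30.0,                 # seconds before a snapshot is stale
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._measure = measure
        self._max_age = max_age
        self._clock = clock
        self._best: Optional[Hashable] = None
        self._measured_at: Optional[float] = None

    def pick(self) -> Hashable:
        """Return the cached best node, re-measuring every node if the snapshot is stale."""
        now = self._clock()
        stale = self._measured_at is None or now - self._measured_at >= self._max_age
        if stale:
            # Calls measure() once per node and keeps the node with the smallest result.
            self._best = min(self._nodes, key=self._measure)
            self._measured_at = now
        return self._best
```

    The max_age value is a trade-off: a short interval follows latency changes more closely but spends more time probing.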
    

    All these ReadFrom implementations appear to rely on getNodes(), which returns the nodes sorted by latency. That sorting is implemented in the Java library itself; the Python implementation has no equivalent.

    // Returns the list of nodes that are applicable for the read operation. The list is ordered by latency.
    List<RedisNodeDescription> getNodes();
    

    I did not perform a full code analysis, but for example this method in TopologyComparators.java seems to confirm this:

    /**
     * Sort partitions by latency.
     * @param clusterNodes
     * @return List containing {@link RedisClusterNode}s ordered by latency
     */
    public static List<RedisClusterNode> sortByLatency(Iterable<RedisClusterNode> clusterNodes) {
        List<RedisClusterNode> ordered = LettuceLists.newList(clusterNodes);
        ordered.sort(LatencyComparator.INSTANCE);
        return ordered;
    }
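
    If you wanted something similar on the Python side, you would have to do the measuring and sorting yourself. Here is a minimal sketch of what that could look like. It is not part of redis-py: Node, measure_latency, sort_by_latency and probe are hypothetical names, and probe is assumed to be any callable that makes one round trip to a node (for example a PING through a client connected to that node).

```python
import time
from typing import Callable, Iterable, List, Tuple

Node = Tuple[str, int]  # hypothetical node representation: (host, port)


def measure_latency(probe: Callable[[Node], object], node: Node) -> float:
    """Return the wall-clock seconds that one probe call against `node` takes."""
    start = time.perf_counter()
    try:
        probe(node)
    except Exception:
        # Treat unreachable nodes as infinitely slow so they sort last.
        return float("inf")
    return time.perf_counter() - start


def sort_by_latency(nodes: Iterable[Node], probe: Callable[[Node], object]) -> List[Node]:
    """Return `nodes` ordered from lowest to highest measured latency."""
    timed = [(measure_latency(probe, node), node) for node in nodes]
    timed.sort(key=lambda pair: pair[0])
    return [node for _, node in timed]
```

    The first element of the returned list would then be the node to read from. A single probe per node is as much a momentary snapshot as the Lettuce measurement, so the result should not be cached for long.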
    

    Sorry if you already know some of this, but as I took a look I thought I would post it as an answer.