
How to check contents of one list with multiple lists


Here is a list created from a YAML file. I want to compare the following contents of each memserver with the other memservers in the list:

  1. Is the rpc_interface the same?
  2. If the rpc_interface is the same, is fam_path the same?
  3. If the rpc_interface is the same, is libfabric_port the same?

The code should be generic i.e., it should work for any number of servers.

provider: sockets
delayed_free_threads: 0
ATL_threads: 0
ATL_queue_size: 1000
ATL_data_size: 1024
Memservers:
  0:
    memory_type: volatile
    fam_path: /dev/shm/vol_path
    rpc_interface: fam5:8793
    libfabric_port: 7500
    if_device: eth0
  1:
    memory_type: volatile
    fam_path: /dev/shm/vol_path
    rpc_interface: fam4:8793
    libfabric_port: 7500
    if_device: eth1
  2:
    memory_type: volatile
    fam_path: /dev/shm/vol_path
    rpc_interface: fam3:8793
    libfabric_port: 7500
    if_device: eth1

Solution

  • For each of your requirements, you should map the value (the one that should not be duplicated) to the machine numbers in which you found it. The following does this for the first requirement only:

    from pathlib import Path
    import ruamel.yaml
    
    host_port = {}
    file_in = Path('fam_memoryserver_config.yaml')
    
    yaml = ruamel.yaml.YAML(typ='safe')  # faster than using yaml.safe_load()
    data = yaml.load(file_in)
    # the per-machine configs are nested under the top-level 'Memservers' key
    for machine_nr, config in data['Memservers'].items():
        host_port.setdefault(config['rpc_interface'], set()).add(machine_nr)
    
    # now check if host_port has any values that have more than one machine_nr
    for hp, machine_nrs in host_port.items():
        if len(machine_nrs) == 1:
            continue
        print(f'found {hp} in machines: {", ".join(str(x) for x in sorted(machine_nrs))}')
    

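    For an input in which machines 1 and 2 both have rpc_interface fam3:8793 (in the sample above every rpc_interface is unique, so there is nothing to report), this gives: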

    found fam3:8793 in machines: 1, 2
    

    You should get your terminology clear, or at least define these in your question. Common terminology will help you find answers yourself more easily (here on SO, or googling).

    You don't have any lists. You have a mapping at the root level of your YAML document, and under the key Memservers a mapping whose values are themselves mappings. Those load to dicts nested within a dict. There are no lists anywhere.

    The above also assumes that the value for the key rpc_interface is what you refer to as ip:port. However, the part before the : in fam5:8793 doesn't look like an IPv4 or an IPv6 address. It looks more like a hostname.

    You also refer to checks between all files, but there are only two files: the YAML input and your source code. And comparing between those doesn't make much sense.
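
The same mapping idea can be extended to the second and third requirements. The following is only a minimal sketch: the function name `shared_interface_report` and the shape of its return value are made up for illustration, and the `memservers` dict is a hand-written stand-in for what `data['Memservers']` would contain after loading. Machine 2 has been altered so that there is a duplicate rpc_interface to report.

```python
from collections import defaultdict

# Hypothetical stand-in for data['Memservers'] after loading the YAML.
# Machine 2 is modified (rpc_interface and libfabric_port) to create a duplicate.
memservers = {
    0: {'fam_path': '/dev/shm/vol_path', 'rpc_interface': 'fam5:8793', 'libfabric_port': 7500},
    1: {'fam_path': '/dev/shm/vol_path', 'rpc_interface': 'fam4:8793', 'libfabric_port': 7500},
    2: {'fam_path': '/dev/shm/vol_path', 'rpc_interface': 'fam4:8793', 'libfabric_port': 7501},
}


def shared_interface_report(memservers, keys=('fam_path', 'libfabric_port')):
    """Group machines by rpc_interface and, for every group of two or more,
    record whether each key in `keys` has the same value on all of them."""
    groups = defaultdict(list)
    for machine_nr, config in memservers.items():
        groups[config['rpc_interface']].append(machine_nr)

    report = {}
    for interface, machine_nrs in groups.items():
        if len(machine_nrs) < 2:
            continue  # this rpc_interface is unique, nothing to compare
        same = {}
        for key in keys:
            # collect the distinct values of this key across the group
            values = {memservers[nr][key] for nr in machine_nrs}
            same[key] = len(values) == 1
        report[interface] = {'machines': sorted(machine_nrs), 'same': same}
    return report


report = shared_interface_report(memservers)
for interface, info in report.items():
    print(interface, info['machines'], info['same'])
```

Because the grouping runs over whatever keys the mapping contains, this works for any number of servers. With real input you would pass `data['Memservers']` instead of the hand-written dict.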