pythonsortingnetworkxhierarchicaltopological-sort

How to combine hierarchical lists into one while respecting the hierarchy of each one of them using networkx and python?


For a medical application I want to combine several lists into one list. The items in each list represent actions to be taken for a specific disease. For instance the list Malaria would look like that:

  1. 'Give Malaria medication'
  2. 'Give antibiotics'
  3. 'Tell patient to come back tomorrow'
  4. 'Give a warm lemon tea'

A second list for Bacterial sore throat would be this:

  1. 'Give antibiotics'
  2. 'Give paracetamol'
  3. 'Give a warm lemon tea'
  4. 'Warm the patient'

The actions in each list are hierarchical. That means that the more important tasks are mentioned first.

The lists themselves have a hierarchy as well, so Malaria is higher than Bacterial sore throat.

I need to combine those lists into a global one, that has the items sorted in a way that both hierarchies are preserved. This would be the case that a patient has both Malaria AND Bacterial sore throat and receives treatment according to the importance of each action.

For this case I would want this list:

  1. 'Give Malaria medication' (from Malaria list because it is higher than sore throat)
  2. 'Give antibiotics' (covers an action from both lists)
  3. 'Tell patient to come back tomorrow' (from Malaria list)
  4. 'Give Paracetamol' (more important than warm tea as seen in sore throat)
  5. 'Give a warm lemon tea' (covers an action from both lists)
  6. 'Warm the patient'

Currently this sorting is done by hand, but it becomes too complex with 50+ diseases. I have looked into networkx trying to do a topological sort but I guess this is not the right approach.

It becomes more complex when sorting is not possible and the actions are in reverse order. For instance in Diarrhoea there are

while in Severe Diarrhoea these are in reverse order

In this case I want the solution to double an item in the global list to

Is there a way to solve at least one of those steps?


Solution

  • I agree with @Corralien that once you have a DAG, topological sort is still the way to go. You just have to break the ties between elements that are at the same hierarchy level correctly. This can be done with nx.lexicographical_topological_sort.

    import networkx as nx
    
    malaria = ['give malaria medication', 'give antibiotics', 'tell patient to come back tomorrow', 'give warm lemon tea']
    bacterial_sore_throat = ['give antibiotics', 'give paracetamol', 'give warm lemon tea', 'warm patient']
    
    edges = list(zip(malaria[:-1], malaria[1:])) + list(zip(bacterial_sore_throat[:-1], bacterial_sore_throat[1:]))
    graph = nx.from_edgelist(edges, nx.DiGraph)
    
    def sort_function(item):
        if item in malaria:
            return '01'
        elif item in bacterial_sore_throat:
            return '02'
        else:
            raise ValueError
    
    print(list(nx.lexicographical_topological_sort(graph, sort_function)))
    
    # ['give malaria medication', 'give antibiotics', 'tell patient to come back tomorrow', 'give paracetamol', 'give warm lemon tea', 'warm patient']