Collecting multiply return values in a recursive function call (automaton nfa processing)

I'm writing an NFA (nondeterministic finite automaton) class, which should parse a given input and returns all possible traces (paths from the initial to the final state). Ultimately, I want to calculate the degree of ambiguity of a given automaton.

Unfortunately, I'm not able to collect the return of the method properly. This version of the code returns None and a slightly modified one using yield returns only the one, the first, path.

This question seems somewhat vague, but I hope someone can give me a hint in the right direction.

Thanks in advance.

class NFA(object):
    __slots__ = [
        'states',
        'alphabet',
        'transitions',
        'initial_state',
        'final_states',
    ]

    #
    #
    #  -------- Init -----------
    #
    def __init__(
            self,
            states: set,
            alphabet: set,
            transitions: dict,
            initial_state: str,
            final_states: set,
    ):
        """
            Initialize a complete automaton.
        """

        self.states = states
        self.alphabet = alphabet
        self.transitions = transitions
        self.initial_state = initial_state
        self.final_states = final_states

    #
    #
    #  -------- process -----------
    #
    def process(self, word: str, trace: list = []) -> list:
        """
        Check if the given string is accepted by this automaton.
        Return all accepting paths.
        """

        # processing finished returning the trace
        if (not word):
            print(trace)
            return trace

        # start processing with initial state
        if (not trace):
            trace.append(self.initial_state)

        # get the current state transitions
        state_transition: dict = self.transitions.get(trace[-1], None)

        # input not accepted
        if (not state_transition):
            return False

        # iterate over each possible transition
        for state in state_transition.get(word[0], []):

            # create new sub trace, append current state
            sub_trace: list = trace.copy()
            sub_trace.append(state)

            # start recursive function call
            self.process(word[1:], trace=sub_trace)

from automata.nfa import NFA

config: dict = {
    'states': ['q0', 'q1', 'q2'],
    'alphabet': ['a'],
    'transitions': {
        'q0': {
            'a': ['q1', 'q2']
        },
        'q1': {
            'a': ['q1']
        }
    },
    'initial_state': 'q0',
    'final_states': ['q1'],
}

testNFA = NFA(**config)

assert testNFA.process("a") == [['q0', 'q1'], ['q0', 'q2']]

Solution

It sounds like you're asking for a basic DFS, essentially. I think this design is confusing a few things.

If you return something from a recursive call, the data only goes back to its immediate caller and needs to be passed all the way back up the call chain to the original scope. If you don't want to use a generator, you probably need some sort of master result list parameter that you append a copy of trace to when you hit a base case where word has been consumed and the NFA is in an accepting state. An inner function can be useful for this to avoid lots of default arguments exposed to the caller.

A likely better approach is to use a generator which obviates dealing with managing a result list and passing return values around and lets the caller control how to consume the result.

You mention you attempted a generator version, but recursive generators use a yield from to be applied to the recursive call.

With that approach in mind, the code you show doesn't take into account final_states, so your expected output seems incorrect based on the problem description of "paths from the initial to the final state". I'd anticipate [['q0', 'q1']] to be the desired output since "q2" is not an accepting state. But deciding between the two is trivial as you'll see in my code snippet below.

As an implementation detail, slicing a string per recursive call is O(n). You can use an index to keep track of your location in a single string shared among all recursive call frames to avoid this cost. Similarly, copying the trace list on every call is slower than keeping one list to push/pop from and only copying it upon yield.

Be careful about the mutable default argument.

Here's a minimal, complete example:

from collections import namedtuple

def find_accepting_paths(nfa, word, trace=None):
    trace = trace if trace else [nfa.initial_state]

    if word:
        for state in nfa.transitions.get(trace[-1], {}).get(word[0], []):
            trace.append(state)
            yield from find_accepting_paths(nfa, word[1:], trace)
            trace.pop()
    elif trace[-1] in nfa.final_states: # or use `else` if you're sure you
                                        # don't care about accepting states
        yield trace[:]

if __name__ == "__main__":
    NFA = namedtuple("NFA", "transitions initial_state final_states")
    nfa = NFA(
        transitions={
            "q0": {
                "a": ["q1", "q2"]
            },
            "q1": {
                "a": ["q1"]
            }
        },
        initial_state="q0",
        final_states=["q1"],
    )
    print(list(find_accepting_paths(nfa, "a"))) # => [['q0', 'q1']]