regexautomatadfa

Design DFA accepting binary strings divisible by a number 'n'


I need to learn how to design a DFA such that given any number 'n', it accepts binary strings {0, 1} whose decimal equivalent number is divisible by 'n'.

There will be different DFAs for different 'n', but can somebody give a basic approach that I should follow to proceed with any number 0 < n < 10 .


Solution

  • Below I have written an answer for n equals to 5, but you can apply the same approach to design DFAs for any value of n and any positional number system, like binary, ternary, etc.

    First the definition of 'Complete DFA': A DFA defined on a complete domain in δ: Q × Σ → Q is called 'Complete DFA'. In other words, in a transition diagram of a complete DFA there is no missing edge (i.e. from each state in Q there is one outgoing edge for every language symbol in Σ). Note: Sometimes we define a partial DFA as δ ⊆ Q × Σ→Q (Read: How does “δ:Q × Σ→Q” read in the definition of a DFA).

    Design a DFA accepting binary integers divisible by 'n':

    Step-1: When you divide an integer ω by n then the remainder can be either 0, 1, ..., (n - 2) or (n - 1). If the remainder is 0 then ω is divisible by n, otherwise not. So, in my DFA there will be a state qr that would correspond to a remainder value r, where 0 <= r <= (n - 1), and the total number of states in the DFA is n.
    If after processing a number string ω over Σ, the end state is qr, this implies that ω % n => r (% is the remainder operator).

    In any automaton, the purpose of a state is like a memory element. A state in an automaton stores some information, like a fan's switch that can tell whether the fan is in 'off' or in 'on' state. For n = 5, five states in the DFA correspond to five possible remainders, as follows:

    1. State q0 is reached if the remainder is 0. State q0 is the final state (accepting state). It is also an initial state.
    2. State q1 is reached if the remainder is 1, a non-final state.
    3. State q2 if the remainder is 2, a non-final state.
    4. State q3 if the remainder is 3, a non-final state.
    5. State q4 if the remainder is 4, a non-final state.

    Using above information, we can start drawing the transition diagram TD of five states as follows:

    fig-1
    Figure-1

    So, 5 states for 5 remainder values. If after processing a string ω the end-state becomes q0, then that means the integer represented by the input string is divisible by 5. In the above figure q0 is marked as final state using two concentric circles.
    Additionally, I have defined a transition rule δ:(q0, 0)→q0 as a self loop for symbol '0' at state q0, this is because the integer represented by any string that consist of only '0' is 0, and 0 is a divisible by n.

    Step-2: the TD above is incomplete and can only process strings of '0's. Now we add some more edges so that it can process subsequent number's strings. Check the table below: it shows new transition rules that can be added as a next step:

    Number Binary Remainder(%5) End-state
    One 1 1 q1
    Two 10 2 q2
    Three 11 3 q3
    Four 100 4 q4
    1. To process binary string '1' there should be a transition rule δ:(q0, 1)→q1
    2. Two: the binary representation is '10'; the end-state should be q2, and to process '10', we just need to add one more transition rule δ:(q1, 0)→q2
      Path: →(q0)─1→(q1)─0→(q2)
    3. Three: in binary it is '11'; the end-state is q3, and we need to add a transition rule δ:(q1, 1)→q3
      Path: →(q0)─1→(q1)─1→(q3)
    4. Four: in binary '100'; the end-state is q4. The TD already processes prefix string '10' and we just need to add a new transition rule δ:(q2, 0)→q4
      Path: →(q0)─1→(q1)─0→(q2)─0→(q4)

    fig-2 Figure-2

    Step-3: Five = 101
    The transition diagram in figure-2 is still incomplete and there are many missing edges, for example, no transition is defined for δ:(q2, 1)-?. And it needs the rules to process strings like '101'.
    Because '101' = 5 is divisible by 5, and to accept '101' I will add δ:(q2, 1)→q0 in above figure-2.
    Path: →(q0)─1→(q1)─0→(q2)─1→(q0)
    with this new rule, the transition diagram becomes:

    fig-3 Figure-3

    In each step below I pick the next subsequent integer to add a missing edge until I get a TD that is a 'complete DFA'.

    Step-4: Six = 110.

    We can already process '11' in the TD in figure-3 as: →(q0)─11→(q3) ─0→(?). Because 6 % 5 = 1 this means we need to add one rule δ:(q3, 0)→q1.

    fig-4 Figure-4

    Step-5: Seven = 111

    Number Binary Remainder(%5) End-state Path Add
    Seven 111 7 % 5 = 2 q2 q0─11→q3 q3─1→q2

    fig-5 Figure-5

    Step-6: Eight = 1000

    Number Binary Remainder(%5) End-state Path Add
    Eight 1000 8 % 5 = 3 q3 q0─100→q4 q4─0→q3

    fig-6 Figure-6

    Step-7: Nine = 1001

    Number Binary Remainder(%5) End-state Path Add
    Nine 1001 9 % 5 = 4 q4 q0─100→q4 q4─1→q4

    fig-7 Figure-7

    In TD-7, the total number of edges is 10 == Q × Σ = 5 × 2. And it is a complete DFA that can accept all possible binary strings whose numeric equivalent is divisible by 5.

    Design a DFA accepting ternary integers divisible by n:

    We take again as example n = 5.

    Step-1 Exactly the same as for binary, use figure-1.

    Step-2 Add Zero, One, Two

    Number Ternary Remainder(%5) End-state Add
    Zero 0 0 q0 δ:(q0,0)→q0
    One 1 1 q1 δ:(q0,1)→q1
    Two 2 2 q2 δ:(q0,2)→q3

    fig-8
    Figure-8

    Step-3 Add Three, Four, Five

    Number Ternary Remainder(%5) End-state Add
    Three 10 3 q3 δ:(q1,0)→q3
    Four 11 4 q4 δ:(q1,1)→q4
    Five 12 0 q0 δ:(q1,2)→q0

    fig-9
    Figure-9

    Step-4 Add Six, Seven, Eight

    Number Ternary Remainder(%5) End-state Add
    Six 20 1 q1 δ:(q2,0)→q1
    Seven 21 2 q2 δ:(q2,1)→q2
    Eight 22 3 q3 δ:(q2,2)→q3

    fig-10
    Figure-10

    Step-5 Add Nine, Ten, Eleven

    Number Ternary Remainder(%5) End-state Add
    Nine 100 4 q4 δ:(q3,0)→q4
    Ten 101 0 q0 δ:(q3,1)→q0
    Eleven 102 1 q1 δ:(q3,2)→q1

    fig-11
    Figure-11

    Step-6 Add Twelve, Thirteen, Fourteen

    Number Ternary Remainder(%5) End-state Add
    Twelve 110 2 q2 δ:(q4,0)→q2
    Thirteen 111 3 q3 δ:(q4,1)→q3
    Fourteen 112 4 q4 δ:(q4,2)→q4

    fig-12
    Figure-12

    The total number of edges in the transition diagram of figure-12 is 15 = Q × Σ = 5 * 3 (a complete DFA). And this DFA can accept all strings that consist of ternary digits, whose numeric equivalent is divisible by 5.
    Notice that at each step, the table has three entries because at each step we need all possible outgoing edges from a state to make a complete DFA!

    To add further, we can use the fact that the union of two regular languages is also regular. If you need to design a DFA that accepts binary strings representing integers that are either divisible by 3 or 5, then design two separate DFAs, one for "divisible by 3" and one for "divisible by 5" and then take the union of both DFAs to construct the target DFA (for 1 <= n <= 10 you'd have to take the union of 10 DFAs).

    If we need a DFA that accepts binary strings such that the represented integer is divisible by both 5 and 3, then this comes down to designing a DFA that accepts integers that are divisible by 15 (but what about 6 and 8?).

    Note: DFAs drawn with this technique will be minimized DFA only when n and the base don't share a common factor. In the first example above, 5 and 2 don't share a common factor. In the second example 5 and 3 don't share a common factor either, hence both DFAs constructed above are minimized DFAs. For those interested to read further about possible mini states for number n and base b, read paper: Divisibility and State Complexity.

    Design a DFA for base 'b' encoded integers divisible by 'n':

    So we can apply the above technique to design a DFA to recognize integers encoded in any base 'b' that are divisible by a given integer 'n'. In such a DFA the total number of states will be n (for n remainders) and the number of edges should be equal to 'b' * 'n' — it is a complete DFA: 'b' = the number of symbols in the language of the DFA and 'n' = the number of states.

    Using the above technique, I wrote a Python script that uses the above technique to design a DFA for input base and number. In the script, the function divided_by_N populates the DFA's transition rules in base * number steps. In each step-num, I convert num into a string num_s using the function baseN(). To avoid processing each string, I have used a temporary data-structure lookup_table. In each step, the end-state for the string num_s is evaluated and stored in lookup_table for use in a next step.

    For the transition graph of DFA, I have written a function draw_transition_graph using the Pygraphviz library (very easy to use). To use this script you need to install graphviz. To add colorful edges in transition diagram, I randomly generates color codes for each symbol get_color_dict function.

    #!/usr/bin/env python
    import pygraphviz as pgv
    from pprint import pprint
    from random import choice as rchoice
    
    def baseN(n, b, syms="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
        """ converts a number `n` into base `b` string """
        return ((n == 0) and syms[0]) or (
            baseN(n//b, b, syms).lstrip(syms[0]) + syms[n % b])
    
    def divided_by_N(number, base):
        """
        constructs DFA that accepts given `base` number strings
        those are divisible by a given `number`
        """
        ACCEPTING_STATE = START_STATE = '0'
        SYMBOL_0 = '0'
        dfa = {
            str(from_state): {
                str(symbol): 'to_state' for symbol in range(base)
            }
            for from_state in range(number)
        }
        dfa[START_STATE][SYMBOL_0] = ACCEPTING_STATE
        # `lookup_table` keeps track: 'number string' -->[dfa]--> 'end_state'
        lookup_table = { SYMBOL_0: ACCEPTING_STATE }.setdefault
        for num in range(number * base):
            end_state = str(num % number)
            num_s = baseN(num, base)
            before_end_state = lookup_table(num_s[:-1], START_STATE)
            dfa[before_end_state][num_s[-1]] = end_state
            lookup_table(num_s, end_state)
        return dfa
    
    def symcolrhexcodes(symbols):
        """
        returns dict of color codes mapped with alphabets symbol in symbols
        """
        return {
            symbol: '#'+''.join([
                rchoice("8A6C2B590D1F4E37") for _ in "FFFFFF"
            ])
            for symbol in symbols
        }
    
    def draw_transition_graph(dfa, filename="filename"):
        ACCEPTING_STATE = START_STATE = '0'
        colors = symcolrhexcodes(dfa[START_STATE].keys())
        # draw transition graph
        tg = pgv.AGraph(strict=False, directed=True, decorate=True)
        for from_state in dfa:
            for symbol, to_state in dfa[from_state].iteritems():
                tg.add_edge("Q%s"%from_state, "Q%s"%to_state,
                            label=symbol, color=colors[symbol],
                            fontcolor=colors[symbol])
        
        # add intial edge from an invisible node!
        tg.add_node('null', shape='plaintext', label='start')
        tg.add_edge('null', "Q%s"%START_STATE,)
        
        # make end acception state as 'doublecircle'
        tg.get_node("Q%s"%ACCEPTING_STATE).attr['shape'] = 'doublecircle'
        tg.draw(filename, prog='circo')
        tg.close()
    
    def print_transition_table(dfa):
        print("DFA accepting number string in base '%(base)s' "
                "those are divisible by '%(number)s':" % {
                    'base': len(dfa['0']),
                    'number': len(dfa),})
        pprint(dfa)
    
    if __name__ == "__main__":
        number = input ("Enter NUMBER: ")
        base = input ("Enter BASE of number system: ")
        dfa = divided_by_N(number, base)
        
        print_transition_table(dfa)
        draw_transition_graph(dfa)
    

    Execute it:

    ~/study/divide-5/script$ python script.py 
    Enter NUMBER: 5
    Enter BASE of number system: 4
    DFA accepting number string in base '4' those are divisible by '5':
    {'0': {'0': '0', '1': '1', '2': '2', '3': '3'},
     '1': {'0': '4', '1': '0', '2': '1', '3': '2'},
     '2': {'0': '3', '1': '4', '2': '0', '3': '1'},
     '3': {'0': '2', '1': '3', '2': '4', '3': '0'},
     '4': {'0': '1', '1': '2', '2': '3', '3': '4'}}
    ~/study/divide-5/script$ ls
    script.py filename.png
    ~/study/divide-5/script$ display filename
    

    Output:

    base_4_divided_5_best
    DFA accepting integers encoded in base 4 that are divisible by 5

    Similarly, enter base = 4 and number = 7 to generate - dfa accepting number string in base '4' those are divisible by '7'
    Btw, try changing filename to .png or .jpeg.

    References

    References that I used to write this script: