
How to build a DFA with the subset construction algorithm when the NFA contains a `.` (any character) transition


For example, consider the regex .*ab

For this regex, I can build an NFA using the Thompson algorithm like:

NFA for .*ab

When I try to use the subset construction algorithm to change it to a DFA, I find the result is weird.

DFA for .*ab

The DFA graph:

DFA for .*ab

I tried to use it to match aaab, but it failed.

When the DFA is in state "A" and reads a character a, I don't know which is the next state.

If I choose "B" as the next state, it means that the <any> transition has higher priority, so the DFA keeps looping in state "B".

If I choose "C" as the next state, it means that the a transition has higher priority, and this DFA cannot match aaaab.

I have checked Compilers: Principles, Techniques, and Tools, but the book does not discuss how to handle the `.`

I believe I built a wrong DFA (or NFA), but I can't find where I went wrong.


Solution

  • The NFA is correct, but the construction table you have created contains errors.

    Here is the corrected table:

    NFA states       DFA state   <other-than-a-b>   a   b
    {1,2,4,5}        A           B                  C   B
    {2,3,4,5}        B           B                  C   B
    {2,3,4,5,6,7}    C           B                  C   D
    {2,3,4,5,8}      D           B                  C   B

    You can now make the DFA diagram from this:

    DFA

    Here the labels "not a" are short for "b or <other-than-a-b>".

    This diagram can still be simplified: the states A and B can be merged, since they have identical transitions and neither is an accepting state.
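
The table can be reproduced with a short subset construction sketch. This is a minimal illustration, not a definitive implementation. The NFA edges below are an assumption, reconstructed from the state sets in the table because the original diagram is not shown here, and names such as `EPS`, `EDGES`, `CLASSES`, `subset_construction` and `matches` are hypothetical. The key idea is to split the input alphabet into disjoint classes (`a`, `b`, and everything else) and let the `.` edge fire on every class, so each DFA state has exactly one successor per class.

```python
from collections import deque

# Assumed Thompson-style NFA for .*ab (edges inferred from the table, not from the diagram).
EPS = {1: {2, 4}, 3: {2, 4}, 4: {5}, 6: {7}}         # epsilon edges
ANY = "ANY"                                           # label standing for the `.` edge
EDGES = {(2, ANY): 3, (5, "a"): 6, (7, "b"): 8}       # labelled edges
ACCEPT = 8                                            # accepting NFA state
CLASSES = ["other", "a", "b"]                         # disjoint input classes


def eps_closure(states):
    """Return all NFA states reachable from `states` via epsilon edges."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in EPS.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)


def move(states, cls):
    """Follow edges for one input class; the ANY edge matches every class."""
    out = set()
    for s in states:
        if (s, ANY) in EDGES:
            out.add(EDGES[(s, ANY)])
        if (s, cls) in EDGES:
            out.add(EDGES[(s, cls)])
    return out


def subset_construction(start=1):
    """Build the DFA transition table, naming DFA states A, B, C, ... in discovery order."""
    start_set = eps_closure({start})
    names = {start_set: "A"}
    table = {}
    queue = deque([start_set])
    while queue:
        cur = queue.popleft()
        row = {}
        for cls in CLASSES:
            nxt = eps_closure(move(cur, cls))
            if nxt not in names:
                names[nxt] = chr(ord("A") + len(names))
                queue.append(nxt)
            row[cls] = names[nxt]
        table[names[cur]] = row
    accepting = {name for subset, name in names.items() if ACCEPT in subset}
    return names, table, accepting


def matches(text, table, accepting):
    """Run the DFA over `text`, mapping each character to its input class."""
    state = "A"
    for ch in text:
        cls = ch if ch in ("a", "b") else "other"
        state = table[state][cls]
    return state in accepting


names, table, accepting = subset_construction()
print(table)
print(matches("aaab", table, accepting))
```

Under these assumptions, reading a in state A leads only to C, because C already contains both NFA state 3 (reached through `.`) and NFA state 6 (reached through a). There is no choice to make between the two edges, which is why the DFA can still accept aaab.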