In automata theory, we can use Myhill–Nerode theorem to prove that a language is not regular. The theorem makes use of the concept of distinguishable and non-distinguishable strings.
While it is intuitively simple to identify the difference between two strings like '101' and '1011', I am having a hard time with formal definition of indistinguishable strings as described in the book Introduction to Automata Theory by W. Goddard:
2 strings x and y are indistinguishable with respect to L if for every string z, it holds that xz ∈ L if and only if yz ∈ L. Otherwise they are distinguishable.
However if I construct a DFA with alphabet = {0, 1} accepting language
containing strings that all being with 11, I get confused as condition
still holds even if x and y are different. For example,
· x = 110
· y = 1101
· z = 01 (But could be any (every) string regardless since upon the
first 2 input symbols being 1 DFA always accepts).
We have yz ∈ L and hence xz ∈ L but clearly x ≠ y.
I am sure I am getting confused in the if and only if portion of the condition but I can't quite fully grasp it. How can we formally expand on this as well as provided a simple example (or this case) providing a contradiction?
The if and only if portion does not suggest that if yz and xz are both in L then x must be equal to y.
It says that for every suffix z (z can also be the empty string) if you can find x and y such that:
then x and y are distinguishable.
To visualize this you can think about all the strings that start from an initial state and reach a particular state of the DFA. If you then forget the string used to get to that state and try to "turn around" to go back you will find that you can't exactly recreate the starting string in most cases.
In the figure when you get to the accepting state you cannot distinguish between the strings 0110 and 0111110.
An example of distinguishable strings is:
L = { s | s contains exactly 3 zeros }
· x = 010101 (in L)
· y = 0101 (out L)
but for z = 01 (you need only one counter example for x and y)
· xz = 01010101 (out L)
· yz = 010101 (in L)
then x and y are distinguishable wrt L
For indistinguishable strings:
L = { s | s starts with 3 zeros }
· x = 00011111 (in L)
· y = 00011 (in L)
for any z
· xz = 00011111z (in L)
· yz = 00011z (in L)
then z does not matter, any two strings that start with 000 are indistinguishable wrt L