# DFA minimization

For any given DFA $A$, we can build a DFA $A’$ that accepts the same language using the minimum number of states needed. Since $A$ has more states than $A’$, we need to figure out which states we have to remove.

The states getting removed are the ones which already have an equivalent state in diagram.

## Equivalent states

Two states $q$ and $r$ are equivalent iff they both end in an accepting state or they both end in a non-accepting state, on any input word:

- $\hat{\delta}(q, w) \rightarrow^* f_1$ and $\hat{\delta}(r, w) \rightarrow^* f_2$ where $f_1, f_2 \in F$; OR:
- $\hat{\delta}(q, w) \rightarrow^* s_1$ and $\hat{\delta}(r, w) \rightarrow^* s_2$ where $s_1, s_2 \notin F$
- $\forall w \in \Sigma^*$

Note: $\hat{\delta}$ is the extension of $\delta$ which works on words of length > 1.

## Indistinguishability table

We will mark all pairs of states which are *indistinguishable* — they can’t be told apart. A dot · is placed where two states *may* be indistinguishable.

0 | 1 | 2 | 3 | 4 | 5 | 6 | |
---|---|---|---|---|---|---|---|

0 |
|||||||

1 |
· | ||||||

2 |
· | · | |||||

3 |
· | · | · | ||||

4 |
· | · | · | · | |||

5 |
· | · | · | · | · |

Note: we compute only the lower-left of the table, as it is symmetrical ($q$ is as distinguishable from $r$ as $r$ is distinguishable from $q$) and the main diagonal will always be un-checked ($q$ is always indistinguishable from $q$, itself).

Start by marking ✓ all pairs $(q, r)$ where $q \in F$ and $s \notin F$. Any accepting state is obviously distinguishable from a non-accepting state.

0 | 1 | 2 | 3 | 4 | |
---|---|---|---|---|---|

1 |
· | ||||

2 |
✓ | ✓ | |||

3 |
✓ | ✓ | · | ||

4 |
✓ | ✓ | · | · | |

5 |
· | · | ✓ | ✓ | ✓ |

Mark ✓ any an unmarked · pair $(q, r)$, for which $(\delta(p, x), \ \delta(r, x))$ is marked. Where $x$ is any symbol (`a`

or `b`

in our case). Repeat until no further change occurs in the table.

0 | 1 | 2 | 3 | 4 | |
---|---|---|---|---|---|

1 |
· | ||||

2 |
✓ | ✓ | |||

3 |
✓ | ✓ | · | ||

4 |
✓ | ✓ | · | · | |

5 |
✓ | ✓ | ✓ | ✓ | ✓ |

For pair $(0, 1)$:

- symbol
`a`

: $\delta(0, a)=1$; $\delta(1, a)=0$ — we can’t distinguish 1 from 0 - symbol
`b`

: $0 \rightarrow 2$; $1 \rightarrow 3$ — we can’t distinguish 2 from 3

For pair $(2, 3)$:

- symbol
`a`

: $2 \rightarrow 4$; $3 \rightarrow 4$ — we can’t distinguish 4 from 4 - symbol
`b`

: $2 \rightarrow 5$; $3 \rightarrow 5$ — we can’t distinguish 5 from 5

For pair $(0, 5)$:

- symbol
`a`

: $0 \rightarrow 1$; $5 \rightarrow 5$ — we can distinguish 1 from 5 ✓ - symbol
`b`

: $0 \rightarrow 2$; $5 \rightarrow 5$ — we can distinguish 2 from 5 ✓

Pair $(1, 5)$ can be distinguished ✓ because $\delta(1, a)=0$; $\delta(5, a)=5$ and we can distinguish 0 from 5. Pair $(3, 4)$ can not be distinguished.

## Combining states

After building the table, we found two classes of indistinguishablity:

- $\lbrace 0, 1 \rbrace$ — combine them into state $q_{01}$
- $\lbrace 2, 3, 4 \rbrace$ — combine them into state $q_{234}$
- $\lbrace 5 \rbrace$ — just itself, $q_5$

Minimized automaton table:

$\delta’$ | a | b |
---|---|---|

$\bf{q_{01}}$ | $q_{01}$ | $q_{234}$ |

$\bf{q_{234}}$ | $q_{234}$ | $q_5$ |

$\bf{q_5}$ | $q_5$ | $q_5$ |

Steps:

- $\delta(q_0, a)=q_1$ and $\delta(q_1, a)=q_0$ so $\delta’(q_{01}, a) = q_{01}$
- $\delta(q_0, b)=q_2$ and $\delta(q_1, b)=q_3$ so $\delta’(q_{01}, a) = q_{234}$, the state that contains $q_2$ and $q_3$
- same for $q_{234}$ and $q_5$

## Starting and accepting states

The starting state remains $q_0$.

A state in $A’$ is accepting iff its *component* states are accepting in $A$. In our case, $q_{234}$.

## Dead-end states

A state is a dead-end if it is not final and there exists no path from it to a final state. It behaves like the junk state so it can be safely removed.

In our case, $q_5$ is not final and once we reach it, any transition keeps us in the same state — we can’t ever go to a final state. It is a dead-end state.

In the original automaton if we extend the lower-right part, $q_5$ like this:

$q_7$ is a dead-end as it has no out-ward transitions. $q_6$ is a dead-end as well since it can only go to $q_7$ which we just marked as a dead-end state.

## Unreachable states

A state $s$ is unreachable if there exists no path from $q_0$ to $s$. It can be safely removed.

The original automaton had no unreachable states. If we extend the lower-left part, $q_1$ like this:

$q_9$ is an unreachable state because it has no in-ward transitions. $q_8$ is as well since only $q_9$ can reach it.

## Converted automaton

After removing extra states and renaming the ones left, we obtain a much simpler, equivalent, automaton:

## IO

Read the definition of a DFA and print the definition of the equivalent minimal DFA.

## Input `DFA.definition`

```
2 3 4
0 a 1
0 b 2
1 a 0
1 b 3
2 a 4
2 b 5
3 a 4
3 b 5
4 a 4
4 b 5
5 a 5
5 b 5
```

Accepting states on the first line; entries in the transition table on subsequent lines.

## Example output `DFA-min.definition`

```
1
0 a 0
0 b 1
1 a 1
```

The minimal DFA is unique, modulo state names.

To verify equivalence we would have to check the output of both atomata on every possible input. Since this is an infinite set, we will settle with just a small relevant subset.

## Input `DFA.tests`

```
b
ab
ba
aba
aaabaaa
a
aa
bb
aabb
aabbbaa
```

```
1
1
1
1
1
0
0
0
0
0
```

Words to test on the converted automaton.

## Extended example

Extra dead-end and unreachable states.

## Input `DFA.definition`

```
2 3 4
0 a 1
0 b 2
1 a 0
1 b 3
2 a 4
2 b 5
3 a 4
3 b 5
4 a 4
4 b 5
5 a 5
5 b 6
6 b 7
8 a 1
9 a 8
9 b 8
```

Same `DFA-min.definition`

and input words.

## More examples

Narrated minimizations: