Finite Automata
The field of automata theory models computation to understand its limits.
Finite Automata Introduction
Finite automata are abstract mathematical machines used to recognize patterns in input sequences. They serve as the core state-tracking mechanisms that power Lexical Analysis tools during compilation.
Formal Definition
A finite automaton is formally defined as a 5-tuple: { Q, Σ, q0, F, δ }
- Q: Finite set of all states
- Σ: Set of input symbols (Alphabet)
- q0: Initial state (Starting state)
- F: Set of final (accepting) states
- δ: Transition function (δ : Q × Σ → Q)
Deterministic (DFA) vs Non-Deterministic (NFA)
| Feature | Deterministic Finite Automata (DFA) | Non-Deterministic Finite Automata (NFA) |
|---|---|---|
| Transitions | Exactly one transition per input symbol from each state | Multiple transitions per input symbol allowed |
| Null Moves | No ε (null) moves allowed | Allows ε (null) moves (changing state without input) |
| Design | Simpler execution, but sometimes larger and harder to design | More flexible, intuitive, and easier to design |
| Next State | Uniquely determined | May have multiple possibilities |
| Equivalence | Recognizes Regular Languages | Recognizes Regular Languages (Can be converted to DFA) |
Minimization of DFA
DFA minimization reduces the number of states in a DFA while keeping it equivalent (accepting the exact same language). By minimizing the state count, it drastically reduces the memory footprint required to store the transition table.
We use the Partitioning Algorithm (Equivalence Theorem):
- Initial Partition (P0): Divide all states
Qinto two sets:- Final states (
F) - Non-final states (
Q - F)
- Final states (
- Refining Partitions (Pk): For each set in
Pk-1, check if states within the same set are “distinguishable”. Two states are distinguishable if, for any input symbola, their transitions lead to different sets inPk-1. If they are distinguishable, split them. - Termination: Repeat until
Pk = Pk-1(no further splits can be made). - Merge: Merge all states that remain in the same set into a single state.
Operations on DFA
DFAs support various mathematical operations that help build more complex language recognizers from simpler ones.
- Union (L₁ ∪ L₂): Combines two DFAs so the new DFA accepts all words that either of the original DFAs would accept.
- Concatenation (L₁ ∘ L₂): Creates a new DFA that accepts words formed by taking a word from L₁ followed by a word from L₂.
- Reversal (Lᴿ): A new DFA that accepts the reversed versions of words accepted by the original DFA. This is achieved by reversing transition edges and swapping initial/final states. (Note: The result is generally an NFA since reversing edges can introduce non-determinism, and it must be re-determinized using the subset construction algorithm).
- Complementation (L̅): A new DFA that accepts all words not accepted by the original DFA. This is achieved simply by swapping final and non-final states.
- Intersection (L₁ ∩ L₂): A new DFA that accepts only words that are present in both original DFAs.
- Difference (L₁ - L₂): Accepts strings that are in DFA 1 but not in DFA 2.
NFA to DFA Conversion
Every NFA can be converted to an equivalent DFA using the Subset Construction Algorithm. An NFA might have multiple transitions for the same symbol, but a DFA groups these multiple possible states into single sets.
Steps for Conversion:
- Epsilon Closure: For the start state
q0of the NFA, find all states reachable using onlyε(null) transitions. This set becomes the new start state of the DFA. - Transition Mapping: For every newly created DFA state (which is a set of NFA states), compute the transitions for each input symbol.
δ'(Set, a) = Union of δ(q, a) for all q in Set- Include the ε-closures of the resulting states.
- Repeat: Create new states in the DFA transition table until no new sets of states are generated.
- Final States: Any DFA state that contains at least one final state of the original NFA becomes a final state in the DFA.
Problems on Finite Automata
Finite automata can be designed to solve very specific pattern-matching problems. Common design problems include:
- Substring Constraints: DFA for strings not ending with a specific pattern (e.g., “THE”).
- Counting Constraints: DFA of a string with at least two 0’s and at least two 1’s.
- Parity/Even-Odd Logic: DFA for accepting the language
L = { a^n b^m | n+m = even }. - Multiple Parity Conditions: Machines accepting an odd number of 0’s and an even number of 1’s.
- Positional Constraints: DFA of a string in which the 2nd symbol from the RHS is ‘a’.
Tip: When solving these problems, it is usually easiest to first build a simple NFA using ε-transitions, and then convert and minimize it into a DFA.