Pattern-Based DFA for Memory-Efficient and Scalable Multiple Regular Expression Matching
In Network Intrusion Detection System, Deterministic Finite Automaton (DFA) is widely used to compare packet content at a constant speed against a set of patterns specified in regular expressions (regexes). However, combining many regex patterns into a single DFA causes a serious state explosion. Partitioning the pattern set into several subsets, each of which produces a small DFA, is a practical way to deflate the state explosion. In this paper, the authors propose a regex pattern grouping scheme based on a new DFA model called Pattern-Based DFA (P-DFA) which supports efficient pattern-based operations, such as insertion, deletion, etc. By using these basic operations, one can easily measure the state explosion when combining a set of regex patterns into a single DFA.