Regular expressions look like black magic to people unfamiliar with them, but learning them can save a lot of time and effort and lets you do some very cool things.
So what is a regular expression? A regular expression (regex or regexp for short) is a concise and flexible means of matching strings in text. It does this by describing literal characters, sets of characters, repetition, and other patterns in the text. Why would a regular expression be useful? Many programs, such as grep and vi, understand regexps, which makes them useful for searching and manipulating text.
For instance, if you were trying to hunt down any references to a particular RAID array in the system logs, you could do:
$ grep md /var/log/messages*
You could then filter the output by hand, or run the grep command several times (e.g., if you were interested in two problematic arrays, md1 and md3, you could run two commands). You could also use a regexp with grep to do it in one pass:
$ egrep 'md[13]' /var/log/messages*
egrep is equivalent to grep -E; the -E option tells grep to treat the pattern as an extended regular expression. This regexp is very simple: the brackets match any single character listed inside them, so md[13] matches md1 and md3 and excludes md0, md2, and others. Do not separate the characters with a comma: md[1,3] would also match the literal string "md,".
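The same bracket expression can be tried outside of grep. The following is a minimal sketch using Python's re module; the log lines are made up for illustration and are not taken from a real system log.

```python
import re

# Hypothetical log lines, invented for this example only.
log_lines = [
    "kernel: md1: resync started",
    "kernel: md2: resync done",
    "kernel: md3: disk failure",
    "kernel: md0: active",
]

# [13] is a character class: exactly one character, either 1 or 3.
pattern = re.compile(r"md[13]")

# Keep only the lines in which the pattern is found somewhere.
matching = [line for line in log_lines if pattern.search(line)]
for line in matching:
    print(line)

# A comma inside the brackets is a literal comma, not a separator,
# so a class written as [1,3] also accepts the string "md,".
comma_matches = re.search(r"md[1,3]", "md,") is not None
print(comma_matches)
```

Only the md1 and md3 lines are kept, and the last check shows why a comma does not belong inside the brackets.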
Learning regular expressions can be difficult, however. And not all regular expressions are created equal: POSIX regexps, for example, differ slightly from Perl-compatible regexps. If you are using the Komodo IDE, it comes with a fantastic regular expression toolkit. If you do not use the commercial Komodo IDE, however, there are other options.
One such option is Kodos. Kodos is a Python program that uses PyQt to provide a GUI. It is not as feature-rich as the Komodo IDE, but it is free and open source and does a great job of helping you learn and create regular expressions. The program interface is split into three panes: the regular expression to test; the search (or replacement) string to apply the regular expression to; and the results pane, which shows what the regexp matches in the string, what would be replaced, and which groups match.
For instance, given the string python-1.2.3-4.fc10.2 and the regular expression (\w+)-([0-9.]+)-[0-9]+\.fc([0-9]+)\.?.* the match window would highlight the entire string, because the regular expression matches all of it, as shown in Figure A. In the group window, you'll see three matching groups (groups are specified using parentheses): group one is python, group two is 1.2.3, and group three is 10.
A replacement string refers to a group by its number, so to pull out just the python string, you would use \1 as the replacement value. Switch to the replace string window and enter \1, then switch the bottom window to show the replace field, and you will see the result of the replacement.
Finally, if you are using regular expressions in Python, Kodos will helpfully provide sample code you can use in a Python program to apply that regular expression and get the results you are looking for.
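The group matching described above can be reproduced directly with Python's re module. This is only a sketch of the idea, not the code Kodos itself generates; the variable names are chosen for illustration.

```python
import re

# The regular expression and test string from the example above.
pattern = re.compile(r"(\w+)-([0-9.]+)-[0-9]+\.fc([0-9]+)\.?.*")
package = "python-1.2.3-4.fc10.2"

# match() anchors at the start of the string and returns None on failure.
match = pattern.match(package)
if match:
    # groups() returns the text captured by each pair of parentheses.
    name, version, fedora_release = match.groups()
    print(name)            # python
    print(version)         # 1.2.3
    print(fedora_release)  # 10
```

Note that the 4 between the version and .fc is matched by [0-9]+ but not captured, because that part of the pattern is not inside parentheses.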
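For comparison, here is a minimal sketch of the same replacement done with Python's re.sub. The second replacement string, which combines two groups, is an extra illustration and does not come from the Kodos example.

```python
import re

pattern = r"(\w+)-([0-9.]+)-[0-9]+\.fc([0-9]+)\.?.*"
package = "python-1.2.3-4.fc10.2"

# \1 in the replacement stands for whatever group one captured.
# The pattern matches the whole string, so the whole string is replaced.
name_only = re.sub(pattern, r"\1", package)
print(name_only)  # python

# Several group references can be combined in one replacement string.
name_and_version = re.sub(pattern, r"\1 \2", package)
print(name_and_version)  # python 1.2.3
```

The raw string prefix (r"...") keeps Python from interpreting the backslashes before the regexp engine sees them.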
While Kodos is a Python program and is a little more geared towards Python programmers, it will work well for anyone interested in learning regexps. It also provides a regexp library that includes some user-contributed regexps, as well as a regexp reference manual.
For a free program, it's hard to beat Kodos. And even if you are familiar with regexps but need a tool to debug more complicated regular expressions, Kodos makes a great addition to your development toolkit.
Vincent Danen works on the Red Hat Security Response Team and lives in Canada. He has been writing about and developing on Linux for over 10 years and is a veteran Mac user.