There are a few tools out there to help you write and understand regular expressions, including some IDEs that can provide assistance (like Komodo).
Sometimes you just want a quick explanation of a regular expression you've seen in some code. The YAPE::Regex::Explain Perl module is one tool that does just that. Wrapping the module in a command-line tool takes only a very short script:
#!/usr/bin/perl -w
use YAPE::Regex::Explain;
# Explain the regex passed as the first command-line argument
print YAPE::Regex::Explain->new($ARGV[0])->explain;
I've also turned it into a simple CGI utility for those times when my regex memory fails me. You can use it here: regexplainr (sorry, you may find that site offline, but here's the source code).
In a previous post, On Parsing CSV and other Delimited/Quoted Formats, I used the following regular expression to parse a whitespace-delimited string:
"([^"]+?)"\s?|([^\s]+)\s?|\s
Regexplainr produces the following commentary:
The regular expression:
(?-imsx:"([^"]+?)"\s?|([^\s]+)\s?|\s)
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  "                        '"'
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^"]+?                   any character except: '"' (1 or more
                             times (matching the least amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  "                        '"'
----------------------------------------------------------------------
  \s?                      whitespace (\n, \r, \t, \f, and " ")
                           (optional (matching the most amount
                           possible))
----------------------------------------------------------------------
 |                        OR
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    [^\s]+                   any character except: whitespace (\n,
                             \r, \t, \f, and " ") (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  \s?                      whitespace (\n, \r, \t, \f, and " ")
                           (optional (matching the most amount
                           possible))
----------------------------------------------------------------------
 |                        OR
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
Book tip (thanks to Tony): O'Reilly's Mastering Regular Expressions. Available on Google Books, and also from Amazon.
8 comments:
Thanks for the tips, excellent stuff in there. The book Mastering Regular Expressions, Third Edition, by Jeffrey E. F. Friedl, is one of the best technical books I've ever read, and it taught me heaps about regex.
Yes Tony, yet another great O'Reilly book. Totally agree with the recommendation!
Nice tip. However, may I ask if formatting can be restored in the regexplainer output? That would make it even more usable.
@anonymous: hmm, maybe. What's the formatting issue you have? Can you post an example?
Hey Paul,
Actually, when I ran a sample regex, everything appeared on one line, as if newlines were eaten (alive).
Hence I thought I'd let you know. I used IE; can't test FF from work.
R
Raj, I think I got it - you have a regex expressed over multiple lines, and want that preserved? You are correct - at the moment the regexplainer I posted assumes the input regex is all in one line. I'll put it down as a little project for the weekend;-)
Thanks Paul ... I already got it bookmarked. Very handy tool.
Thanks Raj.
btw, that weekend project is still pending. But then I didn't say which weekend;-)