Regex Guru

Thursday, 3 April 2008

wxRegEx class in wxWidgets

Filed under: Regex Code — Jan Goyvaerts @ 16:24

wxWidgets is a popular open source cross-platform windowing toolkit for C++ and other programming languages. Included with this toolkit is the wxRegEx class. This class encapsulates the “Advanced Regular Expressions” engine that was originally developed for Tcl. This means that anything you read about Tcl’s regular expression flavor also applies to wxRegEx. Since wxRegEx is compiled from the actual ARE source code, there are no compatibility issues. In RegexBuddy, simply select the Tcl ARE flavor to create patterns for wxRegEx. The only caveat is that you need to pass the wxRE_ADVANCED flag to wxRegEx::Compile(), or you’ll be stuck with plain old POSIX EREs.

I’ve been putting this class through its paces for a few days. I’ve written some documentation for wxRegEx that’s a bit more detailed than the official docs. The class is fairly bare-bones. You can compile a regex, find the first match in a string, and search-and-replace any number of matches in the string. That’s it. RegexBuddy 3.1.1, released today, includes a new source code template for wxRegEx. It generates source code snippets for the basic wxRegEx tasks I just mentioned. I also put in some more elaborate code snippets to iterate over all matches in a string, and to split a string into a wxArrayString.

You can do anything with wxRegEx that you could do in a programming language with built-in regex support. But it generally takes a bit more C++ code to get the job done. If you’ve already written your own support routines based on wxRegEx, you can easily edit RegexBuddy’s source code templates for wxWidgets to use your own routines. Just click the Edit button on the toolbar under the Use tab.
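To give an idea of what that extra C++ code looks like, here is a minimal sketch of building "find all matches" and "split" on top of an API that only finds the first match. It is not RegexBuddy's template and it does not use wxWidgets: std::regex stands in for wxRegEx so the code compiles on its own, and the names find_first, find_all and split are made up for this illustration. With wxRegEx, Matches() followed by GetMatch() would play the role of find_first.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Stand-in for a "first match only" API (hypothetical helper): reports where
// the first match in `text` starts and how long it is.
bool find_first(const std::regex& re, const std::string& text,
                std::size_t& start, std::size_t& len) {
    std::smatch m;
    if (!std::regex_search(text, m, re)) return false;
    start = static_cast<std::size_t>(m.position(0));
    len = static_cast<std::size_t>(m.length(0));
    return true;
}

// Collect every match by searching the unprocessed remainder again and again.
std::vector<std::string> find_all(const std::regex& re,
                                  const std::string& subject) {
    std::vector<std::string> matches;
    std::size_t offset = 0;  // start of the unprocessed remainder
    std::size_t start = 0, len = 0;
    while (offset <= subject.size() &&
           find_first(re, subject.substr(offset), start, len)) {
        matches.push_back(subject.substr(offset + start, len));
        // Step past the match; move one extra character after a zero-length
        // match so the loop always terminates.
        offset += start + (len > 0 ? len : 1);
    }
    return matches;
}

// Split `subject` on every match of `re`. Simplification: the loop stops at
// the first zero-length match instead of trying to split on it.
std::vector<std::string> split(const std::regex& re,
                               const std::string& subject) {
    std::vector<std::string> parts;
    std::size_t offset = 0;
    std::size_t start = 0, len = 0;
    while (find_first(re, subject.substr(offset), start, len) && len > 0) {
        parts.push_back(subject.substr(offset, start));  // text before match
        offset += start + len;
    }
    parts.push_back(subject.substr(offset));  // text after the last match
    return parts;
}
```

One caveat of this approach: every search sees the remainder as a fresh string, so anchors and word boundaries are evaluated against the start of the remainder rather than the original subject.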

Friday, 21 March 2008

preg_replace_callback

Filed under: Regex Code — Jan Goyvaerts @ 10:48

I just added a paragraph about preg_replace_callback to the PHP reference on regular-expressions.info. This function is just like preg_replace, with one important difference: instead of passing the replacement as a literal string (or array of strings), you pass it the name of a function. This function will be called for each match. In the function, you can do whatever calculations you want to produce the replacement text.

Guess what the following code does:

// Pass the callback's name as a string; a bare, unquoted name is an error in PHP 8
$result = preg_replace_callback('/(\d+)\+(\d+)/', 'compute_replacement', $subject);

function compute_replacement($groups) {
  // You can vary the replacement text for each match on-the-fly
  // $groups[0] holds the regex match
  // $groups[n] holds the match for capturing group n
  return (string) ((int) $groups[1] + (int) $groups[2]);
}

A few other programming languages have similar functionality. E.g. in .NET, you’d pass a MatchEvaluator instance to the Regex.Replace() method. RegexBuddy can already generate such code snippets for .NET and Java. The PHP version will be added in the next free minor update.
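C++ is a language where you have to build this yourself: std::regex_replace only accepts a replacement string, not a function. The sketch below shows one way a callback-driven replace could be written with the standard library. The helper name replace_callback and the function add_groups are invented for this example (add_groups mirrors compute_replacement above); this is not code generated by RegexBuddy.

```cpp
#include <functional>
#include <regex>
#include <string>

// Hypothetical helper, not part of the standard library: replaces every match
// of `re` in `subject` with whatever `callback` returns for that match.
std::string replace_callback(
    const std::regex& re, const std::string& subject,
    const std::function<std::string(const std::smatch&)>& callback) {
    std::string result;
    auto pos = subject.cbegin();  // end of the previous match
    for (std::sregex_iterator it(subject.cbegin(), subject.cend(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        result.append(pos, m[0].first);  // copy the text between matches
        result += callback(m);           // insert the computed replacement
        pos = m[0].second;
    }
    result.append(pos, subject.cend());  // copy the text after the last match
    return result;
}

// Counterpart of compute_replacement: groups[0] is the whole match,
// groups[n] is capturing group n.
std::string add_groups(const std::smatch& groups) {
    return std::to_string(std::stol(groups[1].str()) +
                          std::stol(groups[2].str()));
}
```

Calling replace_callback(std::regex("(\\d+)\\+(\\d+)"), subject, add_groups) would then correspond to the PHP call above. Note that the regex has to be a named variable or otherwise outlive the call; the iterator keeps a reference to it.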