Regex Guru

Friday, 19 December 2008

Don’t Escape Literal Characters That Aren’t Metacharacters

Filed under: Regex Trouble — Jan Goyvaerts @ 17:29

Perl-style regular expressions treat 12 punctuation characters as metacharacters outside character classes. These characters need to be escaped with a backslash if you want to include them as literal characters in your regex: .^$|*+?()[{\ Inside character classes, these flavors treat a different set of 4 punctuation characters as metacharacters. Only those 4 need to be […]
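As a minimal sketch of the rule for characters outside character classes, the snippet below escapes only the 12 metacharacters listed in the excerpt and leaves every other character alone. It uses Python's re module as a stand-in for a Perl-style flavor; the names METACHARACTERS and escape_literal are made up for this illustration and do not come from the post. Python's built-in re.escape serves a similar purpose but escapes a somewhat different set of characters.

```python
import re

# The 12 punctuation characters the excerpt lists as metacharacters
# outside character classes: . ^ $ | * + ? ( ) [ { and the backslash.
METACHARACTERS = r".^$|*+?()[{" + "\\"


def escape_literal(text):
    """Backslash-escape only the 12 metacharacters (hypothetical helper)."""
    return "".join("\\" + ch if ch in METACHARACTERS else ch for ch in text)


literal = "1+1=2 (approx.)"
pattern = escape_literal(literal)

# The escaped pattern matches the literal text exactly.
assert re.fullmatch(pattern, literal)

# Unescaped, "1+" means "one or more 1s", so the regex 1+1=2
# matches "11=2" but not the text "1+1=2".
assert re.fullmatch("1+1=2", "11=2")
assert re.fullmatch("1+1=2", "1+1=2") is None
```

Characters such as = or the space are not metacharacters in this context, so the helper leaves them unescaped.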

Thursday, 18 December 2008

TPerlRegEx Now with Proper UTF-8 (Unicode) Support

Filed under: Regex Libraries — Jan Goyvaerts @ 16:36

TPerlRegEx for Delphi 2009 had a rather embarrassing bug: it didn’t actually enable the UTF-8 support in PCRE if you did not set the Options property to something different than the default.

Wednesday, 3 December 2008

More Time for Blogging

Filed under: About Regex Guru — Jan Goyvaerts @ 15:44

Since I’m done writing my part of a new book about regular expressions, I should have more time for blogging.

Sunday, 2 November 2008

Detecting URLs in a Block of Text

Filed under: Regex Examples — Jan Goyvaerts @ 7:57

In his blog post The Problem with URLs, Jeff Atwood points out some of the issues with trying to detect URLs in a larger body of text using a regular expression. The short answer is that it can’t be done. Pretty much any character is valid in URLs. The very simplistic \bhttp://\S+ not only fails […]
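One shortcoming of the simplistic regex is easy to reproduce. The sketch below runs \bhttp://\S+ through Python's re module on a made-up sentence; the sample text and the variable names are assumptions for illustration. Because the excerpt is cut off, this may not be the exact failure the post goes on to describe.

```python
import re

# The simplistic pattern quoted in the excerpt.
URL_PATTERN = re.compile(r"\bhttp://\S+")

# Made-up sample text with punctuation right after each URL.
text = "See http://www.example.com/page, or (http://example.org/a)."

found = URL_PATTERN.findall(text)

# \S+ keeps going until whitespace, so trailing punctuation that belongs
# to the sentence ends up inside the match.
assert found == ["http://www.example.com/page,", "http://example.org/a)."]

# The pattern also hard-codes the scheme, so an https URL is not matched.
assert URL_PATTERN.search("Visit https://example.com today") is None
```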

Wednesday, 8 October 2008

R

Filed under: Regex Libraries — Jan Goyvaerts @ 17:09

The R language is now covered on regular-expressions.info, and supported by RegexBuddy.

Tuesday, 7 October 2008

Windows PowerShell

Filed under: Regex Libraries — Jan Goyvaerts @ 16:57

Windows PowerShell is now covered on regular-expressions.info, and supported by RegexBuddy.

Tuesday, 19 August 2008

TPerlRegEx for Delphi 2009

Filed under: Regex Libraries — Jan Goyvaerts @ 12:12

TPerlRegEx is a Delphi VCL component wrapper around the open source PCRE library. I originally developed it for in-house use. It powered EditPad Pro 4 and 5, PowerGREP 1 and 2, and RegexBuddy 1. The latest versions of these products use a custom-built regular expression engine. The custom-built engine can do things such as searching […]

Sunday, 29 June 2008

Jeff Atwood on Regular Expressions

Filed under: Links — Jan Goyvaerts @ 15:31

Last Friday Jeff Atwood made a case for judicious use of regular expressions in the article Regular Expressions: Now You Have Two Problems on his Coding Horror blog. Nitpick: In free-spacing mode (RegexOptions.IgnoreWhitespace in .NET), the # starts a comment all by itself, which runs to the end of the line. # comment is three […]
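The comment behavior can be sketched in Python, whose re.VERBOSE flag is that language's free-spacing mode and treats an unescaped # the same way. This is an analogue chosen for illustration, not the .NET code from the article, and the patterns and variable names are made up.

```python
import re

# In free-spacing mode, whitespace in the pattern is ignored and an
# unescaped # starts a comment that runs to the end of the line.
phone = re.compile(r"""
    \d{3}   # first group of digits
    -       # literal hyphen
    \d{4}   # second group of digits
""", re.VERBOSE)
assert phone.fullmatch("555-1234")

# Here the # swallows the rest of the line, so the regex is empty.
empty = re.compile(r"# \d+", re.VERBOSE)
assert empty.fullmatch("")
assert empty.fullmatch("#42") is None

# To match a literal #, escape it; the space after it is still ignored.
literal_hash = re.compile(r"\# \d+", re.VERBOSE)
assert literal_hash.fullmatch("#42")
```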

Tuesday, 27 May 2008

Writing Offline

Filed under: About Regex Guru — Jan Goyvaerts @ 12:56

I’m co-writing a book on regular expressions. This blog will likely be quiet until we’re done writing the book.

Thursday, 8 May 2008

Follow Up with Adequate Testing

Filed under: Regex Trouble — Jan Goyvaerts @ 15:05

The regular expression from the Do Follow plugin is dedicated to a single purpose. Repurposing it for your own code will expose shortcomings that don’t matter for the plugin, but may matter for what you’re trying to do. Never copy-and-paste a regex without testing it.

No Follow The Lazy Dot

Filed under: Regex Trouble — Jan Goyvaerts @ 8:31

The popular Do Follow WordPress plugin uses a rather inefficient regular expression for its job. Here’s how to improve it.

Wednesday, 23 April 2008

PCRE Library for MySQL

Filed under: Regex Libraries — Jan Goyvaerts @ 11:24

A RegexBuddy user pointed me to LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. MySQL’s built-in regular expression support uses the POSIX ERE flavor. By today’s standards, that flavor offers limited regex functionality. PCRE, on the other hand, offers all the goodies from Perl and other modern […]
