Regex Guru

Friday, 19 December 2008

Don’t Escape Literal Characters That Aren’t Metacharacters

Filed under: Regex Trouble — Jan Goyvaerts @ 17:29

Perl-style regular expressions treat 12 punctuation characters as metacharacters outside character classes. These characters need to be escaped with a backslash if you want to include them as literal characters in your regex:
.^$|*+?()[{\
Inside character classes, these flavors treat a different set of 4 punctuation characters as metacharacters. Only those 4 need to be [...]
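
As a rough illustration of the rule above, here is a minimal Python sketch. It uses Python's `re` module as a stand-in for a Perl-style flavor, and the helper name `escape_literal` is hypothetical, not something from the post. It backslash-escapes only the 12 characters listed above and leaves every other character untouched.

```python
import re

# The 12 punctuation characters listed above: . ^ $ | * + ? ( ) [ { \
METACHARACTERS = set(r".^$|*+?()[{" + "\\")


def escape_literal(text: str) -> str:
    """Backslash-escape only the 12 metacharacters; leave everything else alone.

    Hypothetical helper for illustration; intended for text used outside
    character classes.
    """
    return "".join("\\" + ch if ch in METACHARACTERS else ch for ch in text)


literal = "1+1=2 (really?)"
pattern = escape_literal(literal)

# The escaped pattern matches the original text literally.
print(re.fullmatch(pattern, literal) is not None)

# Characters such as ] } = and the space are not in the list, so they stay unescaped.
print(escape_literal("x]} = y"))
```

Python's built-in `re.escape` is more conservative and escapes some additional characters, which is harmless but makes the pattern harder to read.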

Thursday, 18 December 2008

TPerlRegEx Now with Proper UTF-8 (Unicode) Support

Filed under: Regex Libraries — Jan Goyvaerts @ 16:36

TPerlRegEx for Delphi 2009 had a rather embarrassing bug: it didn’t actually enable the UTF-8 support in PCRE unless you set the Options property to something other than the default.

Wednesday, 3 December 2008

More Time for Blogging

Filed under: About Regex Guru — Jan Goyvaerts @ 15:44

Since I’m done writing my part of a new book about regular expressions, I should have more time for blogging.

Sunday, 2 November 2008

Detecting URLs in a Block of Text

Filed under: Regex Examples — Jan Goyvaerts @ 7:57

In his blog post The Problem with URLs, Jeff Atwood points out some of the issues with trying to detect URLs in a larger body of text using a regular expression.
The short answer is that it can’t be done. Pretty much any character is valid in URLs. The very simplistic \bhttp://\S+ not only [...]
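
One shortcoming of the simplistic pattern is easy to reproduce. The Python sketch below is only an illustration; the sample text and URLs are made up, and the problem shown is not necessarily the one the post goes on to describe. Because `\S+` stops only at whitespace, punctuation that follows a URL in running text ends up inside the match.

```python
import re

# The simplistic pattern quoted above.
naive_url = re.compile(r"\bhttp://\S+")

# Made-up sample text: the URLs are followed by ordinary sentence punctuation.
text = "See http://www.example.com/page, or (http://example.org/)."

matches = naive_url.findall(text)
for match in matches:
    print(match)
# Prints:
# http://www.example.com/page,
# http://example.org/).
```

The trailing comma, parenthesis, and period are all valid URL characters, so the regex alone cannot tell whether they belong to the URL or to the sentence.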

Wednesday, 8 October 2008

R

Filed under: Regex Libraries — Jan Goyvaerts @ 17:09

The R language is now covered on regular-expressions.info, and supported by RegexBuddy.

Tuesday, 7 October 2008

Windows PowerShell

Filed under: Regex Libraries — Jan Goyvaerts @ 16:57

Windows PowerShell is now covered on regular-expressions.info, and supported by RegexBuddy.

Tuesday, 19 August 2008

TPerlRegEx for Delphi 2009

Filed under: Regex Libraries — Jan Goyvaerts @ 12:12

TPerlRegEx is a Delphi VCL component wrapper around the open source PCRE library. I originally developed it for in-house use. It powered EditPad Pro 4 and 5, PowerGREP 1 and 2, and RegexBuddy 1. The latest versions of these products use a custom-built regular expression engine. The custom-built engine can do [...]

Sunday, 29 June 2008

Jeff Atwood on Regular Expressions

Filed under: Links — Jan Goyvaerts @ 15:31

Last Friday, Jeff Atwood made a case for the judicious use of regular expressions in the article Regular Expressions: Now You Have Two Problems on his Coding Horror blog.
Nitpick: In free-spacing mode (RegexOptions.IgnoreWhitespace in .NET), the # starts a comment all by itself, which runs to the end of the line. # comment is three [...]
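
The comment behavior is easy to check. The sketch below uses Python, where `re.VERBOSE` plays roughly the role that RegexOptions.IgnoreWhitespace plays in .NET; treat that equivalence as an assumption, since the two flavors differ in other details.

```python
import re

# Assumption: Python's re.VERBOSE stands in for .NET's RegexOptions.IgnoreWhitespace.
commented = re.compile(r"""
    \d{4}   # year
    -
    \d{2}   # month
""", re.VERBOSE)

# The unescaped # starts a comment that runs to the end of the line,
# so only "a" remains of this pattern.
hash_as_comment = re.compile(r"a # b", re.VERBOSE)

# Escaping the # makes it a literal character; the spaces are still ignored.
hash_as_literal = re.compile(r"a \# b", re.VERBOSE)

print(bool(commented.fullmatch("2008-06")))    # True
print(bool(hash_as_comment.fullmatch("a")))    # True
print(bool(hash_as_literal.fullmatch("a#b")))  # True
```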

Tuesday, 27 May 2008

Writing Offline

Filed under: About Regex Guru — Jan Goyvaerts @ 12:56

I’m co-writing a book on regular expressions. This blog will likely be quiet until we’re done writing the book.

Thursday, 8 May 2008

Follow Up with Adequate Testing

Filed under: Regex Trouble — Jan Goyvaerts @ 15:05

The regular expression from the Do Follow plugin is dedicated to a single purpose. Repurposing it for your own code will expose shortcomings that don’t matter for the plugin, but may matter for what you’re trying to do. Never copy-and-paste a regex without testing it.

No Follow The Lazy Dot

Filed under: Regex Trouble — Jan Goyvaerts @ 8:31

The popular Do Follow WordPress plugin uses a rather inefficient regular expression for its job. Here’s how to improve it.

Wednesday, 23 April 2008

PCRE Library for MySQL

Filed under: Regex Libraries — Jan Goyvaerts @ 11:24

A RegexBuddy user pointed me to LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library.
MySQL’s built-in regular expression support uses the POSIX ERE flavor. By today’s standards, that flavor offers limited regex functionality. PCRE, on the other hand, offers all the goodies from Perl and [...]
