Perl-style regular expressions treat 12 punctuation characters as metacharacters outside character classes. These characters need to be escaped with a backslash if you want to include them as literal characters in your regex: .^$|*+?()[{\ Inside character classes, these flavors treat a different set of 4 punctuation characters as metacharacters. Only those 4 need to be […]
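As a minimal sketch of escaping only those 12 characters, the snippet below uses Python's `re` module as a stand-in for a Perl-style flavor (an assumption; the excerpt names no language). The helper `escape_minimal` is hypothetical and written only for this illustration.

```python
import re

# The 12 metacharacters outside character classes, as listed in the excerpt.
METACHARACTERS = set(".^$|*+?()[{\\")


def escape_minimal(literal: str) -> str:
    """Backslash-escape only the 12 metacharacters (hypothetical helper)."""
    return "".join("\\" + ch if ch in METACHARACTERS else ch for ch in literal)


pattern = escape_minimal("1+1=2 (really)")
print(pattern)  # 1\+1=2 \(really\)

# The escaped pattern matches the original text literally.
assert re.fullmatch(pattern, "1+1=2 (really)")
```

Python's built-in `re.escape` also escapes some characters beyond these 12, which is harmless but makes the resulting pattern harder to read.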
TPerlRegEx for Delphi 2009 had a rather embarrassing bug: it didn’t actually enable the UTF-8 support in PCRE unless you set the Options property to something other than the default.
Since I’m done writing my part of a new book about regular expressions, I should have more time for blogging.
In his blog post The Problem with URLs, Jeff Atwood points out some of the issues with trying to detect URLs in a larger body of text using a regular expression. The short answer is that it can’t be done. Pretty much any character is valid in URLs. The very simplistic \bhttp://\S+ not only fails […]
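One shortcoming of the simplistic pattern is easy to demonstrate, though it is not necessarily the one the post goes on to describe. The sketch below uses Python's `re` module, and the sample text and URLs are made up for illustration.

```python
import re

# The simplistic pattern quoted in the excerpt.
SIMPLISTIC_URL = re.compile(r"\bhttp://\S+")

# Made-up sample text: URLs followed by sentence punctuation.
text = "See http://example.com/page. Or (http://example.com/other) instead."

matches = SIMPLISTIC_URL.findall(text)
print(matches)
# ['http://example.com/page.', 'http://example.com/other)']
```

Because `\S+` consumes every non-whitespace character, the trailing full stop and closing parenthesis end up inside the matches, even though a human reader would not consider them part of the URLs.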
The R language is now covered on regular-expressions.info, and supported by RegexBuddy.
Windows PowerShell is now covered on regular-expressions.info, and supported by RegexBuddy.
TPerlRegEx is a Delphi VCL component wrapper around the open source PCRE library. I originally developed it for in-house use. It powered EditPad Pro 4 and 5, PowerGREP 1 and 2, and RegexBuddy 1. The latest versions of these products use a custom-built regular expression engine. The custom-built engine can do things such as searching […]
Last Friday Jeff Atwood made a case for judicious use of regular expressions in the article Regular Expressions: Now You Have Two Problems on his Coding Horror blog. Nitpick: In free-spacing mode (RegexOptions.IgnorePatternWhitespace in .NET), the # starts a comment all by itself, which runs to the end of the line. # comment is three […]
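The effect of free-spacing mode on `#` can be sketched with Python's `re.VERBOSE` flag, which is Python's counterpart to the .NET option (using Python here is an assumption; the post itself discusses .NET). The pattern and test strings are made up for illustration.

```python
import re

# Without free-spacing mode, the spaces and '#' are literal characters.
plain = re.compile(r"\d+ # digits")

# With re.VERBOSE, unescaped whitespace is ignored and an unescaped '#'
# starts a comment that runs to the end of the line.
verbose = re.compile(r"\d+ # digits", re.VERBOSE)

assert plain.fullmatch("42 # digits")          # literal match of the whole text
assert plain.fullmatch("42") is None           # the literal tail is required
assert verbose.fullmatch("42")                 # the pattern is effectively \d+
assert verbose.fullmatch("42 # digits") is None
```

The same pattern string therefore means two different things depending on the mode, which is why the behavior of `#` is worth stating precisely.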
I’m co-writing a book on regular expressions. This blog will likely be quiet until we’re done writing the book.
The regular expression from the Do Follow plugin is dedicated to a single purpose. Repurposing it for your own code will expose shortcomings that don’t matter for the plugin, but may matter for what you’re trying to do. Never copy-and-paste a regex without testing it.
The popular Do Follow WordPress plugin uses a rather inefficient regular expression for its job. Here’s how to improve it.
A RegexBuddy user pointed me to LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. MySQL’s built-in regular expression support uses the POSIX ERE flavor. By today’s standards, that flavor offers limited regex functionality. PCRE, on the other hand, offers all the goodies from Perl and other modern […]