Regex Guru

Tuesday, 11 March 2008

No Spaces or Dashes

Filed under: Regex Philosophy — Jan Goyvaerts @ 10:01

Programmers are lazy. That’s why we slave away hours on end to program our computers to automate tedious tasks for ourselves and others. That’s the right kind of laziness.

But far too often, programmers show the wrong kind of laziness. Instead of spending a bit more effort on their product, they push the responsibility to the end user.

How many times have you seen an order form that says “no spaces or dashes” next to the field for the credit card number? It always makes me wonder how seriously such companies take themselves. I mean, if credit card companies design their cards to use spaces to group the embossed digits, there’s probably a reason for it, like making it easy for humans to keep track of how many digits they’ve already entered.

Compare these four lines of code, in Perl, PHP, JavaScript and HTML, respectively:

$cc =~ s/\D+//g;
$cc = preg_replace('/\D+/', '', $cc);
cc = cc.replace(/\D+/g, '');
<p>No spaces or dashes</p>

The first three lines let the computer strip out all non-digits. The HTML version tells the user to do it. Yet the dumb HTML version takes about as many webmaster keystrokes as the smart versions.

Regular expressions make it very easy to fix up user input. If an underlying library or remote service has strict input requirements, don’t force those requirements onto the user. Much credit card processing software has been around since the days when CPU time was at a premium, so it’s likely that it doesn’t do fancy stuff like stripping out spaces and dashes. Use that little regex to bend the card number to the processor’s requirements, instead of making the user bend to them.
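As a rough sketch of how that fix-up might sit in front of a strict processor, the JavaScript one-liner can be wrapped in a small function. The name normalizeCardNumber and the 12 to 19 digit range are assumptions made for illustration; check what your own processor actually accepts.

```javascript
// Hypothetical helper: the name and the length limits are assumptions,
// not something the card processor or this article prescribes.
function normalizeCardNumber(input) {
  // Strip every run of non-digits (spaces, dashes, dots, anything else).
  const digits = String(input).replace(/\D+/g, '');
  // Assumed sanity check: accept 12 to 19 digits, otherwise signal failure
  // so the form can show an error only when the input is really unusable.
  return digits.length >= 12 && digits.length <= 19 ? digits : null;
}

// The user types the number the way it is printed on the card.
const cleaned = normalizeCardNumber('4111 1111-1111 1111');
```

Only when the function returns null does the form need to complain, and then about something the user can actually fix.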

I’m sure if you spend half an hour going over the forms on your web site or the dialog boxes in your software, you can find many places where you could replace error messages with simple regexes that fix up the input.


  1. […] you’ve graciously stripped out spaces and dashes, you can check whether the card number looks like a valid number and even determine the brand of […]

    Pingback by Regex Guru » Validating Credit Card Numbers — Thursday, 13 March 2008 @ 8:33

  2. Yes, silly formatting rules like “no spaces or dashes” are a pet peeve of mine as well, but your \D+ regex obviously works a little differently than just allowing spaces and dashes. I think there’s a fine line between flexibility and outright allowing stupidity (which can have serious backward compatibility and interoperability implications… look at IE, for example). It might make sense to go with a stricter regex such as [- ]+.

    Comment by Steven Levithan — Friday, 14 March 2008 @ 4:52

  3. But it saves two keystrokes, Steven! :-)

    Comment by Jan — Friday, 14 March 2008 @ 15:04

  4. Along this same line, here are two other recurring pet peeves of mine that exploitation of regexes could certainly alleviate:

    1. Why am I forced to choose a state abbreviation from a list of allowed state names? Why not let me enter my state (“N.Y.” or “NY” or “New York” or “NEW YORK” etc.) and if the code doesn’t understand it, then it can prompt for a more precise state id.
    2. Why am I forced to enter a date in the format used by the application? The same thoughts prevail: let me enter a date, and if the code doesn’t understand it, then it can prompt for clarification.

    (I once taught an undergraduate C Language class at a university, and I gave my students the problem of writing a C function that would take any string of characters and try to determine the date it represented. Some of those kids did a superb job – I threw every kind of trick I could think of at their code and they came up with correct dates. All that without regexes – think what they could have done *with* regexes!)

    Comment by Dave Jenkins — Thursday, 3 April 2008 @ 19:54
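The state-name idea from the last comment could be sketched along these lines. The lookup table is hypothetical and lists only two states for illustration; normalizeState is an invented name, and a real form would need the full list plus whatever spellings its users actually type.

```javascript
// Hypothetical lookup table, deliberately incomplete: keys are the input
// with everything but letters removed and uppercased.
const STATES = new Map([
  ['NY', 'NY'],
  ['NEWYORK', 'NY'],
  ['CA', 'CA'],
  ['CALIFORNIA', 'CA'],
]);

function normalizeState(input) {
  // Drop dots, spaces and other non-letters, then uppercase, so that
  // "N.Y." becomes "NY" and "New York" becomes "NEWYORK".
  const key = String(input).replace(/[^A-Za-z]+/g, '').toUpperCase();
  // Return the two-letter code, or null so the caller can prompt the user
  // for a more precise answer instead of rejecting the input outright.
  return STATES.has(key) ? STATES.get(key) : null;
}
```

A null result is the cue to ask for clarification, which is the fallback the comment proposes.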
