What do they look like?

In its simplest form, a Regular Expression is simply a description of the pattern we are trying to match. This pattern is placed between a pair of / characters. For example, in the replacement example on the previous page, our Regular Expression could be:

replace(/2002/,"2003")
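As a minimal sketch of that call in use, assuming JavaScript's string replace() method (the sample text and variable names here are made up for illustration):

```javascript
// Hypothetical input string, used only for illustration.
const notice = "Copyright 2002 Example Ltd";

// replace() returns a new string; the original string is left unchanged.
const updated = notice.replace(/2002/, "2003");

console.log(updated); // Copyright 2003 Example Ltd
```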

However, Regular Expressions offer us shortcuts to represent specific groupings of characters, for example:

Regular Expression Description
\d Matches any digit - 0,1,2,3,4,5,6,7,8,9.
\w Matches any letter (either case), digit or underscore.
\s Matches any of the whitespace characters such as space, tab, new line etc.
. Matches any character except a new line.
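The shortcuts in the table can be tried out with a few one-line checks. This is only a sketch: the sample strings and variable names are assumptions chosen to exercise each shortcut.

```javascript
// test() returns true when the pattern is found anywhere in the string.
const hasDigit = /\d/.test("Room 7");      // a digit is present
const hasWordChar = /\w/.test("!?_");      // the underscore counts as \w
const hasWhitespace = /\s/.test("a\tb");   // a tab is whitespace
const dotOnNewline = /./.test("\n");       // . does not match a new line

console.log(hasDigit, hasWordChar, hasWhitespace, dotOnNewline);
// true true true false
```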

In addition to these groupings, we can negate them by using the upper-case form of the letter, so \D means anything that is not a digit (likewise \W and \S). A ^ only negates when it is the first character inside square brackets, as in [^\d]. In some cases we may be looking to match a set number of occurrences of a pattern. For example, for the phone code used in the first example we need to match 5 digits, so we could use:

/\d\d\d\d\d/
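A short sketch of the five-digit pattern and of negation, using made-up sample strings. In JavaScript the negated form of \d is written \D, or [^\d] inside square brackets.

```javascript
const fiveDigits = /\d\d\d\d\d/;

// Hypothetical inputs: one with a five-digit code, one with only four digits.
const foundCode = fiveDigits.test("Call 01234 now");  // true
const tooShort = fiveDigits.test("Call 0123 now");    // false

// \D matches any single character that is not a digit.
const nonDigitInDigits = /\D/.test("12345");     // false: every character is a digit
const nonDigitInMixed = /\D/.test("123a5");      // true: the "a" is not a digit
const sameViaBrackets = /[^\d]/.test("123a5");   // true: equivalent to \D

console.log(foundCode, tooShort, nonDigitInDigits, nonDigitInMixed, sameViaBrackets);
// true false false true true
```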

But this quickly becomes unwieldy for long or complex patterns, and we may not know exactly how many digits we need to match. For example, to validate a percentage value we could have a value of 6, 17 or even 100 percent. This means we are looking for a pattern which has one, two or three digits, so a fixed run of \d's won't work. Luckily, Regular Expressions provide repetition operators that we can use to overcome this problem:

Regular Expression Description
{x} Matches exactly x occurrences of the pattern.
{x,y} Matches at least x and not more than y occurrences of the pattern.
{x,} Matches at least x occurrences of the pattern, with no upper limit.
? Matches zero or one occurrence of the pattern.
+ Matches one or more occurrences of the pattern.
* Matches zero or more occurrences of the pattern.

Thus our Regular Expression for percentages becomes:

/\d{1,3}/
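The percentage pattern can be checked against a few sample values, as sketched below (the inputs are illustrative assumptions). Note that, as written, the pattern only asks for one to three digits somewhere in the string: it does not cap the value at 100, and it will also find three digits inside a longer number.

```javascript
const percent = /\d{1,3}/;

// One, two and three digit values all match; a string with no digits does not.
const checks = ["6", "17", "100", "abc"].map((value) => percent.test(value));
console.log(checks); // [ true, true, true, false ]

// The pattern is satisfied by the first three digits of a longer number.
const partial = "12345".match(percent)[0];
console.log(partial); // 123
```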

Suppose we are looking for a code where the first three characters are any combination of the letters a, b, c, x, y or z, followed by three digits. The last part is easy as we can use \d{3}, but we can't use \w for the first part as we don't want a d or j in there. Regular Expressions also allow us to specify a custom set of characters between square brackets, [ and ]. The characters are listed without separators, because a comma inside the brackets would itself be treated as a character to match, so we can do the following:

/[abcxyz]{3}\d{3}/

We can use a range operator to simplify the previous pattern, so we could have used:

/[a-cx-z]{3}\d{3}/
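A brief sketch comparing the listed and ranged forms of the custom set, with the letters written without separators. The sample codes are invented for illustration; the last check shows why a comma should not be used as a separator inside the brackets.

```javascript
const codeByList = /[abcxyz]{3}\d{3}/;
const codeByRange = /[a-cx-z]{3}\d{3}/;

// Both forms accept the same codes.
const listAccepts = codeByList.test("abz123");    // true
const rangeAccepts = codeByRange.test("abz123");  // true

// "d" is outside the set, so this code is rejected.
const rangeRejects = codeByRange.test("abd123");  // false

// A comma inside [ ] is an ordinary character, so it would be matched too.
const commaIsLiteral = /[a,b]{3}/.test("a,b");    // true

console.log(listAccepts, rangeAccepts, rangeRejects, commaIsLiteral);
// true true false true
```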

There are a few other constructs which can be used to ensure we match the pattern we are looking for, and only that pattern. The common ones are:

Regular Expression Description
/^ Start of the string or line.
$/ End of the string or line.
/g Placed after the closing /. Go through the entire string, don't stop at the first match.
\<special character> Matches a special character such as newline (\n) or tab (\t), or a character that is part of the syntax (\{ looks for a { in the string).
/i Placed after the closing /. Case insensitive.
\b Start or end of a word. Allows you to match only complete words, such as cat and not cattle.
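The entries in this table can be exercised one at a time, as in the sketch below. The sample strings are assumptions made for illustration; g and i are written as flags after the closing /.

```javascript
// ^ and $ tie the pattern to the start and end of the string.
const startsWithCat = /^cat/.test("cat flap");   // true
const tomcatStart = /^cat/.test("tomcat");       // false
const endsWithCat = /cat$/.test("tomcat");       // true

// Without g only the first match is replaced; with g every match is.
const firstOnly = "one cat, two cat".replace(/cat/, "dog");
const everyMatch = "one cat, two cat".replace(/cat/g, "dog");

// The i flag ignores case.
const anyCase = /cat/i.test("CAT");              // true

// \b restricts the match to whole words.
const wholeWord = /\bcat\b/.test("the cat sat"); // true
const insideWord = /\bcat\b/.test("cattle");     // false

// A backslash lets us look for a character that is part of the syntax.
const literalBrace = /\{/.test("a{b");           // true

console.log(firstOnly);  // one dog, two cat
console.log(everyMatch); // one dog, two dog
```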

We can now combine these to produce more complex Regular Expressions, for example:

Regular Expression Description
/^\d{3}[P]$/ The string must be 3 digits followed by a capital P.
/^\d{3}[P]$/i The string must be 3 digits followed by a P in either case.
/Java/gi Matches every occurrence of Java in the string, in any combination of upper and lower case.
/Java\b/gi Matches every occurrence of Java, in any case, that ends at a word boundary, so it will not match the Java in JavaScript. It still matches the Java in Java's, because an apostrophe is not a word character.
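The four patterns in the table can be tried against some sample text, as sketched here. The strings are invented for illustration; match() with the g flag returns an array of every match found.

```javascript
const strictP = /^\d{3}[P]$/.test("123P");    // true
const lowerP = /^\d{3}[P]$/.test("123p");     // false: capital P required
const eitherP = /^\d{3}[P]$/i.test("123p");   // true: i ignores case

const sample = "Java, JAVA and JavaScript";

// Every occurrence, whatever the case, including the start of JavaScript.
const allJava = sample.match(/Java/gi);
console.log(allJava); // [ 'Java', 'JAVA', 'Java' ]

// \b requires a word boundary after Java, which rules out JavaScript.
const wholeJava = sample.match(/Java\b/gi);
console.log(wholeJava); // [ 'Java', 'JAVA' ]

// An apostrophe is not a word character, so a boundary exists before it.
const possessive = "Java's".match(/Java\b/gi);
console.log(possessive); // [ 'Java' ]
```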

What happens if our pattern can begin with either XY or AB followed by three digits, or we want to match win or wind but not windows?

Well, in the first case we can use the OR operator, |, to search for XY or AB, followed by a standard \d{3} to find the rest. In the second case we can use the ? operator on the d so that we match zero or one of them, although we could equally use {0,1}. We can also use round brackets, ( and ), to separate out the sections of our Regular Expression. Therefore our Regular Expressions are:

/^(([X][Y])|([A][B]))\d{3}$/

and

/^win(d)?$/
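Both patterns can be checked with a handful of sample inputs, as in this sketch (the inputs and variable names are assumptions for illustration). Because each pattern is anchored with ^ and $, the whole string has to fit.

```javascript
const prefixCode = /^(([X][Y])|([A][B]))\d{3}$/;

const prefixResults = ["XY123", "AB987", "XB123", "XY12"].map((s) => prefixCode.test(s));
console.log(prefixResults); // [ true, true, false, false ]

const winOrWind = /^win(d)?$/;

const winResults = ["win", "wind", "windows"].map((s) => winOrWind.test(s));
console.log(winResults); // [ true, true, false ]

// A shorter way to write the first pattern, which should behave the same:
const shorterPrefix = /^(XY|AB)\d{3}$/;
console.log(shorterPrefix.test("AB987")); // true
```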

Now that you have some idea of how RegExes are created, it's time to test your knowledge in the next section to see how much you have learnt.