Input validation with cfform controls

The cfinput and cftextinput tags include the validate attribute, which lets you specify a valid data entry type for the control. You can validate user entries against the following data types:

Data type	Description
Date	Verifies US date entry in the form mm/dd/yyyy (where the year can have one through four digits).
Eurodate	Verifies valid European date entry in the form dd/mm/yyyy (where the year can have one through four digits).
Time	Verifies a time entry in the form hh:mm:ss.
Float	Verifies a floating point entry.
Integer	Verifies an integer entry.
Telephone	Verifies a telephone entry. You must enter telephone data as ###-###-####. You can replace the hyphen separator (-) with a blank. The area code and exchange must begin with a digit between 1 and 9.
Zipcode	(U.S. formats only) Number can be a five-digit or nine-digit zip in the form #####-####. You can replace the hyphen separator (-) with a blank.
Creditcard	Blanks and dashes are stripped and the number is verified using the mod10 algorithm.
Social_security_number	You must enter the number as ###-##-####. You can replace the hyphen separator (-) with a blank.
Regular_expression	Matches the input against a JavaScript regular expression pattern. You must use the pattern attribute to specify the regular expression. Any entry that contains characters that match the pattern is valid.

When you specify an input type in the validate attribute, ColdFusion tests the entry against that type when the user submits the form, and submits the form data only on a successful match. Validation returns the value True on a successful match and the value False if validation fails.
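
The mod10 check named in the Creditcard row can be sketched as follows. This is a minimal illustration of the standard mod10 (Luhn) algorithm, not ColdFusion's own implementation; the function name passesMod10 is hypothetical.

```javascript
// Hypothetical helper: a sketch of the mod10 (Luhn) check, not ColdFusion's code.
function passesMod10(entry) {
  // Strip blanks and dashes, as the Creditcard description states.
  const digits = entry.replace(/[ -]/g, "");
  if (!/^[0-9]+$/.test(digits)) {
    return false; // empty, or something other than digits remains
  }
  let sum = 0;
  // Walk from the rightmost digit and double every second digit.
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9; // same as adding the two digits of the product
    }
    sum += d;
  }
  // The entry passes when the total is a multiple of 10.
  return sum % 10 === 0;
}

console.log(passesMod10("4111 1111 1111 1111")); // true
console.log(passesMod10("4111-1111-1111-1112")); // false
```

The check only confirms that the digits are internally consistent; it does not show that the card number was ever issued.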

Validating with regular expressions

You can use regular expressions to match and validate the text that users enter in cfinput and cftextinput tags. Ordinary characters are combined with special characters to define the match pattern. The validation succeeds only if the user input matches the pattern.

Regular expressions allow you to check input text for a wide variety of conditions. For example, if a date field must only contain dates between 1950 and 2050, you can create a regular expression that matches only numbers in that range. You can concatenate simple regular expressions into complex search criteria to validate against complex patterns, such as any of several words with different endings.
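
As a sketch of the year-range case mentioned above, the following pattern accepts only a four-digit year from 1950 through 2050. It assumes that the field holds a year alone; the field name in the comment is hypothetical.

```javascript
// A pattern of this kind could be supplied in the pattern attribute, for example:
//   <cfinput type="text" name="birthYear" validate="regular_expression"
//            pattern="^(19[5-9][0-9]|20[0-4][0-9]|2050)$">
// (birthYear is a hypothetical field name.)
const yearInRange = /^(19[5-9][0-9]|20[0-4][0-9]|2050)$/;

console.log(yearInRange.test("1950")); // true
console.log(yearInRange.test("2050")); // true
console.log(yearInRange.test("1949")); // false
console.log(yearInRange.test("2051")); // false
```

The three alternatives cover 1950 through 1999, 2000 through 2049, and 2050. The leading caret and trailing dollar sign keep longer entries such as 19500 from passing.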

You can use ColdFusion variables and functions in regular expressions. The ColdFusion Server evaluates the variables and functions before the regular expression is evaluated. For example, you can validate against a value that you generate dynamically from other input data or database values.

Note: The rules listed in this section are for JavaScript regular expressions, and apply to the regular expressions used in cfinput and cftextinput tags only. These rules differ from those used by the ColdFusion functions REFind, REReplace, REFindNoCase, and REReplaceNoCase. For information on regular expressions used in ColdFusion functions, see Chapter 7, "Using Regular Expressions in Functions".

Special characters

Because special characters are the operators in regular expressions, you must precede a special character with a backslash to represent it as an ordinary character. For example, use two backslash characters (\\) to represent a single backslash character.
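
The following sketch shows the effect of escaping in JavaScript. The variable names and sample strings are illustrative only.

```javascript
const literalDot = /^3\.14$/;     // \. matches only a period
const anyChar = /^3.14$/;         // an unescaped period matches any character
const oneBackslash = /^C:\\temp$/; // \\ matches a single backslash

console.log(literalDot.test("3.14")); // true
console.log(literalDot.test("3x14")); // false
console.log(anyChar.test("3x14"));    // true
// In a JavaScript string literal, "C:\\temp" is the text C:\temp.
console.log(oneBackslash.test("C:\\temp")); // true
```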

Single-character regular expressions

The following rules govern regular expressions that match a single character:

Special characters are: + * ? . [ ^ $ ( ) { | \
Any character that is not a special character or escaped by being preceded by the backslash (\) matches itself.
A backslash (\) followed by any special character matches the literal character itself, that is, the backslash escapes the special character.
A period (.) matches any character except newline.
A set of characters enclosed in brackets ([]) is a one-character regular expression that matches any of the characters in that set. For example, "[akm]" matches an "a", "k", or "m". If you include ] (closing square bracket) in square brackets, it must be the first character. Otherwise, it does not work, even if you use \].
A dash can indicate a range of characters. For example, "[a-z]" matches any lowercase letter.
If the first character of a set of characters in brackets is the caret (^), the expression matches any character except those in the set. It does not match the empty string. For example, [^akm] matches any character except "a", "k", or "m". The caret loses its special meaning if it is not the first character of the set.
You can make regular expressions case insensitive by substituting individual characters with character sets, for example, [Nn][Ii][Cc][Kk].
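
The character-set rules above can be tried directly in JavaScript. This is a small sketch; the variable names are illustrative.

```javascript
const oneOf = /[akm]/;           // any one of a, k, or m
const lowercase = /[a-z]/;       // a dash gives a range
const noneOf = /[^akm]/;         // a leading caret negates the set
const caretInSet = /[a^]/;       // a caret that is not first is an ordinary character
const nick = /[Nn][Ii][Cc][Kk]/; // case insensitivity built from character sets

console.log(oneOf.test("k"));      // true
console.log(oneOf.test("b"));      // false
console.log(noneOf.test("b"));     // true
console.log(noneOf.test(""));      // false: a negated set still needs one character
console.log(caretInSet.test("^")); // true
console.log(nick.test("NiCk"));    // true
```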

You can use the following escape sequences to match specific characters or character classes:


Escape seq	Matches	Escape seq	Matches
[\b]	Backspace	\s	Any of the following white space characters: space, tab, form feed, and line feed.
\b	A word boundary such as a space	\S	Any character except the white space characters matched by \s
\B	A non-word boundary	\t	Tab
\cX	The control character Ctrl-x. For example, \cv matches Ctrl-v, the usual control character for pasting text.	\v	Vertical tab
\d	A digit character [0-9]	\w	An alphanumeric character or underscore. The equivalent of [A-Za-z0-9_]
\D	Any character except a digit	\W	Any character not matched by \w. The equivalent of [^A-Za-z0-9_]
\f	Form feed	\n	Where n is a digit: backreference to the nth expression in parentheses. See "Backreferences"
\n	Line feed	\ooctal	The character represented in the ASCII character table by the specified octal number
\r	Carriage return	\xhex	The character represented in the ASCII character table by the specified hexadecimal number
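
A few of the escape sequences in the table, sketched in JavaScript. The variable names and sample strings are illustrative.

```javascript
const digitsOnly = /^\d+$/;  // \d is a digit character
const wordChars = /^\w+$/;   // \w is a letter, digit, or underscore
const hasSpace = /\s/;       // \s is a white space character
const wholeWord = /\bcat\b/; // \b is a word boundary
const backspace = /[\b]/;    // inside brackets, \b is the backspace character
const hexA = /\x41/;         // hexadecimal 41 is "A"

console.log(digitsOnly.test("2024"));        // true
console.log(wordChars.test("user-name"));    // false: the hyphen is not matched by \w
console.log(hasSpace.test("a\tb"));          // true: a tab is white space
console.log(wholeWord.test("a cat sat"));    // true
console.log(wholeWord.test("concatenate"));  // false
console.log(hexA.test("A"));                 // true
```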

Multicharacter regular expressions

Use the following rules to build a multicharacter regular expression:

Parentheses group parts of regular expressions together into a subexpression that can be treated as a single unit. For example, (ha)+ matches one or more instances of "ha".
A one-character regular expression or grouped subexpression followed by an asterisk (*) matches zero or more occurrences of the regular expression. For example, [a-z]* matches zero or more lowercase characters.
A one-character regular expression or grouped subexpression followed by a plus (+) matches one or more occurrences of the regular expression. For example, [a-z]+ matches one or more lowercase characters.
A one-character regular expression or grouped subexpression followed by a question mark (?) matches zero or one occurrences of the regular expression. For example, xy?z matches either "xyz" or "xz".
The caret (^) at the beginning of a regular expression matches the beginning of the field.
The dollar sign ($) at the end of a regular expression matches the end of the field.
The concatenation of regular expressions creates a regular expression that matches the corresponding concatenation of strings. For example, [A-Z][a-z]* matches any capitalized word.
The OR character (|) allows a choice between two regular expressions. For example, jell(y|ies) matches either "jelly" or "jellies".
Braces ({}) indicate a range of occurrences of a regular expression, in the form {m,n}, where m is an integer equal to or greater than zero that indicates the start of the range and n is an integer equal to or greater than m that indicates the end of the range. For example, (ba){0,3} matches up to three consecutive occurrences of "ba". The form {m,} requires at least m occurrences of the preceding regular expression. The form {m} requires exactly m occurrences of the preceding regular expression. The syntax {,n} is not allowed.
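
The rules above can be exercised with the document's own example patterns. In this sketch each pattern is wrapped in ^ and $ so that the test covers the whole string; the object name rules is illustrative.

```javascript
const rules = {
  group: /^(ha)+$/,          // one or more "ha"
  star: /^[a-z]*$/,          // zero or more lowercase characters
  plus: /^[a-z]+$/,          // one or more lowercase characters
  optional: /^xy?z$/,        // "xz" or "xyz"
  capitalized: /^[A-Z][a-z]*$/, // concatenation: a capitalized word
  either: /^jell(y|ies)$/,   // "jelly" or "jellies"
  upToThree: /^(ba){0,3}$/,  // zero to three "ba"
};

console.log(rules.group.test("hahaha"));       // true
console.log(rules.star.test(""));              // true
console.log(rules.plus.test(""));              // false
console.log(rules.optional.test("xyyz"));      // false
console.log(rules.either.test("jellies"));     // true
console.log(rules.upToThree.test("babababa")); // false: four occurrences
```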

Backreferences

Backreferencing lets you match text in previously matched sets of parentheses. A backslash followed by a digit n (\n) refers to the nth parenthesized subexpression.

One example of how you can use backreferencing is searching for doubled words; for example, to find instances of 'the the' or 'is is' in text. The following example shows the syntax you use for backreferencing in regular expressions:

(\b[A-Za-z]+)[ ]+\1

This code matches text that contains a word (specified by the \b word boundary special character and the [A-Za-z]+) followed by one or more spaces [ ]+, followed by the first matched subexpression in parentheses. For example, it would match "is is" or "This is is", but not "This is".
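
The doubled-word pattern can be checked in JavaScript as follows. This is a small sketch; the variable name doubledWord is illustrative.

```javascript
const doubledWord = /(\b[A-Za-z]+)[ ]+\1/;

console.log(doubledWord.test("is is"));      // true
console.log(doubledWord.test("This is is")); // true
console.log(doubledWord.test("This is"));    // false

// exec exposes the text captured by the first parenthesized subexpression.
const match = doubledWord.exec("This is is");
console.log(match[1]); // "is"
```

"This is" does not match because the "is" inside "This" does not start at a word boundary.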

Exact and partial matches

Entered data is normally valid if any part of it matches the regular expression pattern. Often you want to ensure that the entire entry matches the pattern. To do so, you must "anchor" the pattern to the beginning and end of the field as follows:

If a caret (^) is at the beginning of a pattern, the field must begin with a string that matches the pattern.
If a dollar sign ($) is at the end of a pattern, the field must end with a string that matches the pattern.
If the expression starts with a caret and ends with a dollar sign, the field must exactly match the pattern.
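
The difference between partial and anchored matching, sketched with a five-digit pattern. The sample strings are illustrative.

```javascript
const partial = /[0-9]{5}/;     // valid if five digits appear anywhere
const startsWith = /^[0-9]{5}/; // the field must begin with five digits
const endsWith = /[0-9]{5}$/;   // the field must end with five digits
const exact = /^[0-9]{5}$/;     // the field must be exactly five digits

console.log(partial.test("zip 02139 ok"));  // true
console.log(startsWith.test("zip 02139"));  // false
console.log(endsWith.test("zip 02139"));    // true
console.log(exact.test("02139"));           // true
console.log(exact.test("021390"));          // false
console.log(partial.test("021390"));        // true: the first five digits match
```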

Expression examples

The following examples show some regular expressions and describe what they match:

Expression	Description
[\?&]value=	Any string containing a URL parameter named value.
^[A-Z]:(\\[A-Z0-9_]+)+$	An uppercase DOS/Windows directory path that is not the root of a drive and has only letters, numbers, and underscores in its text.
^(\+|-)?[1-9][0-9]*$	An integer that does not begin with a zero and has an optional sign.
^(\+|-)?[1-9][0-9]*(\.[0-9]*)?$	A real number.
^(\+|-)?[1-9]\.[0-9]*E(\+|-)?[0-9]+$	A real number in engineering notation.
a{2,4}	A string containing two to four consecutive occurrences of 'a': aa, aaa, aaaa; for example, aardvark, but not automatic.
(ba){2,}	A string containing at least two consecutive 'ba' pairs; for example, Ali baba, but not Ali Baba.
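
The example expressions can be tried in JavaScript as a quick check. In this sketch the property names and sample strings are illustrative, not part of the original examples.

```javascript
const examples = {
  urlParam: /[\?&]value=/,
  dosPath: /^[A-Z]:(\\[A-Z0-9_]+)+$/,
  integer: /^(\+|-)?[1-9][0-9]*$/,
  real: /^(\+|-)?[1-9][0-9]*(\.[0-9]*)?$/,
  eNotation: /^(\+|-)?[1-9]\.[0-9]*E(\+|-)?[0-9]+$/,
  twoToFourA: /a{2,4}/,
  twoBa: /(ba){2,}/,
};

console.log(examples.urlParam.test("page.cfm?value=3"));   // true
console.log(examples.dosPath.test("C:\\DOCS\\MY_FILES"));  // true
console.log(examples.dosPath.test("C:\\"));                // false: root of a drive
console.log(examples.integer.test("-42"));                 // true
console.log(examples.integer.test("042"));                 // false
console.log(examples.real.test("3.14"));                   // true
console.log(examples.eNotation.test("1.5E-10"));           // true
console.log(examples.twoToFourA.test("aardvark"));         // true
console.log(examples.twoToFourA.test("automatic"));        // false
console.log(examples.twoBa.test("Ali baba"));              // true
console.log(examples.twoBa.test("Ali Baba"));              // false
```

Because the integer and real number patterns require a first digit from 1 through 9, they also reject entries such as 0 and 0.5.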

Resources

An excellent reference on regular expressions is Mastering Regular Expressions by Jeffrey E.F. Friedl, published by O'Reilly & Associates, Inc.
