This section describes the syntax that should be used to construct regular expressions for nete:rule elements. A nete:xprcond element takes the following form:
<nete:xprcond>
  <nete:xpr>
    <nete:rule>regular_expression</nete:rule>
    <nete:result>result</nete:result>
  </nete:xpr>
  <nete:xpr-default>forward_destination</nete:xpr-default>
</nete:xprcond>
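To make the structure above more concrete, the following Python sketch models how such an element could be evaluated. It is only an illustration: this section does not say what string a rule is matched against, in what order rules are tried, or whether a match must cover the whole string. The function name `evaluate_xprcond`, the sample patterns, and the result values are all hypothetical, and the sketch assumes that rules are tried in order, that a rule may match anywhere in the input, and that the first matching rule wins.

```python
import re
from typing import List, Tuple


def evaluate_xprcond(value: str, rules: List[Tuple[str, str]], default: str) -> str:
    """Hypothetical model of a nete:xprcond element.

    Each (pattern, result) pair stands for one nete:xpr element with its
    nete:rule and nete:result. Assumption: pairs are tried in order and the
    first pattern found anywhere in `value` selects its result. If no pattern
    matches, the nete:xpr-default value is returned.
    """
    for pattern, result in rules:
        if re.search(pattern, value):
            return result
    return default


# Hypothetical rules and destinations, invented for illustration only.
rules = [
    (r"^[A-Z]{2}\d{4}$", "/employee.jsp"),  # two capitals followed by four digits
    (r"^guest", "/guest.jsp"),              # anything that starts with "guest"
]

for candidate in ("AB1234", "guest42", "nobody"):
    print(candidate, "->", evaluate_xprcond(candidate, rules, "/default.jsp"))
```

The patterns use only constructs listed in this section (`^`, `$`, `[A-Z]`, `{n}`, `\d`), so they should carry over to a nete:rule element unchanged, although that has not been verified against the product.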
In the nete:xpr element, the nete:rule element must contain a regular expression that uses the syntax described in the following table. This syntax is consistent with the regular expression syntax supported by Apache, as described at http://www.apache.org.

| Characters | Results |
|---|---|
| unicode character | Matches any identical Unicode character |
| \ | Quotes a meta-character (such as '*') |
| \\ | Matches a single '\' character |
| \0nnn | Matches a given octal character |
| \xhh | Matches a given 8-bit hexadecimal character |
| \uhhhh | Matches a given 16-bit hexadecimal character |
| \t | Matches an ASCII tab character |
| \n | Matches an ASCII newline character |
| \r | Matches an ASCII return character |
| \f | Matches an ASCII form feed character |
| [abc] | Simple character class |
| [a-zA-Z] | Character class with ranges |
| [^abc] | Negated character class |
| [:alnum:] | Alphanumeric characters |
| [:alpha:] | Alphabetic characters |
| [:blank:] | Space and tab characters |
| [:cntrl:] | Control characters |
| [:digit:] | Numeric characters |
| [:graph:] | Characters that are printable and also visible (a space is printable but not visible, while an 'a' is both) |
| [:lower:] | Lower-case alphabetic characters |
| [:print:] | Printable characters (characters that are not control characters) |
| [:punct:] | Punctuation characters (characters that are not letters, digits, control characters, or space characters) |
| [:space:] | Space characters (such as space, tab, and form feed) |
| [:upper:] | Upper-case alphabetic characters |
| [:xdigit:] | Characters that are hexadecimal digits |
| [:javastart:] | Start of a Java identifier |
| [:javapart:] | Part of a Java identifier |
| . | Matches any character other than newline |
| \w | Matches a "word" character (alphanumeric plus "_") |
| \W | Matches a non-word character |
| \s | Matches a whitespace character |
| \S | Matches a non-whitespace character |
| \d | Matches a digit character |
| \D | Matches a non-digit character |
| ^ | Matches only at the beginning of a line |
| $ | Matches only at the end of a line |
| \b | Matches only at a word boundary |
| \B | Matches only at a non-word boundary |
| A* | Matches A 0 or more times (greedy) |
| A+ | Matches A 1 or more times (greedy) |
| A? | Matches A 1 or 0 times (greedy) |
| A{n} | Matches A exactly n times (greedy) |
| A{n,} | Matches A at least n times (greedy) |
| A{n,m} | Matches A at least n but not more than m times (greedy) |
| A*? | Matches A 0 or more times (reluctant) |
| A+? | Matches A 1 or more times (reluctant) |
| A?? | Matches A 0 or 1 times (reluctant) |
| AB | Matches A followed by B |
| A\|B | Matches either A or B |
| (A) | Used for subexpression grouping |
| \1 | Backreference to 1st parenthesized subexpression |
| \n | Backreference to nth parenthesized subexpression (here n stands for a digit, unlike the newline escape \n listed earlier) |

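A few of the constructs in the table can be tried out quickly in Python. This is only a sketch: Python's `re` module is a stand-in, not the engine the product uses, and it is assumed to behave the same way for the constructs shown. Python does not accept the bracketed class names such as [:alnum:] or [:javastart:], nor the \0nnn octal form, so those rows are left out. The sample strings and the `samples` dictionary are invented for illustration.

```python
import re

# Each entry exercises one row of the syntax table on an invented sample string.
samples = {
    # \xhh: 0x41 is the letter 'A'
    "hex_escape": re.findall(r"\x41", "ABA"),
    # [a-c]: character class with a range
    "range_class": re.findall(r"[a-c]", "abcdef"),
    # [^abc]: negated character class
    "negated_class": re.findall(r"[^abc]", "abcd"),
    # \d+: one or more digit characters
    "digits": re.findall(r"\d+", "id 42, zone 7"),
    # \w+: runs of word characters (alphanumeric plus "_")
    "words": re.findall(r"\w+", "user_1 logged-in"),
    # \b: word boundary, so "cat" is found only as a whole word
    "whole_word": [bool(re.search(r"\bcat\b", s)) for s in ("a cat sat", "concatenate")],
    # ^ and $ anchors around an alternation inside a group
    "anchored_alt": [bool(re.search(r"^(GET|POST)$", s)) for s in ("POST", "POSTED")],
    # \1: backreference, a word character followed by the same character again
    "doubled": re.search(r"(\w)\1", "letter").group(0),
}

for name, value in samples.items():
    print(f"{name}: {value}")
```

Raw string literals (`r"..."`) keep Python from interpreting the backslashes itself, so the pattern reaches the regular expression engine exactly as written in the table.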
All closure operators (+, *, ?, {m,n}) are greedy by default, meaning that they match as many elements of the string as possible without causing the overall match to fail. To make a closure reluctant (non-greedy), follow it with a '?'. A reluctant closure matches as few elements of the string as possible when finding matches. {m,n} closures do not currently support reluctant matching.
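The difference between greedy and reluctant closures is easiest to see side by side. The sketch below uses Python's `re` module as a stand-in, on the assumption that `*`, `+`, `?` and their reluctant forms behave the same way there as in the engine described here. Note that Python also accepts a reluctant `{m,n}?`, which this section says is not supported, so that form is deliberately not shown. The sample text and variable names are invented for illustration.

```python
import re

text = "<b>one</b><b>two</b>"

# Greedy: .* grabs as much as it can, so one match spans both elements.
greedy = re.findall(r"<b>.*</b>", text)

# Reluctant: .*? stops at the first closing tag, so each element matches on its own.
reluctant = re.findall(r"<b>.*?</b>", text)

# The same contrast for + and ? on a run of identical characters.
plus_greedy = re.search(r"a+", "aaa").group(0)
plus_reluctant = re.search(r"a+?", "aaa").group(0)
optional_greedy = re.search(r"a?", "aaa").group(0)
optional_reluctant = re.search(r"a??", "aaa").group(0)

print(greedy)
print(reluctant)
print(repr(plus_greedy), repr(plus_reluctant))
print(repr(optional_greedy), repr(optional_reluctant))
```

In both modes the overall match must still succeed: a reluctant closure takes more characters whenever taking fewer would make the rest of the pattern fail.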
Copyright © 2012 CA. All rights reserved.