
Regular Expressions

Several mapping operators can use regular expressions to search for and, in some cases, replace text. The regular expression (“regex”) syntax is based on Perl 5.8. The following operators can make use of regex:

Object

AttributeAcronymIsLike

AttributeValueIsLike

ObjectAcronymIsLike

ReplaceAttributeValue

This is an example of a regular expression used in conjunction with one of these tags:

<AttributeValueIsLike Acronym="TXTDSC" Value="[A-Z0-9]{2}[A-Z0-9_]{6}\.TXT" MatchCase="True" />
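A pattern like this can be checked outside CARIS before it goes into a mapping file. The following is a minimal sketch in Python, whose re module follows Perl-style syntax for the constructs used here. The helper name value_is_like and the sample values are hypothetical. The sketch also assumes that the whole attribute value must match the pattern and that MatchCase="True" corresponds to case-sensitive matching; neither is confirmed here for the CARIS engine.

```python
import re

# Pattern copied from the AttributeValueIsLike example above.
PATTERN = r"[A-Z0-9]{2}[A-Z0-9_]{6}\.TXT"


def value_is_like(value: str, pattern: str = PATTERN, match_case: bool = True) -> bool:
    """Return True if the whole value matches the pattern.

    Assumption: the operator compares against the entire attribute value,
    and match_case=False means case-insensitive comparison.
    """
    flags = 0 if match_case else re.IGNORECASE
    return re.fullmatch(pattern, value, flags) is not None


# Hypothetical sample values, not taken from any real dataset.
for sample in ["CA123456.TXT", "CA_NOTES.TXT", "ca123456.txt", "CA12345.TXT", "CA123456XTXT"]:
    print(sample, value_is_like(sample))
```

The last sample fails because the escaped "\." only matches a literal period; an unescaped "." would accept any character in that position.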

Note that the “$” character is a reserved metacharacter in regular expressions (it anchors a match to the end of the text). To match a literal “$”, escape it with a “\” as in the following example:

<Object Acronym="\$bycol">
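The difference the backslash makes can be illustrated with a short Python sketch. Python's re module treats "$" the same way Perl does for this purpose; the variable names are hypothetical, and whole-value matching is an assumption about how the operator compares acronyms.

```python
import re

acronym = "$bycol"  # the literal object acronym to be matched

# Unescaped: "$" is an end-of-text anchor, so nothing can follow it.
unescaped_matches = re.fullmatch("$bycol", acronym) is not None

# Escaped: "\$" matches a literal dollar sign.
escaped_matches = re.fullmatch(r"\$bycol", acronym) is not None

# re.escape() produces the escaped form of a literal string.
escaped_pattern = re.escape(acronym)

print(unescaped_matches)  # False
print(escaped_matches)    # True
print(escaped_pattern)    # \$bycol
```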

For more information about the Perl regular expression syntax, see http://perldoc.perl.org/perlre.html

These examples of regular expressions could be used to match attribute values:

Purpose

Regular Expression

Match any string

<AttributeValueIsLike Acronym="ABC" Value=".*" />

Match a string in the middle of a value

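<AttributeValueIsLike Acronym="ABC" Value=".*\bstring_to_match\b.*" />

For example, start string_to_match end will match. startstring_to_matchend will not match, because \b requires a word boundary on each side of the string.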

Match a string at the beginning of a value

<AttributeValueIsLike Acronym="ABC" Value="string_to_match\b.*" />

For example, start string_to_match will not match, but string_to_match end will match.

Match a string at the end of a value

<AttributeValueIsLike Acronym="ABC" Value=".*\bstring_to_match" />

For example, string_to_match end will not match, but start string_to_match will match.

Match either an upper-case or a lower-case first letter

<AttributeValueIsLike Acronym="ABC" Value="[Ss]kipper" />

This will match Skipper and skipper.

Match a sequence of 4 to 8 digits

<AttributeValueIsLike Acronym="DATEND" Value="[0-9]{4,8}" />

This will match 1234, 123456, or 87654321, but not 123 or 123456789.

Match two upper case characters

<AttributeValueIsLike Acronym="NATION" Value="[A-Z]{2}" MatchCase="True"/>

This will match CA, but not Ca or ca.
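Several of the patterns in this table can be tried out with a short Python sketch. Python's re module is used here only as a stand-in for the Perl-style syntax. The helper is_like is hypothetical, and the sketch assumes that the whole attribute value must match and that MatchCase maps to case-sensitive matching; the behaviour of the CARIS engine is not confirmed here.

```python
import re


def is_like(value: str, pattern: str, match_case: bool) -> bool:
    """Whole-value match; match_case=False ignores case (assumed semantics)."""
    flags = 0 if match_case else re.IGNORECASE
    return re.fullmatch(pattern, value, flags) is not None


# (pattern, match_case, sample value) triples; the samples are illustrative only.
cases = [
    (r".*", True, "any value at all"),
    (r"[Ss]kipper", True, "Skipper"),
    (r"[Ss]kipper", True, "skipper"),
    (r"[Ss]kipper", True, "SKIPPER"),
    (r"[0-9]{4,8}", True, "1234"),
    (r"[0-9]{4,8}", True, "123456"),     # {4,8} is a range: 4 to 8 repetitions
    (r"[0-9]{4,8}", True, "123"),
    (r"[A-Z]{2}", True, "CA"),
    (r"[A-Z]{2}", True, "Ca"),
]

for pattern, match_case, sample in cases:
    print(f"{pattern!r:16} {sample!r:20} {is_like(sample, pattern, match_case)}")
```

The quantifier {4,8} accepts any length from 4 to 8. To accept exactly 4 or exactly 8 digits, a pattern such as [0-9]{4}([0-9]{4})? could be used instead.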