Regular expressions were introduced in Operations. Each regular expression is declared with a unique name using the pattern keyword. Here is the declaration from query_in_words.taq
A hash '#' character is used as a regular expression operator and is followed by the name of a pattern. Here is the match operation which applies the "in_words" pattern to a template term named "word"
Before the match operation takes place, "word" is loaded with an "i" word dictionary definition. If there is no match, execution short-circuits instead of continuing on to produce a solution.
TAQ regular expressions are implemented using the Java Pattern class. Regular expression features are:
- Can be applied as either a term or criterion expression
- Announced with a distinctive '#' symbol
- Options are available, such as case-insensitive matching
- A cursor can provide input
- Group variables are configurable (type, default)
Criterion
When placed in a criterion, a regular expression acts like a boolean expression and supports grouping. group_in_words.taq demonstrates a criterion using a regular expression with grouping. The pattern to match on is named "defPattern" and uses grouping to extract the meaning of a word from a dictionary entry
definition #defPattern ( def )
Here is the program, which also declares a second regular expression, "in_word", to select only words beginning with "in"
pattern in_word "^in[^ ]+"
pattern defPattern "^[nvaj.]+ (.*+)"
flow in_words
{
string definition
(
term definition,
word # in_word,
? definition #defPattern ( def )
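Since TAQ regular expressions are built on the Java Pattern class, the effect of the two patterns above can be approximated in plain Java. This is a minimal sketch, not the TAQ implementation: the class name GroupSketch, the meaning helper, the sample dictionary entries, and the use of find() rather than matches() are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupSketch {
    // Same expressions as the in_word and defPattern declarations.
    static final Pattern IN_WORD = Pattern.compile("^in[^ ]+");
    static final Pattern DEF_PATTERN = Pattern.compile("^[nvaj.]+ (.*+)");

    // Returns the text captured by group 1, or empty when either pattern
    // fails to match (a rough analogue of the query short-circuiting).
    static Optional<String> meaning(String word, String definition) {
        if (!IN_WORD.matcher(word).find()) {
            return Optional.empty();
        }
        Matcher matcher = DEF_PATTERN.matcher(definition);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical entries: a part-of-speech code, a space, then the meaning.
        System.out.println(meaning("inadequate", "j. not sufficient to meet a need"));
        System.out.println(meaning("outcome", "n. result"));
    }
}
```

The group in parentheses plays the role of the "def" variable: it receives whatever follows the part-of-speech prefix.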
Cursor Input
A cursor can be used as an input to a regular expression. Because a cursor is used inside a loop, it must increment or decrement each time it provides input; otherwise the loop never terminates, as it repeatedly gets a pattern mismatch on the same value.
regex-pet-names.taq has a "pets" query which uses a cursor named "pet" to iterate through a list of pet details in XML format, extract the name of each pet and export the names in a list. A second "regex-reverse_pets" query operates the cursor in reverse. These are the regular expressions for the forward and reverse traversals respectively
? pet++ #petName ( name )
? pet-- #petName ( name )
Here is the pattern definition and flows
string petPattern = "^.*" + namePattern +".*"
pattern petName petPattern
flow pets
{ export list<string> pet_names }
(
{
? pet++ #petName ( name ),
pet_names += name
flow reverse_pets
{ export list<string> pet_names }
(
{
? pet-- #petName ( name ),
pet_names += name
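The forward and reverse traversals can be pictured in Java as an index that always moves, whether or not the pattern matches. This is only a sketch under stated assumptions: the class CursorSketch, the petNames helper, and the sample XML are hypothetical, and the name expression is a stand-in because namePattern is not shown in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CursorSketch {
    // Hypothetical stand-in for "^.*" + namePattern + ".*".
    static final Pattern PET_NAME = Pattern.compile("^.*<name>([^<]+)</name>.*");

    // Walks the list forwards or backwards and collects each matched name.
    static List<String> petNames(List<String> pets, boolean forward) {
        List<String> names = new ArrayList<>();
        int step = forward ? 1 : -1;
        // The index advances on every pass, match or mismatch, so the loop ends.
        for (int i = forward ? 0 : pets.size() - 1; i >= 0 && i < pets.size(); i += step) {
            Matcher matcher = PET_NAME.matcher(pets.get(i));
            if (matcher.find()) {
                names.add(matcher.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> pets = List.of(
            "<pet><species>dog</species><name>Lassie</name></pet>",
            "<pet><species>cat</species></pet>", // no name: a mismatch, skipped
            "<pet><species>dog</species><name>Axel</name></pet>");
        System.out.println(petNames(pets, true));
        System.out.println(petNames(pets, false));
    }
}
```

If the index were only advanced after a successful match, the entry without a name would be tested forever, which is the infinite loop described above.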
Options
A pattern declaration can include options. They are appended to the pattern declaration identifier as a comma-delimited list enclosed in parentheses. The options are the static flag constants of the Java Pattern class, but written in lower case. For example, "case_insensitive" is the option for Pattern.CASE_INSENSITIVE.
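One way to picture the naming rule is a lookup that upper-cases each option and reads the constant of that name from java.util.regex.Pattern. The sketch below is an illustration of the mapping only and makes no claim about how TAQ resolves options internally; the class OptionSketch and the flags helper are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class OptionSketch {
    // Combines the Pattern constants named by the given lower-case options.
    static int flags(String... options) {
        int flags = 0;
        for (String option : options) {
            try {
                // "case_insensitive" becomes the field name CASE_INSENSITIVE.
                String fieldName = option.toUpperCase(Locale.ROOT);
                flags |= Pattern.class.getField(fieldName).getInt(null);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Unknown pattern option: " + option, e);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("^in[^ ]+", flags("case_insensitive"));
        System.out.println(pattern.matcher("INFINITE").find());
    }
}
```

Because the Pattern flags are bit masks, several options combine with a bitwise OR, which matches the comma-delimited form of the declaration.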