Compiles and matches regular-expression patterns.
Programmers Workbench Library (libPW.a)
#include <libgen.h>
char *regcmp (String [, String, . . . ], (char *) 0)
const char *String, . . . ;

const char *regex (Pattern, Subject [, ret, . . . ])
char *Pattern, *Subject, *ret, . . . ;

extern char *__loc1;
The regcmp subroutine compiles a regular expression (or Pattern) and returns a pointer to the compiled form. The regcmp subroutine accepts one or more String parameters, terminated by a (char *) 0 argument. If more than one String parameter is given, the regcmp subroutine treats them as if they were concatenated. It returns a null pointer if it encounters an incorrect parameter.
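The calling convention described above, a list of strings closed by a (char *) 0 argument and treated as one concatenated pattern, can be sketched in standard C. The helper name join_strings is hypothetical and is not part of libPW; it only illustrates how such an argument list can be walked, and makes no claim about how regcmp is implemented internally.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libPW: joins a list of strings that is
 * terminated by (char *) 0, the same calling convention regcmp uses.
 * Returns a malloc'ed string the caller must free, or NULL on failure. */
char *join_strings(const char *first, ...)
{
    va_list ap;
    size_t len = 0;
    const char *s;
    char *out;

    /* First pass: add up the length of every piece. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        len += strlen(s);
    va_end(ap);

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    out[0] = '\0';

    /* Second pass: append each piece in argument order. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        strcat(out, s);
    va_end(ap);

    return out;
}
```

Under this sketch, join_strings("[a-z]", "+", (char *) 0) yields the single string [a-z]+, which is the pattern regcmp would see if it were given the same two String parameters.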
You can use the regcmp command to compile regular expressions into your C program, frequently eliminating the need to call the regcmp subroutine at run time.
The regex subroutine compares a compiled Pattern to the Subject string. Additional parameters are used to receive values. Upon successful completion, the regex subroutine returns a pointer to the next unmatched character. If the regex subroutine fails, a null pointer is returned. A global character pointer, __loc1, points to where the match began.
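The two positions that the regex subroutine reports (where the match began, and the next unmatched character) can be illustrated on systems without libPW by using the POSIX regcomp and regexec subroutines from <regex.h> instead. This is a swapped-in technique, not the regcmp and regex interface itself, and POSIX extended syntax differs from the regcmp syntax. The helper name find_match and its start and next parameters are hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper built on POSIX regcomp/regexec, NOT on regcmp/regex.
 * On success it returns 0 and sets:
 *   *start to where the match began (the role __loc1 plays for regex),
 *   *next  to the first character after the match (the role of the
 *          regex return value).
 * It returns -1 if the pattern does not compile or nothing matches. */
int find_match(const char *pattern, const char *subject,
               const char **start, const char **next)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    rc = regexec(&re, subject, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;

    /* rm_so and rm_eo are byte offsets into subject. */
    *start = subject + m[0].rm_so;
    *next = subject + m[0].rm_eo;
    return 0;
}
```

For the pattern [0-9]+ and the subject abc123def, this sketch sets start to the 1 and next to the d that follows the digits.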
The regcmp and regex subroutines are borrowed from the ed command; however, the syntax and semantics have been changed slightly. You can use the following symbols with the regcmp and regex subroutines:
All of the preceding defined symbols are special. You must precede them with a \ (backslash) if you want to match the special symbol itself. For example, \$ matches a dollar sign.
The regcmp subroutine produces code values that the regex subroutine can interpret as the regular expression. For instance, [a-z] indicates a range expression which the regcmp subroutine compiles into a string containing the two end points (a and z).
The regex subroutine interprets the range expression according to the current collating sequence. The expression [a-z] can be equivalent either to [abcd . . . xyz] or to [aBbCcDd . . . xXyYzZ], as long as the character preceding the minus sign has a lower collating value than the character following the minus sign.
Because the behavior of a range expression depends on the collating sequence, list each character explicitly when you want to match a specific set. For example, to select the letters a, b, or c, use [abc] rather than [a-c].
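The dependence of a range on the collating sequence can be sketched with the standard strcoll subroutine, which compares strings according to the LC_COLLATE category of the current locale. The helper name in_collating_range is hypothetical, and this is only an illustration for single-byte characters, not the way the regex subroutine is implemented.

```c
#include <string.h>

/* Hypothetical helper, not the libPW implementation: reports whether the
 * single-byte character c lies between lo and hi (inclusive) in the
 * collating sequence of the current locale, as compared by strcoll. */
int in_collating_range(char lo, char c, char hi)
{
    char a[2] = { lo, '\0' };
    char b[2] = { c, '\0' };
    char z[2] = { hi, '\0' };

    return strcoll(a, b) <= 0 && strcoll(b, z) <= 0;
}
```

In the default C locale, where characters collate by code value, in_collating_range('a', 'B', 'z') is false. After a call such as setlocale(LC_COLLATE, "") selects a locale that interleaves uppercase and lowercase letters, the same call may return true, which is why [abc] is safer than [a-c] for a fixed set of characters.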
Notes:
- No assumptions are made at compile time about the actual characters contained in the range.
- Do not use multibyte characters.
- You can use the ] (right bracket) itself within a pair of brackets if it immediately follows the leading [ (left bracket) or [^ (a left bracket followed immediately by a circumflex).
- You can also use the minus sign (or hyphen) if it is the first or last character in the expression. For example, the expression []-0] matches either the right bracket (]), or the characters - through 0.
A common use of the range expression is matching a character class. For example, [0-9] represents all digits, and [a-zA-Z] represents all letters. This form may produce unexpected results because ranges are interpreted according to the current collating sequence.
Instead of the range expression shown above, use a character class expression within brackets to match characters. The system interprets this type of expression according to the current character class definition. However, you cannot use character class expressions in range expressions.
A character class expression has the following syntax:
[:charclass:]
that is, a left bracket followed by a colon, followed by the name of the character class, followed by another colon and a right bracket.
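The idea behind a character class expression, testing a character against a named class as defined by the current locale rather than against a range, can be sketched with the standard <ctype.h> subroutines. The helper name in_char_class is hypothetical, and the class names in its table are only a few standard ctype classes chosen for illustration; the table is not the list of classes that these subroutines support.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tests c against a named character class by calling
 * the matching <ctype.h> subroutine, which consults the current locale.
 * Returns 1 on a match, 0 on no match, and -1 for an unknown class name. */
int in_char_class(const char *name, char c)
{
    /* Illustrative subset of class names; an assumption of this sketch. */
    static const struct {
        const char *name;
        int (*test)(int);
    } classes[] = {
        { "alpha", isalpha },
        { "digit", isdigit },
        { "upper", isupper },
        { "lower", islower },
        { "space", isspace },
        { "punct", ispunct },
        { "alnum", isalnum },
    };
    size_t i;

    for (i = 0; i < sizeof classes / sizeof classes[0]; i++) {
        if (strcmp(name, classes[i].name) == 0) {
            /* The ctype subroutines require an unsigned char value. */
            return classes[i].test((unsigned char) c) != 0;
        }
    }
    return -1;
}
```

With this sketch, in_char_class("digit", '7') returns 1 and in_char_class("alpha", '7') returns 0, regardless of how the locale orders digits and letters in its collating sequence.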
National Language Support supports the following character classes:
These subroutines are part of Base Operating System (BOS) Runtime.
The ctype subroutine; the compile, step, or advance subroutine; the malloc, free, realloc, calloc, mallopt, mallinfo, or alloca subroutine; the regcomp, regex subroutine.
The ed command, regcmp command.
Subroutines Overview in AIX Version 4.3 General Programming Concepts: Writing and Debugging Programs.