22.2 Access Classes in Rules

Class macros are useful only in the LHS of rules. The sendmail program offers two ways to use them:
22.2.1 Matching Any in a Class: $=

The list of words that form a class is searched by prefixing the class name with the characters $=:

    R$=X    $@<$1>

In this rule, the expression $=X causes sendmail to search a class for the word that is in the current workspace. If sendmail finds that the word has been defined, and if it finds that the word is associated with the class $=X, only then is a match made. The matching word is made available for use in the RHS rewriting. Because the value matched by $=X is not known ahead of time, the matched word is referenced in the RHS with the $digit positional operator.

Consider the following example. Two classes have been declared elsewhere in the configuration file. The first, $=w, contains all the possible names for the local host:

    Cw localhost mailhost server1 server2

The second, $=D, contains the domain names of the two different networks on which this host sits:

    CD internal.domain external.domain

If the object of a rule is to match any variation on the local hostname at either of the domains and to rewrite the result as the official hostname at the appropriate domain, the following rule can be used:

    R $=w . $=D    $@ $w . $2    make any variations "official"

If the workspace contains the tokenized address server1.external.domain, sendmail first checks to see whether the word server1 has been defined as part of the class w. If it has, the dot in the rule and the dot in the workspace match each other, and then sendmail looks up external.domain. If both the host part and the domain part are found to be members of their respective classes, the RHS of the rule is called to rewrite the workspace. The $2 in the RHS corresponds to the $=D in the LHS. The $=D matches the external.domain from the workspace, so that text is used to rewrite the new workspace.

Note that prior to V8, when sendmail looked up the workspace to check for a match to a class, it looked up only a single token. V8 sendmail allows multitoken class matching.
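The lookup described above can be modeled in a few lines of Python. This is only a toy sketch, not sendmail's code: the function names are hypothetical, the value "mailhost" for $w is assumed for illustration, and trying successively longer runs of tokens is an assumption about how multitoken class matching might proceed.

```python
CLASS_W = {"localhost", "mailhost", "server1", "server2"}   # Cw ...
CLASS_D = {"internal.domain", "external.domain"}            # CD ...


def class_match_ends(tokens, start, members):
    """Yield each end index where tokens[start:end], joined, is a class member.

    Shorter runs are tried first; comparison is case-insensitive.
    """
    lowered = {m.lower() for m in members}
    for end in range(start + 1, len(tokens) + 1):
        if "".join(tokens[start:end]).lower() in lowered:
            yield end


def rewrite_official(tokens, official="mailhost"):
    """Toy model of:  R $=w . $=D    $@ $w . $2

    'official' stands in for the value of $w (an assumed value).
    Returns the rewritten token list, or None if the LHS does not match.
    """
    for host_end in class_match_ends(tokens, 0, CLASS_W):
        # The literal dot in the rule must match a dot in the workspace.
        if host_end >= len(tokens) or tokens[host_end] != ".":
            continue
        for dom_end in class_match_ends(tokens, host_end + 1, CLASS_D):
            if dom_end == len(tokens):            # whole workspace consumed
                dollar2 = tokens[host_end + 1:dom_end]   # what $=D matched
                return [official, "."] + dollar2
    return None


workspace = ["server1", ".", "external", ".", "domain"]
print(rewrite_official(workspace))
```

A workspace whose host part is not in class w, or whose domain part is not in class D, returns None in this model, which corresponds to the LHS failing to match.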
22.2.2 Matching Any Token Not in a Class: $~

The $~ prefix is used to match any single token in the workspace that is not in a class. It is used fewer than a dozen times in a typical production configuration file, but when the need for its properties arises, it can be very useful.

To illustrate, consider a network with three PC machines on it. The PC machines cannot receive mail, whereas all the other machines on the network can. If the list of PC hostnames is defined in the class {PChosts}:

    C{PChosts} pc1 pc2 pc3

a rule can be designed that will match any but a PC hostname:

    R $* < @ $~{PChosts} >    $@ $1 < @ $2 >    filter out the PC hosts

Here the LHS looks for an address of the form:

    "user" "<" "@" "not-a-PC" ">"

This matches only if the @ token is not followed by one of the PC hosts listed in class $={PChosts}. If the part of the workspace that is tested against the list provided by $~ is found in that list, the match fails.

Note that the $digit positional operator in the RHS (the $2 in the preceding example) references the part that matches $~{PChosts}. That is, $2 references the token in the workspace that is not in the class {PChosts}. If the workspace contains ben<@philly>, the $2 references the philly.

Also note that multitoken expressions in the workspace will not match. That is, for multitoken expressions in the workspace, $~ is not the opposite of $=. To illustrate, consider this miniconfiguration file:

    V10
    CX hostA.com
    Stest
    R $~X    $@ no         $1 is not in X
    R $=X    $@ yes        $1 is in X
    R $*     $@ neither

Now feed a multitokened address through these rules in rule-testing mode:

    % /usr/sbin/sendmail -Cx.cf -bt
    ADDRESS TEST MODE (ruleset 3 NOT automatically invoked)
    Enter <ruleset> <address>
    > test hostC.com
    test               input: hostC . com
    test             returns: neither

Here, the rule set returned neither because a multitoken expression in the workspace should never be used with $~. At first glance the first rule should have matched: $~ looks for a workspace that is not a member of the class and, indeed, hostC.com is not.
But because hostC.com is multitokened, $~ acts as though it is a member of the class, and so does not call the RHS of the rule:

    R $~X        a multitokened workspace will never call the RHS

If you consider multitokens and $~ as illegal to use together, this failure, although convoluted, makes sense. Another way to think of this failure is by comparing the $~ operator to the $- operator. Neither will match more than a single token in the workspace. If the $~ does not match a single token, the LHS does not match, and the RHS is not called.

There are two ways to circumvent this problem. One alternative is to make the $~ always look up only a single token:

    R $~X $*    $@ no         $1 is not in X

Here, the $* will match the .com. Then $~X will correctly look up only the single token hostC, and correctly not find it.

A second alternative is to invert the logic of the test, and use the $= prefix only when multiple tokens are in the workspace:

    R $=X    $@ yes        $1 is in X
    R $*     $@ no         $1 is not in X

Here, we first check to see if the multitokened workspace is in the class $=X, and return yes if it is. Otherwise, we know it is not in the class.

22.2.3 Backup and Retry

Multitoken matching operators, such as $+, always try to match the least that they can (Section 18.6.2). Such a simple-minded approach could lead to problems in matching (or not matching) classes in the LHS. However, the ability of sendmail to back up and retry alleviates this problem. For example, consider the following five tokens in the workspace:

    "A" "." "B" "." "C"

and consider the following LHS rule:

    R $+ . $=X $*

Because the $+ tries to match the minimum, it first matches only the A in the workspace. The $=X then tries to match the B. and then B.C to the class $=X. If this match fails, sendmail backs up to the $+ and tries again. The next time through, the $+ matches A. in the workspace, but that fails to match the dot in the rule, so it backs up again and matches A.B. The $=X tries to match the C in the workspace.
If C is not in the class $=X, the entire LHS fails.

The ability of the sendmail program to back up and retry LHS matches eliminates much of the ambiguity from rule design. The multitoken matching operators try to match the minimum but match more if necessary for the whole LHS to match.

22.2.4 Class Name Hashing Algorithm

When comparing a token in the workspace to a list of words in a class array, sendmail tries to be as efficient as possible. Instead of comparing the token to each word in the list, one by one, it simply looks up the token in its internal string pool. If the token is in the pool and if the pool listing is marked as belonging to the class being sought, a match is found.

The comparison of tokens to entries in the string pool is case-insensitive. Each token is converted to lowercase before the comparison, and all strings in the string pool are stored in lowercase.

Because strings are stored in the pool as text with a type, the same string value can be used for different types with no conflict. For example, the symbolic name of a delivery agent and a word as a class macro's value can be identical, yet they will still be separate entries in the string pool.

The sendmail program uses a simple hashing algorithm to ensure that the token is compared to the fewest possible strings in the string pool. In normal circumstances that algorithm performs its job well. At sites with unusually large classes (perhaps a few thousand hosts in a class of host aliases), it might be necessary to tune the hashing algorithm. The code is in the file stab.c in the sendmail source. The number of hash buckets is set by the constant STABSIZE.

As an alternative to very full classes, sendmail offers database maps (Section 23.1). No information is currently available contrasting the efficiency of the various approaches.
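The typed, case-insensitive string pool described in Section 22.2.4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code in stab.c: the StringPool class, its method names, the hash function, and the small bucket count are all hypothetical and were chosen only to show the idea of hashing a lowercased token into a bucket and then checking the entry's type and class membership.

```python
STABSIZE = 7  # hypothetical bucket count for illustration; not sendmail's value


class StringPool:
    """Toy string pool: entries are keyed by (lowercased name, type)."""

    def __init__(self, nbuckets=STABSIZE):
        self.buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, key):
        # Simple illustrative hash; the real algorithm may differ.
        h = 0
        for ch in key:
            h = (h * 31 + ord(ch)) % len(self.buckets)
        return self.buckets[h]

    def enter(self, name, kind, class_name=None):
        """Find or create the entry for (name, kind); optionally tag a class."""
        key = name.lower()                     # stored in lowercase
        bucket = self._bucket(key)
        for entry in bucket:
            if entry["name"] == key and entry["kind"] == kind:
                break                          # reuse the existing entry
        else:
            entry = {"name": key, "kind": kind, "classes": set()}
            bucket.append(entry)
        if class_name is not None:
            entry["classes"].add(class_name)
        return entry

    def in_class(self, token, class_name):
        """True if token is in the pool as a class word marked for class_name."""
        key = token.lower()                    # lowercased before comparison
        for entry in self._bucket(key):        # only one bucket is scanned
            if entry["name"] == key and entry["kind"] == "class":
                return class_name in entry["classes"]
        return False


pool = StringPool()
pool.enter("Server1", "class", "w")   # a word in class w
pool.enter("server1", "mailer")       # same text, different type
print(pool.in_class("SERVER1", "w"))
```

In this model, a class with thousands of words spread over a small number of buckets produces long bucket lists, which is the situation in which a larger bucket count would help.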