Date: Sat, 13 Sep 2003 02:12:25 -0600 (MDT)
Message-Id: <200309130812.h8D8CPM45248.smij@home.gtcs.com>
From: Bruce Gingery
To: SpamTools
Subject: [spamtools] Sendmail access maps -- beyond access.db
In-Reply-To: <1698323961.20030911182251@conti.nu>

Kai asked:
> Bruce Gingery wrote:
> > A lookup in Scheck_mail (with delayed checks) or Local_check_mail
> > (without delayed checks) could use a secondary access-style
> > lookup using a generic (virtual-users) style map processing.
> Can you elaborate on this ? I am a little confused about this concept.

Not sure which you were confused about.

The access database is a very coarse tool. It has built-ins for use
in the distribution, but many additional finer granularities are
possible. For example, by waiting until RCPT TO: time, you can check
the MAIL FROM: in the light of the RCPT TO: information. But that is
not directly supported by normal access.db content.

Since access.db is already used for simple checks:

    Connected IP or octet mask
    Connected domain or parent domain
    MAIL FROM: domain
    MAIL FROM: user@domain
    RCPT TO: domain
    RCPT TO: user@domain friend/hater
    Offer TLS
    Require TLS
    Use TLS outgoing
    Require TLS outgoing
    Certificate matching
    etc. etc.

to add complexity (e.g. does user Y@local accept mail from
Z@example.com) it makes better sense to add a different map than to
overload the tags in access.db even further.

The other possible confusion, I'll deal with first...

Normally, sendmail calls several top-level rulesets based upon what
stage of the transfer negotiation is current -- if the ruleset exists.
I'll ignore AUTH and STARTTLS testing, to simplify things:

STANDARD  check_relay   - before any greeting
NONE      check_vrfy    - after any VRFY, if they're enabled
NONE      check_expn    - after any EXPN, if they're enabled
STANDARD  check_mail    - after 220, HELO/EHLO, 250, and MAIL FROM:
STANDARD  check_rcpt    - after each RCPT TO
NONE      check_eoh     - after the CRLFCRLF marking end-of-headers
NONE      check_compat  - after end-of-DATA phase with most of the
                          info gathered from the above no longer
                          available.

The Local_check_* rules are invoked near the beginning of their
equivalent distribution ruleset. By default the Local_* ones are
empty, hence are effectively not even called. But, if present:

STANDARD  check_relay
YOUR        calls Local_check_relay
STANDARD  check_mail
YOUR        calls Local_check_mail
STANDARD  check_rcpt
YOUR        calls Local_check_rcpt

When you invoke FEATURE(`delay_checks') many of those are shuffled,
in order-logic, and no check_mail/check_relay rulesets are even
created in the sendmail.cf. Note reversed order ...

NONE      check_relay   - is not generated into sendmail.cf
NONE      check_mail    - is not generated into sendmail.cf
STANDARD  check_rcpt
STANDARD    calls checkrcpt
YOUR          calls Local_check_rcpt
STANDARD    and calls checkmail
YOUR          calls Local_check_mail
STANDARD    and calls checkrelay
YOUR          calls Local_check_relay

but that also means that, for checks you do NOT want delayed, you may
create your OWN check_relay and check_mail, since the standard
generated rulesets are named without the underscore.

That gives the possibility of (in order):

YOUR      check_relay   - at connect time
YOUR      check_mail    - after 220, HELO/EHLO, 250, and MAIL FROM:
STANDARD  check_rcpt    - after each RCPT TO:
STANDARD    calls checkrcpt
YOUR          calls Local_check_rcpt
STANDARD    and calls checkmail
YOUR          calls Local_check_mail
STANDARD    and calls checkrelay
YOUR          calls Local_check_relay

hence FIVE places to check the three main negotiation parameters with
your own specialty checks. In effect - you can have your cake (delayed
checks) and eat it, too (do some checks NOT delayed). The differences
among the Local_* rules are almost negligible, though, with delayed
checks.

Things done in check_relay are absolutely by remote IP/domain. Nothing
else is known. We haven't even greeted the caller with a 220 (hence
could greet with something else, like a 554). We only have the
connection details - his address, his domain if any, his port, our IP
(we could be listening on more than one), our port (we SHOULD be
listening on more than one, if we do both mail and submissions on the
same daemon), and the name and flags for the mailer the connection is
on.

Things done in check_mail can check the above, as well as

  * HELO/EHLO
  * HELO/EHLO parameter
  * MAIL FROM: parameters

as well as combinations of those. Some users should only be able to
send from a local connection, or even background non-TCP connection.
Root, various daemons, even postmaster should PROBABLY only be able to
send from limited local connections. Others can be invalidated right
here. If somebody's sending with a MAIL FROM: we don't want to add our
own hostname to that in processing, unless it really is the local
postmaster. Neither do we want to leave it as a bare address without
domain.

Things done in Local_check_rcpt, OR Local_check_mail OR
Local_check_relay with delayed checks can check the above, plus

  * RCPT TO: parameters

as well as combinations of ANY of those.

That means that unless you care WHAT recipient is being addressed,
things like non-valid HELO parameters can be checked in a ruleset
check_mail, so long as you delay checks. If you do not delay_checks,
then you must move that to the Local_check_mail ruleset to avoid
colliding with the standard check_mail ruleset.

So what kind of checks aren't usually done (but should be) even in
various options?

1. Check the RCPT TO: address for outgoing mail -- has the mail been
   invalidated with an MX record in DNS of a dot? This can be added
   to ParseLocal, so long as you're not using a pseudo-domain handled
   locally but with an MX 0 . to the public.

2. Check the MAIL FROM: domain for incoming mail. Has the mail been
   invalidated with an MX record in DNS of a dot?

3. Was the HELO (or EHLO) parameter valid? RFC2821 is very much more
   restrictive about what's permitted than some past poor practices.

   There are some domains (such as those hosted by mail.com) which
   NEVER should appear bare in a HELO/EHLO. There are others never
   used by the domain themselves, but which have been exploited by
   mailware, such as compuserve.com. Finally, there are dial-up
   direct-to-MX spammers, such as the "airs*.*" personnel services,
   which seem to always use their own domain in HELO/EHLO and/or
   MAIL FROM:, but morph all over the place.

   There are also bogon [IPv4] addresses and bogon domains (never
   registered, often forged) to reject right at the HELO/EHLO - no
   matter who they claim to be sending mail from and to.

   If the host HELOs or EHLOs as your name, it would be risking a
   mail loop to accept anything from that host. Same if it HELOs or
   EHLOs with your [IPaddress]. Somebody's confused, and manual
   intervention is required! Rejecting everything on such a
   connection SHOULD attract that manual attention, if there was any
   legitimacy to begin with.

   If it's a TLD that is not used in any root you subscribe to, then
   it's mail to reject.
   For most people, that includes the few gTLD and special-purpose
   TLDs, and the ccTLDs, and those only. For others, there are more.

   An unbracketed IPv4 address is NOT a legitimate "address literal",
   nor is an [IPv4] that doesn't match the actually connected
   client's address.

   There could be a name mismatch, according to RFC2821, or even a
   fully-qualified-domain-name that doesn't resolve, and it would
   still be mail that shouldn't be rejected for a bad HELO/EHLO.

4. Some domains only send from certain servers. If you KNOW you have
   no dot-forward mail coming your way, then you can restrict
   MAIL FROM: various domains to their actual servers. This
   especially includes the huge ISPs and FreeMail providers, so often
   abused by spammers because of presumed anonymity.

None of this needs to be as late as RCPT TO: time.

There's plenty left for RCPT TO: time -- nonexistent users, those who
want to ban all mail from an otherwise legitimate sender. Block lists
that aren't applied to whitelisted accounts like postmaster@ or
abuse@. Checking a submitting IP address or fully-qualified-domain
client name against a list exploder RCPT TO: ... many many things.

OR

The other possible confusion. Using a "virtual users" style map, or
even one more complex, in addition to simple access.db processing
allows for fall-thrus and blanket white/blacklisting based on sender,
apart from other usages.

Listing an IP address in the LHS of access.db, or a series of octets
indicating a /8, or /16, or /24, is a broad brush. Similarly with a
domain name. They can be qualified with Connect: or From: tags, but
still they have no bearing on the recipient. Site policy is
established. A recipient can be declared a "friend" or a "hater", but
that can only be checked with delayed checks -- and still isn't
by-sender/by-recipient, but only yes-no by-recipient when ANY flaws
are found with the sender.
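
Going back to item 3 for a moment: the HELO/EHLO tests listed there
can be sketched outside sendmail, just to make the order of the tests
concrete. This is a rough Python illustration, not sendmail rules.
LOCAL_NAMES, LOCAL_ADDRS, KNOWN_TLDS and check_helo are all
hypothetical names with made-up example values, and IPv6 literals are
deliberately passed through unchecked.

```python
import ipaddress

# Hypothetical site data, for illustration only.
LOCAL_NAMES = {"mail.example.com"}         # names this host answers to
LOCAL_ADDRS = {"192.0.2.25"}               # addresses this host listens on
KNOWN_TLDS = {"com", "net", "org", "uk"}   # heavily abbreviated TLD list


def _is_ipv4(text):
    """True if text parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def check_helo(helo, client_addr):
    """Return None to accept the HELO/EHLO parameter, else a reason string."""
    arg = helo.strip().lower().rstrip(".")
    if arg.startswith("[") and arg.endswith("]"):
        literal = arg[1:-1]
        if literal.startswith("ipv6:"):
            return None  # assumption: IPv6 literals are not examined here
        if not _is_ipv4(literal):
            return "malformed address literal"
        if literal in LOCAL_ADDRS:
            return "claims to be our own address"
        if literal != client_addr:
            return "address literal does not match connecting address"
        return None
    if _is_ipv4(arg):
        return "unbracketed IPv4 address is not an address literal"
    if arg in LOCAL_NAMES:
        return "claims to be our own name"
    if arg.rsplit(".", 1)[-1] not in KNOWN_TLDS:
        return "unknown top-level domain"
    # A name that merely fails to resolve, or mismatches rDNS, is accepted.
    return None
```

Note that the function never resolves the name: per the RFC2821
remark above, a mismatched or unresolvable name is not by itself
grounds to reject on HELO/EHLO.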

In its simplest extension, a separate map is created with

    : action

The fangs MAY be needed for odd sender local-part or odd recipient
local part. The action can be a token (like RELAY DELIVER OK BLOCK
REFUSE) or a full ERROR:d.s.n:"COD message" -- provided the rules are
invoked at or after the right time to check for matches in that map.

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* The WORST thing is to pass such decisions on to a local         *
* delivery agent like procmail, after having accepted             *
* delivery. Then you either confirm deliverability (even          *
* if mail is discarded), or bounce unnecessarily -- often         *
* to a forged sender.                                             *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

For highly complex checks, a MILTER is needed with sendmail. Sendmail
itself is limited to only $1 thru $9 for positional parsed parameters
in any given rule. It is POSSIBLE to do multi-line parsing with
temporary storage macros to get more than 9, but that gets VERY
bloated. Keeping it _somewhat_ neat ...
parse first THEN use those stored values in your lookups:

# NB: in sendmail.cf the LHS and RHS of each R line must be separated
# by tab characters; lines starting with whitespace are continuations.
Krelayctl hash /etc/mail/relayctl
Kstorage macro

SLocal_check_rcpt
R<$*>               $1
# next line wraps twice
R$*                 $: <$1> $(storage {tlocal} $@ FALSE $)
                    $(storage {flocal} $@ FALSE $) $(storage {hlocal} $@ FALSE $)
                    $(storage {fnull} $@ FALSE $) $(storage {hident} $@ <> $)
R<$* @ $=w >        $: <$1@$2> $(storage {tlocal} $@ TRUE $)
# next line wraps
R<$+ + $+ @ $+ >    $: $(storage {tuser} $@ $1 $) $(storage {ttag} $@ $2 $)
                    $(storage {tdom} $@ $3 $)
R<$+ @ $+ >         $: $(storage {tuser} $@ $1 $) $(storage {tdom} $@ $2 $)
R$*                 $: <$&{mail_addr}>
R<$*@$=w>           $: <$1@$2> $(storage {flocal} $@ TRUE $)
# next line wraps
R<$+ + $+ @ $+ >    $: $(storage {fuser} $@ $1 $) $(storage {ftag} $@ $2 $)
                    $(storage {fdom} $@ $3 $)
R<$+ @ $+ >         $: $(storage {fuser} $@ $1 $) $(storage {fdom} $@ $2 $)
R<>                 $: $(storage {fnull} $@ TRUE $)
R$*                 $: <$&{client_name}>
R<$=w>              $: <$1> $(storage {hlocal} $@ TRUE $)
R<[$-.$-.$-.$-]>    $: $(storage {hname} $@ <> $)
R<$+>               $: $(storage {hname} $@ $1 $)
R$*                 $: $&_
R$+@$*              $: $(storage {hident} $@ $1 $)
R$*                 $: <$&{rcpt_addr}>
#
# At this point
#   $&{tuser}   contains either RCPT TO: local-part, or
#               RCPT TO: local part sans tag.
#   $&{ttag}    contains RCPT TO: tag, if there was one
#   $&{tdom}    contains domain-part from RCPT TO:
#   $&{tlocal}  contains TRUE if we consider that domain local
#   $&{fnull}   contains TRUE if it was MAIL FROM:<> null-sender
#   $&{fuser}   contains MAIL FROM: local part, or
#               MAIL FROM: local part sans tag
#   $&{ftag}    contains MAIL FROM: tag if any
#   $&{fdom}    contains MAIL FROM: domain part
#   $&{flocal}  contains TRUE if that domain part is considered local
#   $&{hname}   contains the resolved name of the connected host
#               or <>, if rDNS failed.
#   $&{client_addr} contains the address of the connected host
#   $&r         contains SMTP or ESMTP, if this is a TCP connection
#   $&s         contains the HELO or EHLO parameter
#   $&f         contains the raw MAIL FROM: address
#   $&g         contains the processed MAIL FROM: address.
#   $&{client_resolve} contains OK TEMP or FAIL
#   $&{hident}  contains the identd return (or <> if none) for client
#
# in addition to having the passed parameter restored for matching,
# and any session authentication parameters (if active), or $&{dsn_*}
# macro content if ESMTP and Delivery-Service-Notification was requested.
#
# so we can combine any of this to look up in a relayctl left-side
# or even cross-match among it. That's not the most efficient, but
# it is the most flexible. The only thing not taken into account
# is actual content - which (without MILTER overrides to what
# information is "current") cannot be cross-matched to the recipients.
#                             - BUT -
# Even more parsing is possible. Do we want to try to de-VERP the
# sender address to see if it matches the recipient address? Or
# to extract some kind of list name?

Perhaps the $&{tuser}@$&{tdom} doesn't want any mail with an ident of
hidden-user nor squid, nor CacheFlowServer, regardless of +tag in his
E-Mail address, but postmaster will take it anyways?

More matching and shifting is possible in subsequent rules... Perhaps
extracting parent domains of the E-Mail addresses. Or lookups in an
fdnsbl of the sender domain, HELO/EHLO param or resolved client
domain. Maybe the tag is an expiring timestamp that needs to be
checked against the $b date (when-client-connected) or $t current
time() value, as well as being checked to see if someone "just made
it up" to try to zap past an expiring tagged address.

The VERPed sender for MIME-format digests of spamtools can be tested
(reasonably) with a lookup in a map that contains list subscriptions:

    : OK

with

    R$*             $: $1 $| $(listsubs $1:<$&{fuser}@$&{fdom}> $)
    R$* $| OK       $@ OK

because it would find an entry for : with an OK - discarding the VERP
info for the lookup.

Not all VERP patterns are that easy. They should be.

If such messages were forged, then merely add the IP address or
domain name of the sending client as a 3rd parameter in the map's LHS
and in the lookup.
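
To make the fall-thru idea and the optional 3rd parameter concrete,
here is a rough sketch of the lookup order in Python rather than in
rule syntax. Everything in it is an assumption made for illustration:
the "recipient:sender[:client]" key layout, the RELAYCTL contents and
the lookup() helper are hypothetical, and a real map would be a hash
file consulted from the rulesets.

```python
# Hypothetical map content. The key layout "recipient:sender[:client]"
# is an assumption for this sketch, not a sendmail convention.
RELAYCTL = {
    "y@local.example:z@example.com": "OK",
    "y@local.example:@example.net": "REFUSE",
    "@local.example:@spam.example": "REFUSE",
    "list-user@local.example:owner-list@lists.example:192.0.2.10": "OK",
}


def _forms(addr):
    """Yield the full address first, then its bare @domain form."""
    yield addr
    yield "@" + addr.rsplit("@", 1)[-1]


def lookup(table, rcpt, sender, client=None):
    """Return the action of the most specific matching entry, else None.

    Order tried: full recipient before recipient domain, and for each of
    those, full sender before sender domain. When a client is given, the
    three-part key is tried before the two-part key at each step.
    """
    for r in _forms(rcpt):
        for s in _forms(sender):
            if client is not None:
                hit = table.get(f"{r}:{s}:{client}")
                if hit is not None:
                    return hit
            hit = table.get(f"{r}:{s}")
            if hit is not None:
                return hit
    return None  # no entry: fall through to the ordinary access.db checks


print(lookup(RELAYCTL, "y@local.example", "z@example.com"))
print(lookup(RELAYCTL, "other@local.example", "x@spam.example"))
```

A None result is the fall-thru case: nothing in this map applies, so
the decision is left to whatever checks come next. The sketch does no
case folding or tag stripping; it assumes the caller passes addresses
already normalised the way the stored macros above would be.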