Spamproofing the Archives

From: Sean 'Captain Napalm' Conner <spc_at_conman.org>
Date: Sat Dec 7 19:17:01 2002

Jeffrey Sharp wrote:
>
> Right. No database.
>
> The encryption need not be complex. A simple rotation or XOR scheme would be
> sufficient. The idea is that it is not worth it to the spammer to figure out
> the scheme used at this one little site, and it's something that
> general-purpose harvesting heuristics aren't likely to notice.

  Any reason for no database? It doesn't have to be a SQL-based one---if
you are using Unix you probably have either the Berkeley DB or the GNU DB
library, either of which is good for simple key/value pairs [1] and doesn't
require setting up a connection to a real database server. Make the key the
MD5 hash of the address and really, that's all you need. When a message
comes in to be archived, you just MD5 the sender's address, look it up in
the db file, and if it's not there, add the record.

  Trying to encrypt the email address is doable, but if you do a simple
method like rotation or XOR it *will* be cracked. All a potential spammer
has to do is subscribe to the list and post a message. Once it's archived
he can then pick up his encrypted email address and reverse engineer the
scheme. *Will* a spammer do such a thing? Probably not, but there is that
possibility and it's fairly easy to crack simple encryption schemes when you
know the plain text.
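
  To make the known-plaintext point concrete, here is a minimal sketch in C.
It assumes the archive uses a repeating-key XOR; the function names
(xor_crypt, xor_recover_key) are made up for this example and nobody in the
thread has proposed this exact scheme.

```c
#include <stddef.h>
#include <string.h>

/* XOR each byte of in[] with a repeating key.  The same call both
   encrypts and decrypts.  (Hypothetical scheme, for illustration.) */
void xor_crypt(const unsigned char *in, unsigned char *out, size_t len,
               const unsigned char *key, size_t keylen)
{
  for (size_t i = 0; i < len; i++)
    out[i] = in[i] ^ key[i % keylen];
}

/* Known-plaintext attack: plain XOR cipher gives back the key stream.
   The smallest period of that stream is the key length.  key[] must
   have room for len bytes; the return value is the recovered length. */
size_t xor_recover_key(const unsigned char *plain,
                       const unsigned char *cipher,
                       size_t len, unsigned char *key)
{
  for (size_t i = 0; i < len; i++)
    key[i] = plain[i] ^ cipher[i];

  for (size_t p = 1; p < len; p++)
  {
    size_t i;
    for (i = p; i < len; i++)
      if (key[i] != key[i - p])
        break;
    if (i == len)
      return p;   /* stream repeats every p bytes */
  }
  return len;     /* no repetition seen; key is at least len bytes */
}
```

  The attacker posts from an address he knows, reads the "encrypted" form
out of the archive, calls xor_recover_key() on the pair, and can then run
xor_crypt() with the recovered key over every other address in the archive.
This only works if his own address is longer than the key.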

  Using DES (or 3DES or any other block-encryption scheme) is much better.
It relies on the key being kept safe (a fair assumption), and the key is
*still* recoverable in principle since the plain text is (or could be)
known. But cracking DES this way may be more expensive than a spammer is
willing to spend, so this may be a viable method as well.

Eric Smith has been known to write:
>
> In practice, a lot of people have had trouble with this.
>
> Make damn sure that whatever CGI script you use defends against any
> characters in the email form being interpreted as any sort of
> metacharacters. Of course, you have to do this for *any* web forms,
> but for some reason email forms seem to be especially susceptible,
> possibly because a lot of CGI scripts don't do enough validation then
> just dump all the data into a "mail" command. This has vulnerabilities
> for both command argument processing by the shell, and by strange and
> wondrous things that happen inside Sendmail.

  I've done similar things and haven't had a problem, which I attribute to
using C for all my CGI needs (odd, but true). But I've never *embedded* the
email address to send to as a hidden field in the webpage---it's either
stored in the program, or in a file the webserver can't see but the CGI
program can.

  Validating an email address is a bit more difficult but the heuristics
I've used have been good enough---check to see that the TLD (the last part
of the address) is valid (either .com, .org, .edu, etc., or a valid country
code), that there are at least two (three for country code based domains)
segments to the domain portion of the address, and that there's at least one
@ sign and no other oddities in the user portion (like a UUCP style address
in an attempt to relay). That alone will probably save you (or rather, your
mail server) quite a bit of work.
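
  A rough sketch of those heuristics in C follows. The function name
plausible_address, the short list of generic TLDs, and the rule "any
two-letter alphabetic TLD counts as a country code" are all assumptions made
for the example. It also insists on exactly one @ sign, which is stricter
than the description above, and it expects the domain in lower case.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative, incomplete list of generic TLDs (an assumption). */
static const char *const generic_tlds[] =
  { "com", "org", "net", "edu", "gov", "mil", "int", NULL };

static int is_generic_tld(const char *tld)
{
  for (size_t i = 0; generic_tlds[i] != NULL; i++)
    if (strcmp(tld, generic_tlds[i]) == 0)
      return 1;
  return 0;
}

/* Returns 1 if addr passes the heuristics, 0 otherwise. */
int plausible_address(const char *addr)
{
  const char *at = strchr(addr, '@');

  if (at == NULL || at == addr)       /* no @, or empty user part */
    return 0;
  if (strchr(at + 1, '@') != NULL)    /* more than one @ */
    return 0;

  /* User part: refuse UUCP bang paths and the %-relay hack. */
  for (const char *p = addr; p < at; p++)
    if (*p == '!' || *p == '%' || isspace((unsigned char)*p))
      return 0;

  /* Domain part: count dot-separated segments, remember the last. */
  const char *seg = at + 1;
  const char *tld = seg;
  size_t segments = 0;

  for (const char *p = at + 1; ; p++)
  {
    if (*p == '.' || *p == '\0')
    {
      if (p == seg)                   /* empty segment, e.g. "a..com" */
        return 0;
      segments++;
      tld = seg;
      if (*p == '\0')
        break;
      seg = p + 1;
    }
    else if (!isalnum((unsigned char)*p) && *p != '-')
      return 0;
  }

  if (is_generic_tld(tld))
    return segments >= 2;

  /* Assume a two-letter alphabetic TLD is a country code. */
  if (strlen(tld) == 2 && isalpha((unsigned char)tld[0])
                       && isalpha((unsigned char)tld[1]))
    return segments >= 3;

  return 0;
}
```

  Note that the three-segment rule for country codes rejects legitimate
two-segment domains under a country code, so treat the result as a cheap
filter rather than a verdict.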
 
Eric Smith still:
>
> If I understand the original proposal, it was to replace email addresses
> on the web pages with links of the form
> http://site/cgi_form?id=encrypted_email_address
>
> The user would click the link, and the CGI script on the server would
> first verify that it thinks the user is a person, not a robot, then
> decrypt the email address and return the cleartext.

  I thought the idea was to have a link on the webpage:

        To: Classic Computers Mailing List
        From: <a href="/cgi/reply.cgi?id=user.id&subject=Stuff+about+the+PCjr">Sean Conner</a>
        Subject: Stuff about the PCjr

  The user would hit the link. The program would then display a form that
the user would fill out:

        <p>This will send a message to the user indicating that you wish to
        talk to them. Fill in your email address and an email will be sent
        to the person indicating you want to talk to them. They will then
        respond to you.</p>

        <p>This is to keep spammers from collecting their email address from
        this website. Sorry for the inconvenience.</p>

        <form method="post" action="/cgi/reply.cgi">
                <input type="hidden" name="id" value="user.id">
                <input type="hidden" name="subject" value="Stuff about the PCjr">
                Your Email address: <input type="text" name="from">
                <input type="submit" value="Send Notification">
        </form>

  Then the program will collect the three fields and send an email to the
person in question (my email address either decrypted from the user.id or
looked up if stored somewhere), setting the From line from the form (so I,
in this example, can just hit reply) and the Subject line from the archived
message, with a simple message like:

        To: Sean Conner <myemailaddress_at_example.org>
        From: Fred Smith <fred_at_example.net>
        Subject: [CC Archive] Stuff about the PCjr

        Some user with the address fred_at_example.net wants to talk to you
        about "Stuff about the PCjr". Reply to this message to start
        talking to this person.

Steve Jones then pontificated:
>
> I'd also submit any message via a TCP connection rather than
> invoking anything from the script, e.g. `sendmail -bs`. I can
> give you a simple example using Perl if you need it.

  I actually do the following:

  sprintf(cmd,SENDMAIL " -f%s %s",email,mailto);
  fp = popen(cmd,"w");
  if (fp == NULL)
  {
    sorry1(3);
  }
  
  fprintf(
           fp,
           "From: %s\n"
           "To: %s\n"
           "Subject: %s\n"
           "\n",
           email,
           mailto,
           subject
         );

  fputs(message,fp);
  pclose(fp);

  And my email sending CGI is trusted by sendmail to use the "-f" flag
(sorry for the C---I do all my CGI work in C).
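
  Since popen() hands the command to the shell, Eric's warning about
metacharacters applies to anything that ends up in cmd. Below is a sketch of
the kind of check that could run before the sprintf() above. The function
names and the character whitelist are assumptions for illustration; the
whitelist is deliberately much narrower than what mail standards allow in an
address.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if addr contains only characters the shell does not treat
   specially.  A leading '-' is refused so the address cannot be taken
   for a command-line option.  (Whitelist chosen for this example.) */
int safe_for_command_line(const char *addr)
{
  if (addr[0] == '\0' || addr[0] == '-')
    return 0;

  for (const char *p = addr; *p != '\0'; p++)
    if (!isalnum((unsigned char)*p) && strchr("@._+-", *p) == NULL)
      return 0;

  return 1;
}

/* Returns 1 if value has no CR or LF, so it cannot start a new header
   line (for example a smuggled "Bcc:") when written after "Subject: ". */
int safe_for_header(const char *value)
{
  return strpbrk(value, "\r\n") == NULL;
}
```

  The idea would be to bail out (as with sorry1() above) unless email and
mailto both pass safe_for_command_line() and email, mailto and subject all
pass safe_for_header().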

  -spc (Done more than my fair share of CGI programming ... )

[1] I use it to store email addresses for people who want
        email notification when I update my blog for instance.