
filster: Linking reputation networks to email whitelists.

I've written a procmail filter that checks incoming mail against several identity networks; when the sender's email address is listed, it adds a new header such as X-Reputation: friend. Currently, plugins are provided for Orkut, FOAFweb, the Reputations Research Network, and CPAN.

# When someone's listed, add an X-Reputation header (friend or peer).
# The "w" flag makes procmail keep the original message if the filter fails.
:0 fw
| /usr/local/sbin/filster.pl

An addition to SpamAssassin's local.cf allows mail from these senders to pass through more easily, while allowing super-spam (scoring 20+) to remain blocked.

header REPUTATION_FRIEND X-Reputation =~ /friend/
score REPUTATION_FRIEND -7.0

header REPUTATION_PEER X-Reputation =~ /peer/
score REPUTATION_PEER -5.0
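
To make the arithmetic concrete, here's a rough sketch of how those two rules shift a message's total. It is not SpamAssassin code; the sub names are made up for illustration, and the 5.0 threshold assumes SpamAssassin's default required_score.

```perl
use strict;
use warnings;

# Scores copied from the local.cf rules above.
my %RULE_SCORE = (
	REPUTATION_FRIEND => -7.0,
	REPUTATION_PEER   => -5.0,
);

# Which rules fire for a given X-Reputation header value.
sub rules_hit {
	my ($header) = @_;
	my @hits;
	push @hits, 'REPUTATION_FRIEND' if $header =~ /friend/;
	push @hits, 'REPUTATION_PEER'   if $header =~ /peer/;
	return @hits;
}

# Total score after applying the reputation rules to a base score.
sub adjusted_score {
	my ($base, $header) = @_;
	my $score = $base;
	$score += $RULE_SCORE{$_} for rules_hit($header);
	return $score;
}

# Assumed threshold: SpamAssassin's default required_score of 5.0.
sub is_spam {
	my ($base, $header) = @_;
	return adjusted_score($base, $header) >= 5.0 ? 1 : 0;
}

printf "%.1f\n", adjusted_score(9.5, 'friend (orkut)');    # 2.5: delivered
printf "%.1f\n", adjusted_score(21.0, 'friend (orkut)');   # 14.0: still spam
```

A borderline message from a friend drops well under the threshold, while anything scoring 20 or more stays far above it even with the full -7.0 applied.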

Combined with SPF, it becomes quite feasible to turn your social networking profiles into a whitelist of the email addresses you trust not to send you spam.
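
As a sketch of what "combined with SPF" could look like in code: only honor the relationships when the message also carries a passing Received-SPF header. The sub names are hypothetical, and it assumes the local mail server prepends its own Received-SPF result above anything the sender included.

```perl
use strict;
use warnings;

# Return the unfolded header lines of a raw message (everything before the
# first blank line), with continuation lines joined onto their header.
sub header_lines {
	my ($raw) = @_;
	my ($head) = split /\r?\n\r?\n/, $raw, 2;
	$head = '' unless defined $head;
	$head =~ s/\r?\n[ \t]+/ /g;
	return split /\r?\n/, $head;
}

# True when the first (topmost) Received-SPF header reports "pass".
# Assumption: the topmost one was added by the local mail server.
sub spf_passed {
	my ($raw) = @_;
	for my $line (header_lines($raw)) {
		next unless $line =~ /^Received-SPF:\s*(\S+)/i;
		return lc($1) eq 'pass' ? 1 : 0;
	}
	return 0;
}

# Only keep the relationships when SPF passed as well.
sub trusted_reputation {
	my ($raw, @relationships) = @_;
	return spf_passed($raw) ? @relationships : ();
}
```

Looking only at the topmost header matters here: a forger can paste "Received-SPF: pass" into the message, but it will sit below the result added on delivery.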

The code is included in the extended entry of this post; please be aware that it is to be considered pre-alpha 0.0.1 pencil draft code. That said, it works quite nicely once you've got the prerequisite perl modules installed.

Update: Added code to link into the Reputations Research Network. Neat :)

Update: Linked into the CPAN author database and an instance of the FOAFweb as well.

First, filster.pl. This is the part that adds the header, if the user's found. It can easily be adjusted to use other tools than Orkut, as well. Please note the hard-coded pathnames.

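#!/usr/local/bin/perl

use strict;
use warnings;

use Mail::Address;
use Email::Simple;

# Run a plugin and report whether it exited with status 1 ("listed").
# The list form of open bypasses the shell, since the address is
# sender-controlled; the plugin's output is read and discarded.
sub listed {
	my ($plugin, $email) = @_;
	open(my $fh, '-|', $plugin, $email) or return 0;
	1 while <$fh>;
	close $fh;
	return ($? >> 8) == 1;
}

# Slurp the whole message from standard input.
my $content = do { local $/; <STDIN> };
my $mail = Email::Simple->new($content);

# Take the last address in the From header, if there is one.
my $address = (Mail::Address->parse($mail->header('From') || ''))[-1];
my $email = $address ? ($address->address || '') : '';

my @relationships;

if (length $email) {
	push @relationships, "friend (orkut)"
		if listed('/etc/mail/filster/orkut.pl', $email);
	push @relationships, "peer (foafweb)"
		if listed('/etc/mail/filster/foaf.pl', $email);
	push @relationships, "peer (reputations research network)"
		if listed('/etc/mail/filster/rrn.pl', $email);
	push @relationships, "peer (CPAN)"
		if listed('/etc/mail/filster/cpan.pl', $email);
}

# Replace any X-Reputation header the sender supplied with our own findings,
# matching the SpamAssassin rules; with no values, the header is removed.
$mail->header_set('X-Reputation', @relationships);

print $mail->as_string;

exit 0;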
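
If you want to plug in another source, the contract is small: filster.pl passes the address as the first argument and treats exit status 1 as "listed". Here's a minimal sketch of a plugin that checks a local text file. The file path, its one-address-per-line format, and the sub names are all assumptions for illustration, not part of filster.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical list file: one address per line, "#" starts a comment.
use constant LIST_FILE => '/etc/mail/filster/data/local';

# Read the addresses into a lookup hash, ignoring blanks and comments.
sub load_addresses {
	my ($fh) = @_;
	my %listed;
	while (my $line = <$fh>) {
		$line =~ s/^\s+|\s+$//g;
		next if $line eq '' || $line =~ /^#/;
		$listed{lc $line} = 1;
	}
	return \%listed;
}

# filster.pl's convention: exit status 1 means "listed", 0 means "not listed".
sub exit_status_for {
	my ($email, $listed) = @_;
	return 0 unless defined $email && length $email;
	return $listed->{lc $email} ? 1 : 0;
}

# Look up the address given on the command line, if any.
sub main {
	my ($email) = @_;
	open(my $fh, '<', LIST_FILE) or return 0;   # no list: nobody is listed
	return exit_status_for($email, load_addresses($fh));
}

exit main($ARGV[0]) if @ARGV;
```

Saved as, say, /etc/mail/filster/local.pl, it would be wired into filster.pl the same way as the other four scripts.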

Second, orkut.pl. This is a vaguely generic script, developed around the WWW::Mechanize framework. Quite handy stuff. I will happily accept any code submissions for other networking sites and post them here. Since there's no API at Orkut, the friends page is scraped and cached; to re-download the data, call the script with an argument containing "fetch" (an argument containing "parse" re-parses the cached page instead). As usual, sketchy code ahead: it works and is stable, but may not be up to the usual standards.

#!/usr/local/bin/perl

use WWW::Mechanize;

use constant USERNAME => 'username goes here';
use constant PASSWORD => 'password goes here';

use constant FETCH => 0;
use constant DATA_ORKUT => '/etc/mail/filster/data/orkut';
use constant USERS_ORKUT => '/etc/mail/filster/users/orkut';
use constant PARSE => 0;

my $args = join ' ', @ARGV;
my $agent = WWW::Mechanize->new( );

my $content; if (! -r &DATA_ORKUT || &FETCH || $args =~ /fetch/) {
    $agent->get('http://www.orkut.com/Friends.aspx');
    my $timeout; while ($agent->content =~ /PwdForgot.aspx/) {
	if (++$timeout > 1) { $timeout = -1; last }
	$agent->submit_form(
	    form_name => 'f',
	    fields => { u => &USERNAME, p => &PASSWORD },
	    button => 'Submit',
	);
    }; die "Login failed" if $agent->content =~ /PwdForgot.aspx/;

    $agent->follow_link( url_regex => qr{Friends\.aspx$} ) or die "Friends not found";

    $content = $agent->content;

    use Storable qw(lock_nstore); lock_nstore [ $content ], &DATA_ORKUT;
} else {
    use Storable qw(lock_retrieve); $content = @{ lock_retrieve &DATA_ORKUT }[0];
}

my %users; if (! -r &USERS_ORKUT || &PARSE || &FETCH || $args =~ /parse/) {
    my @content;
    $content =~ s{}{|}g;
    @content = split /\n+/, $content;
    @content = grep { /Profile\.aspx\?uid=\d+"/ } @content;
    # Fill the %users hash declared above, so the lookup at the end can see it.
    for my $user (@content) {
	chomp $user;
	my @user = split /\|/, $user;
	my $uid = ($user =~ /Profile\.aspx\?uid=(\d+)/)[0];
	my $email = (grep /\@[^\s]+\.[a-z]{2,4}$/, @user)[-1];
	if (length $uid && length $email) {
	    $users{$email} = $uid;
	} else {
	    warn "Data failure ($uid, $email)";
	}
    }
    use Storable qw(lock_nstore); lock_nstore \%users, &USERS_ORKUT;
} else {
    use Storable qw(lock_retrieve); %users = %{ lock_retrieve &USERS_ORKUT };
}

exit 1 if exists $users{$ARGV[0]};
exit 0;
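
The caching pattern in that script (use the stored copy unless it's missing or a refresh is requested) can be pulled out into one small helper. This is a sketch rather than a drop-in replacement; the `cached` sub and its arguments are hypothetical, though lock_nstore and lock_retrieve are the same Storable calls used above.

```perl
use strict;
use warnings;
use Storable qw(lock_nstore lock_retrieve);

# Return the hash ref cached at $path, unless $refresh is set or the cache
# is missing or unreadable; in that case call $fetch (a code ref returning
# a hash ref), store its result at $path, and return that.
sub cached {
	my ($path, $refresh, $fetch) = @_;
	if (!$refresh && -r $path) {
		my $data = eval { lock_retrieve($path) };
		return $data if ref $data eq 'HASH';
	}
	my $data = $fetch->();
	lock_nstore($data, $path);
	return $data;
}
```

In orkut.pl terms, $path would be the users file, $refresh would be true when an argument contains "fetch", and $fetch would wrap the login, download, and parse steps.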

And so on. Here's rrn.pl, for the reputation research folk.

#!/usr/bin/perl

use strict;
use warnings;

use LWP::Simple qw(get);
use Storable qw(lock_nstore lock_retrieve);

use constant FETCH => 0;
use constant DATA_RRN => '/etc/mail/filster/data/rrn';

exit(rrn($ARGV[0]) ? 1 : 0);

# Return 1 if the address appears in the cached member directory, else 0.
# The directory is downloaded again when the cache is missing or unreadable,
# or when an argument containing "fetch" is given.
sub rrn {
	my ($email) = @_;
	my %users = eval { %{ lock_retrieve(DATA_RRN) } };
	if (FETCH || "@ARGV" =~ /fetch/ || length $@) {
		my $content = get('http://databases.si.umich.edu/reputations/dir/directoryM.cfm');
		if (defined $content) {
			%users = map { ($_ => 1) } ($content =~ /"mailto:([^"]{5,200})"/g);
			lock_nstore(\%users, DATA_RRN);
		}
	}
	return 0 unless defined $email;
	return exists $users{$email} ? 1 : 0;
}
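
The interesting part of rrn.pl is the one-line scrape: every "mailto:" link on the directory page becomes a key in a hash. Here's that step on its own, as a sketch you can try against any HTML. The sub names are made up, and lower-casing the addresses is my assumption; the script above compares them as written.

```perl
use strict;
use warnings;

# Collect every address that appears in a quoted "mailto:" link, keyed by
# lower-cased address so lookups are case-insensitive.
sub mailto_addresses {
	my ($html) = @_;
	my %users = map { (lc($_) => 1) } ($html =~ /"mailto:([^"]{5,200})"/g);
	return \%users;
}

# Return 1 if the address is among the scraped ones, else 0.
sub is_listed {
	my ($email, $users) = @_;
	return exists $users->{lc $email} ? 1 : 0;
}
```

The {5,200} length bound skips empty or absurdly long links, which is a cheap sanity check when the input is somebody else's HTML.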

Next, cpan.pl. This script checks the sender against the CPAN author database; filster.pl tags them as a peer if they're listed.

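#!/usr/local/bin/perl

use strict;
use warnings;

use CPANPLUS::Backend;

# Keep any CPANPLUS output out of the mail stream.
close(STDOUT); close(STDERR);

exit 0 unless defined $ARGV[0] && length $ARGV[0];

my $cp = CPANPLUS::Backend->new;

# Compare literally, so characters such as "." and "+" in the address
# are not treated as regex metacharacters.
exit 1 if grep { defined $_ && $_ eq $ARGV[0] }
          map { eval { $_->email } } values %{ $cp->author_tree };
exit 0;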
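
One thing to watch when matching addresses: interpolating an address into a pattern lets its "." match any character. A small sketch of the difference, using made-up addresses and hypothetical sub names; the case-insensitive comparison is my assumption about what you'd want.

```perl
use strict;
use warnings;

# Compare literally and case-insensitively; undefined entries are skipped.
sub address_listed {
	my ($needle, @candidates) = @_;
	return 0 unless defined $needle && length $needle;
	return (grep { defined $_ && lc($_) eq lc($needle) } @candidates) ? 1 : 0;
}

# The same check with the address interpolated into a pattern, for contrast.
sub address_listed_regex {
	my ($needle, @candidates) = @_;
	return (grep { defined $_ && /^$needle$/ } @candidates) ? 1 : 0;
}

my @authors = ('jane.doe@example.org', undef, 'janexdoe@example.org');

printf "%d\n", address_listed('jane.doe@example.org', $authors[2]);         # prints 0
printf "%d\n", address_listed_regex('jane.doe@example.org', $authors[2]);   # prints 1: "." matched "x"
```

If a pattern is really needed, wrapping the address in \Q...\E has the same effect as the literal comparison.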

Finally, foaf.pl, the script for querying an instance of the FOAFweb. It's commented out by default to encourage some thought before it is enabled.

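#!/usr/bin/perl

use strict;
use warnings;

use LWP::Simple qw(get);
use Storable qw(lock_nstore lock_retrieve);
use URI::Escape qw(uri_escape);

use constant DATA_FOAF => '/etc/mail/filster/data/foaf';

exit(foaf($ARGV[0]) ? 1 : 0);

# Return 1 if the FOAFweb instance knows the address, else 0. Answers are
# cached, so each address is only looked up remotely once.
sub foaf {
	my ($email) = @_;
	return 0 unless defined $email && length $email;
	my %users = eval { %{ lock_retrieve(DATA_FOAF) } };   # empty if no cache yet
	return $users{$email} if exists $users{$email};
	my $content = get('http://eikeon.com/foaf/?mbox=mailto%3A' . uri_escape($email));
	return 0 unless defined $content;   # lookup failed: don't cache a guess
	$users{$email} = ($content =~ /<h2>[^<]+<a /i) ? 1 : 0;
	lock_nstore(\%users, DATA_FOAF);
	return $users{$email};
}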

Comments

Of course, a spammer could then use your Orkut list to forge spam from your friends... Neat idea, though.

Add the following procmail header check to require that the mail has passed SPF, and forged senders become much less of a worry; a pass means the sending server is authorized to send for the sender's domain.

* ^Received-SPF: pass.*

You could also tie in a check of the S/MIME signature, if one is attached, to see whether the key matches the sender's email address.

Nifty! I might see about getting some of those ideas into SpamAssassin -- I've been thinking along those lines myself, too ;)

Have you checked the terms of service of those sites? IIRC, Orkut in particular has nasty anti-scraping terms.

I use SpamPal, an open source program. It uses DBMS servers to identify bad proxies and the like. It works fine for me!
