Mail::SpamAssassin::Plugin::FANN

A SpamAssassin plugin that uses FANN (Fast Artificial Neural Network) to classify email as spam or ham.

The plugin builds TF-IDF feature vectors from message text (body, subject, from name, attachment filenames) and optionally includes binary SpamAssassin rule hit features. A trained FANN model produces a prediction score between 0 (ham) and 1 (spam), which is matched against configurable threshold ranges to fire rules.
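
As a rough illustration of the feature layout described above, the following Python sketch builds a TF-IDF vector over a fixed vocabulary and appends binary rule-hit features. It is not the plugin's code: the function names, the smoothed IDF formula, the `subject:` and `from:` token prefixes, and the sample counts are all assumptions made for the example.

```python
import math
from collections import Counter

def tfidf_vector(tokens, vocabulary, doc_freq, num_docs):
    """Return one TF-IDF weight per vocabulary term for a single message."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty messages
    vector = []
    for term in vocabulary:
        tf = counts[term] / total
        # Smoothed IDF (an assumption, not necessarily the plugin's formula)
        idf = math.log((1 + num_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        vector.append(tf * idf)
    return vector

def rule_features(hits, rule_names):
    """Return 1.0 for each rule that fired on the message, else 0.0."""
    return [1.0 if name in hits else 0.0 for name in rule_names]

# Hypothetical vocabulary and document frequencies from a 200-message corpus
vocabulary = ["body:viagra", "subject:invoice", "from:alice"]
doc_freq = {"body:viagra": 40, "subject:invoice": 10, "from:alice": 2}
tokens = ["body:viagra", "body:viagra", "subject:invoice", "body:hello"]

features = tfidf_vector(tokens, vocabulary, doc_freq, num_docs=200)
features += rule_features({"HTML_MESSAGE"}, ["HTML_MESSAGE", "MISSING_DATE"])
```

The resulting list has one entry per vocabulary term followed by one entry per rule, which is the kind of fixed-length input a neural network expects.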

Installation

perl Makefile.PL
make
make test
make install

Prerequisites

Perl 5.14+
Mail::SpamAssassin 4.0+
AI::FANN

Configuration

Add to your SpamAssassin config (e.g. local.cf):

loadplugin Mail::SpamAssassin::Plugin::FANN

fann_data_dir       /var/lib/spamassassin/fann

body    FANN_SPAM   eval:check_fann(0.50, 1.00)
score   FANN_SPAM   2.0

body    FANN_HAM    eval:check_fann(0.00, 0.50)
score   FANN_HAM    -2.0

The check_fann(low, high) eval rule fires when the prediction falls within [low, high]. You can define as many rules as you like with different threshold ranges:

body    FANN_SPAM_HI    eval:check_fann(0.85, 1.00)
score   FANN_SPAM_HI    3.0

body    FANN_SPAM_LO    eval:check_fann(0.50, 0.85)
score   FANN_SPAM_LO    1.5

body    FANN_HAM_HI     eval:check_fann(0.00, 0.15)
score   FANN_HAM_HI     -3.0

body    FANN_HAM_LO     eval:check_fann(0.15, 0.50)
score   FANN_HAM_LO     -1.5
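
The range matching can be modelled in a few lines of Python. This sketch assumes the interval is closed at both ends, as the [low, high] notation suggests; the plugin's actual boundary handling is not verified here, and `fired_rules` is a hypothetical helper.

```python
def check_fann(prediction, low, high):
    """Return True when prediction lies in the closed interval [low, high]."""
    return low <= prediction <= high

# Threshold ranges copied from the configuration example
RULES = {
    "FANN_SPAM_HI": (0.85, 1.00),
    "FANN_SPAM_LO": (0.50, 0.85),
    "FANN_HAM_HI": (0.00, 0.15),
    "FANN_HAM_LO": (0.15, 0.50),
}

def fired_rules(prediction):
    """List every rule whose range contains the prediction."""
    return [name for name, (low, high) in RULES.items()
            if check_fann(prediction, low, high)]

print(fired_rules(0.92))  # ['FANN_SPAM_HI']
print(fired_rules(0.50))  # ['FANN_SPAM_LO', 'FANN_HAM_LO']
```

Under the closed-interval assumption, a prediction that lands exactly on a shared boundary such as 0.50 matches both adjacent rules, so their scores would be summed.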

Configuration Settings

| Setting | Default | Description |
| --- | --- | --- |
| `fann_data_dir` | (none) | Directory for model and vocabulary files |
| `fann_min_word_len` | 4 | Minimum token length |
| `fann_max_word_len` | 24 | Maximum token length |
| `fann_vocab_cap` | 10000 | Max vocabulary terms (ranked by chi-squared) |
| `fann_min_spam_count` | 100 | Min spam messages required to enable prediction |
| `fann_min_ham_count` | 100 | Min ham messages required to enable prediction |
| `fann_learning_rate` | 0.1 | FANN learning rate |
| `fann_momentum` | 0.1 | FANN training momentum |
| `fann_train_epochs` | 50 | Number of training epochs |
| `fann_train_algorithm` | FANN_TRAIN_RPROP | Training algorithm |
| `fann_stopwords` | (none) | Space-separated stopwords to ignore (can be specified multiple times) |
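
To make the token-length and stopword settings concrete, here is a small Python sketch of how such a filter might behave with the default limits of 4 and 24. The lowercasing and the split on alphanumeric runs are assumptions; the plugin's real tokenizer may differ.

```python
import re

def filter_tokens(text, min_len=4, max_len=24, stopwords=frozenset()):
    """Keep words within the length limits that are not stopwords."""
    # Assumption: tokens are lowercase runs of letters and digits
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words
            if min_len <= len(w) <= max_len and w not in stopwords]

tokens = filter_tokens("Your FREE prize is waiting, click here today",
                       stopwords={"your", "here"})
print(tokens)  # ['free', 'prize', 'waiting', 'click', 'today']
```

In this example "is" is dropped for being shorter than the minimum length, while "your" and "here" are dropped as stopwords.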

Training

Train a model using sa-fann-train:

sa-fann-train --spam /path/to/spam --ham /path/to/ham

Each directory should contain one message per file. The tool:

Tokenizes each message body, subject, from name, and attachment filenames
Builds a vocabulary ranked by chi-squared informativeness
Optionally runs each message through SpamAssassin to collect rule hits as binary features
Trains a FANN neural network and saves the model to fann_data_dir
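
The vocabulary-ranking step can be sketched with the standard chi-squared statistic for a 2x2 table (term present or absent versus spam or ham). This is an illustration in Python, not the code used by sa-fann-train; the function names and the per-term document counts are hypothetical.

```python
def chi_squared(spam_with, ham_with, num_spam, num_ham):
    """Chi-squared statistic for one term over a 2x2 contingency table."""
    a, b = spam_with, ham_with            # messages containing the term
    c, d = num_spam - a, num_ham - b      # messages lacking the term
    n = num_spam + num_ham
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def build_vocabulary(doc_counts, num_spam, num_ham, vocab_cap=10000):
    """Return the vocab_cap most informative terms, best first."""
    scored = sorted(
        doc_counts.items(),
        key=lambda item: chi_squared(item[1][0], item[1][1], num_spam, num_ham),
        reverse=True,
    )
    return [term for term, _ in scored[:vocab_cap]]

# Hypothetical counts: term -> (spam messages with term, ham messages with term)
doc_counts = {
    "body:lottery": (80, 2),
    "body:meeting": (5, 70),
    "body:hello": (50, 50),
}
vocab = build_vocabulary(doc_counts, num_spam=100, num_ham=100, vocab_cap=2)
print(vocab)  # ['body:lottery', 'body:meeting']
```

A term that appears equally often in spam and ham, like "body:hello" here, scores zero and is the first to be cut when the cap applies.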

Training Options

--spam path         Directory containing spam messages (required)
--ham path          Directory containing ham messages (required)
--epochs N          Override fann_train_epochs
--vocab-cap N       Override fann_vocab_cap
--no-rules          Text-only model (skip rule feature collection)
--hidden N          Number of hidden neurons (default: square root of the input size)
--progress          Show progress counter
-C path             SpamAssassin config directory
--siteconfigpath    Site config directory
-p file             User preferences file
-L                  Local tests only (no network tests)
-D [areas]          Debug output

Utilities

sa-fann-dump

Inspect the trained vocabulary:

sa-fann-dump --path /var/lib/spamassassin/fann/vocabulary.data
sa-fann-dump --prefix body:    # show only body tokens
sa-fann-dump --rules           # show rule features

fann-split-corpus

Split a corpus into training and test sets:

fann-split-corpus --seed 42 /path/to/spam /path/to/ham /path/to/output
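
A fixed seed makes the split reproducible. The Python sketch below shows one way a seeded split can work; the 20% test fraction and the function name are assumptions, since the tool's actual ratio and internals are not documented here.

```python
import random

def split_corpus(filenames, seed=42, test_fraction=0.2):
    """Deterministically split filenames into (train, test) lists."""
    ordered = sorted(filenames)        # start from a stable order
    rng = random.Random(seed)          # private generator, seeded
    rng.shuffle(ordered)
    n_test = int(len(ordered) * test_fraction)
    return ordered[n_test:], ordered[:n_test]

names = [f"msg{i:03d}" for i in range(10)]
train, test = split_corpus(names, seed=42)
```

Sorting before shuffling means the same seed yields the same split regardless of the order in which the directory listing returns files.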

fann-test-accuracy

Evaluate model accuracy against test sets:

fann-test-accuracy /path/to/test/spam /path/to/test/ham
fann-test-accuracy -m /path/to/test/spam /path/to/test/ham  # show misclassified
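
For readers who want to see what an accuracy evaluation involves, this Python sketch computes accuracy and error rates from prediction scores. The 0.5 decision threshold, the metric names, and the sample scores are made up for illustration and do not reflect the tool's actual output.

```python
def evaluate(spam_predictions, ham_predictions, threshold=0.5):
    """Summarize classification quality from per-message scores in [0, 1]."""
    tp = sum(1 for p in spam_predictions if p >= threshold)  # spam caught
    fn = len(spam_predictions) - tp                          # spam missed
    fp = sum(1 for p in ham_predictions if p >= threshold)   # ham flagged
    tn = len(ham_predictions) - fp                           # ham passed
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }

# Illustrative scores only, not real model output
report = evaluate([0.97, 0.81, 0.35, 0.66], [0.02, 0.11, 0.58, 0.20])
print(report["accuracy"])  # 0.75
```

For a spam filter the false positive rate usually matters most, because ham flagged as spam is more costly than spam that slips through.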
